Gas station without pumps

2010 October 23

Experience Points for classes

Filed under: Uncategorized — gasstationwithoutpumps @ 22:51

Teachers frequently bemoan the fact that students don’t seem to be interested in learning, but just in getting points. Teachers try to find ways to make grading schemes more meaningful, so that students will care more about learning. Currently fashionable is Standards-Based Grading, which is good for a reductionist analysis of topics, but not so strong on synthesis. SBG also has trouble measuring sustained performance.

Lee Sheldon, at Indiana University, has taken the opposite tack and embraced point chasing.  He doesn’t give grades—he gives experience points. Students earn points for quizzes and for both individual and group projects.  I don’t know whether he has different levels in the class, with a certain number of experience points needed to unlock the next level of learning.  If I used an XP-based grading system, I’d certainly structure the class that way!

Experience points might be a good way to encode a rubric, with different XP values for different core tasks and bonus tasks.  Assignments would have to be structured so that essential material has to be done (to at least some minimal standard) before any bonus XP can be earned, so that students don’t get lots of bonus points for bells and whistles if they don’t have the core ideas solid. XP could also be awarded for unanticipated student achievements, which is difficult to do in percent-based grading systems.
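As a concrete illustration, here is a minimal sketch in Python of how core-gated bonus XP might be scored. The rubric, task names, and point values are all invented for the example, not taken from Sheldon’s course:

```python
# Hypothetical rubric: core tasks must meet a minimum standard before
# any bonus XP counts (task names and point values are invented).
CORE_MINIMUMS = {"parsing": 20, "analysis": 30}   # core task -> minimum XP

def assignment_xp(core_scores, bonus_scores):
    """Total XP for one assignment; bonus XP is gated on the core minimums."""
    core_total = sum(core_scores.values())
    core_met = all(core_scores.get(task, 0) >= need
                   for task, need in CORE_MINIMUMS.items())
    bonus_total = sum(bonus_scores.values()) if core_met else 0
    return core_total + bonus_total

# Bells and whistles earn nothing until the core ideas are solid:
print(assignment_xp({"parsing": 25, "analysis": 15}, {"gui": 40}))   # 40
print(assignment_xp({"parsing": 25, "analysis": 35}, {"gui": 40}))   # 100
```

The gate is the important design choice: bonus work counts for nothing until every core task clears its minimum standard.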

Katrin Becker discussed the system on her blog, but seems to feel that it is essential that

students know that they are not penalized for not doing extra, and can be assured that they can still earn an ‘A’ on an assignment by doing a good job of the problem as specified.

Personally, I feel that doing a good job on the problem as specified is B work, and that only by going above and beyond the minimal specs can one earn an A.  If I were to use an XP-based system, the grade for the course would be based on the total XP earned, and merely “good” work would end up being a B.

Of course, this is the total opposite of SBG, which divides up topics and does not allow strength in one topic to compensate for weakness in another.  I think, though, that XP matches the student mindset better and is more likely to motivate students to put in the extra work needed to really learn material. (I know that SBG proponents hate extra credit, but I rather like the idea.)

2010 October 3

Just scoring points

Filed under: Uncategorized — gasstationwithoutpumps @ 00:01

I recently re-read a marvelous 4-page essay that uses metaphors to explain why students and professors have such different expectations of student learning.  The essay is Just Scoring Points by Walter R. Tschinkel, from The Chronicle Review, The Chronicle of Higher Education, April 13, 2007.

He first presents the metaphor that seems to fit what politicians believe when they require lots of testing to standards: that of pouring knowledge into empty vessels.  Both students and professors reject this metaphor as inappropriate.

Next he presents another cliché: constructing a building by laying a foundation, then adding one brick after another, gradually layering on knowledge to form a coherent whole.  Students were generally happy with this metaphor (as are many teachers, given its ubiquity), but Tschinkel points out that student behavior does not fit this metaphor at all.  If students really were building an edifice of knowledge, they would retain more from exam to exam and from course to course.  One doesn’t build a building by letting half the bricks evaporate every 6 months.

Instead, Tschinkel proposes that student behavior is better explained with a sports metaphor.  The exams correspond to important games.  Students spend a lot of effort figuring out how to win the game and maximize their scores, but once the game is over, nothing is left but the score.  At the end of their education, all they have is the GPA: their cumulative score from several seasons.  Any knowledge or skills they retain from their education is purely incidental.

Tschinkel points out that this behavior is an adaptive response to the way they are graded and rewarded for being in school.  Their funding continues as long as they pass enough exams, whether or not they retain anything from their education.  He suggests that lectures, multiple-choice exams, and curricula consisting of essentially independent courses all contribute to the problem.

Some of his solutions are (as he recognizes) unlikely to happen, because of the cost. Non-lecture instruction is generally much more expensive than lectures, as are hand-graded essays rather than machine-scored multiple-choice tests.  More integrated curricula are certainly possible. Even now, some programs (especially in engineering) have much more interlocked courses than most, so that students who do not retain sufficient material from one course to the next fail.  (Engineering professors are also much more willing than professors in other fields to fail students who don’t know required material.)

Going back to a recurring theme on this blog, what does Tschinkel’s metaphor suggest about standards-based grading (SBG) and sustained performance? The SBG approach to grading seems to be based on the fill-a-bucket model, with many buckets to fill, but with the assumption that once filled they stay that way. There is nothing inherent in SBG that moves students away from gathering points rather than knowledge: they just have to gather the points in many different buckets. They can concentrate on one bucket at a time, then forget it to move on to the next. Since SBG is based on students demonstrating mastery once or twice, but not at particular times, it does not quite fit the “game” metaphor, though.  From a sports metaphor, it is more like recording your personal best, but not necessarily retaining the ability to duplicate the feat.

The question remains open: how do we get students to value the knowledge and skills they acquire in classes enough to retain them for future use?

2010 August 29

Sustained performance and standards-based grading

Filed under: Uncategorized — gasstationwithoutpumps @ 09:16

I have been looking at the latest fad sweeping education, standards-based grading (SBG), and trying to see if it is something I should incorporate in my own grading practices.

My first post on SBG looked at some of the assumptions and guiding principles of SBG, concluding that it looked like a good idea if you took a reductionist view of education, where you could split your objectives into separately assessable standards.

My second post on SBG looked at the unspoken assumption that assessment is cheap, something that is not the case in many of my classes.

Another problem I have with SBG is that for a lot of standards, the goal is sustained performance, not one-shot success. It isn’t enough to get a comma right once—you have to get almost every comma right, every time you write. Similarly, it isn’t enough to use evidence from primary and secondary sources and cite them correctly once—you have to do it in every research paper you write.

If you forget to include the “sustained performance” or “automaticity” (to use a buzzword that elementary math teachers seem fond of) components of the standards, you get a sloppy implementation that reinforces the do-it-once-and-forget-it phenomenon that makes students unable to do more advanced work.

SBG aficionados believe in instantaneous noise-free measures of achievement.  If a student takes a long time before they “get it”, but then demonstrate mastery, that’s fine.  This results in the practice of replacing grades for a standard with the most recent one.  I think that is ok, as long as the standard keeps being assessed, but if you stop assessing a standard as soon as students have gotten a good enough score (which seems to be the usual way to handle it), then you have recorded their peak performance, not the best estimate of their current mastery.  Think about the fluctuations in stock prices:  the high for the year is rarely a good estimate of the current price, even if the prices have been generally going up.

If you want to measure sustained performance, you must assess the same standard repeatedly over the time scale for which you want the performance sustained (or as close as you can come, given the duration of the course and the opportunity costs of assessment).  The much-derided average is intended precisely for this purpose: to get an accurate estimate of the sustained performance of a skill.

SBG tries to measure whether students have mastered each of a number of standards, under the assumption that mastery is essentially a step function (or, at least, a non-decreasing function of time). Under this assumption, the maximum skill ever shown in an assessment is a good estimate of their current skill level. There is substantial anecdotal evidence that this is a bad assumption: students cram and forget.  Indeed, the biggest complaint of university faculty is that students often seem to have learned nothing from their prerequisite courses.

Conventional average-score grading makes a very different assumption: that mastery is essentially a constant (that students learn nothing).  While cynics may view this as a more realistic assumption, it does make measuring learning difficult.  One of the main advantages of averages is that they reduce the random noise from the assessments, but at the cost of removing any signal that does vary with time.

The approach used in financial analysis is the moving-window average, in which averages are taken over fixed-duration intervals.  This smooths out the noisy fluctuations without eliminating the time-dependent variation.  (There are better smoothing kernels than the rectangular window, but the rectangular window is adequate for many purposes.)  If you look at a student’s transcript, you get something like this information, with windows of about a semester length.  Each individual course grade may be making the assumption that student mastery was roughly constant for the duration of the course, but upward and downward trends are observable over time.
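As a toy illustration (the scores are invented, not real student data), here is a rectangular moving-window average in Python, compared with the peak and the lifetime mean discussed above:

```python
# Invented assessment scores with a noisy upward trend.
scores = [55, 70, 60, 75, 80, 72, 90]

def moving_average(scores, window=3):
    """Mean of each run of `window` consecutive scores (rectangular window)."""
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]

print(max(scores))                 # 90: the peak overestimates current skill
print(sum(scores) / len(scores))   # ~71.7: the flat average hides the trend
print(moving_average(scores))      # smoothed values climb from ~62 to ~81
```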

Can SBG be modified to measure sustained performance?  Certainly the notion of having many separate standards that are individually tracked is orthogonal to the latest-assessment/average-assessment decision.  Even student-initiated reassessment, which seems to be a cornerstone of SBG practice, is separate from the latest/average decision, though students are more likely to ask for reassessment if it will move their grade a lot, and the stakes are higher with a “latest” record. Student-initiated reassessment introduces a bias into the measurement, as noise that introduces a negative fluctuation triggers a reassessment, but noise that introduces a positive fluctuation does not.

Perhaps a hybrid approach, in which every standard is assessed many times and the most recent n assessments (for some n>1) for each standard are averaged, would allow measuring sustained performance without the assumption that it is constant over the duration of the course.  If the last few assessments in the average are scheduled, teacher-initiated assessments, not triggered by low scores, then the bias of reassessing only the low-scorers is reduced.
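A sketch of what that hybrid record-keeping might look like (the Gradebook class and its policy parameters are my own invention, purely illustrative):

```python
from collections import defaultdict

class Gradebook:
    """Toy per-standard record: grade on the mean of the last n scores."""

    def __init__(self, n=3):
        self.n = n                       # how many recent assessments count
        self.scores = defaultdict(list)  # standard -> full score history

    def record(self, standard, score):
        self.scores[standard].append(score)

    def grade(self, standard):
        """Mean of the most recent n assessments: neither the peak
        nor the lifetime average."""
        recent = self.scores[standard][-self.n:]
        return sum(recent) / len(recent)

gb = Gradebook(n=3)
for s in [60, 95, 70, 80, 85]:      # early cramming spike at 95
    gb.record("comma-usage", s)
print(gb.grade("comma-usage"))      # ~78.3: recent, sustained performance
```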

2010 July 3

more on SBG

Filed under: Uncategorized — gasstationwithoutpumps @ 10:09

One of the first presentations on Standards-Based Grading that I read was Dan Meyer’s post, which seems to have inspired many others as well. I’ve also been reading Shawn Cornally’s blog Think Thank Thunk; Cornally is passionately committed to SBG, and just as passionately opposed to including homework in assessment.

While I like some of the ideas of SBG, I’m still having trouble applying it to my own teaching.

The thing I like most about SBG is that it requires the teacher to articulate precisely what they want the students to learn—much more precisely than state standards or textbook designers need to. This is a valuable but very difficult task. The hard part is not listing topics, but coming up with meaningful assessments that test whether the student has mastered the concept.

I can easily list topics that students need to have some understanding of, and I can create assignments which show mastery of several of the crucial skills and ideas, but I have a hard time pinpointing exactly what I want students to know in a testable way.  For example, one topic I teach is hidden Markov models (HMMs), which are a major tool in bioinformatics for protein and DNA sequence analysis.  (For RNA sequence analysis, HMMs don’t capture enough of the information, but understanding HMMs helps with understanding Stochastic Context-Free Grammars, which are useful).  So I know I need to present HMMs, give the students ways of thinking about them, and walk them through the derivation of the forward-backward algorithm. But what can I assess them on?  There isn’t time in the course for them to implement the forward-backward algorithm and test it (there’s barely time for them to do the much simpler dynamic programming of the Smith-Waterman algorithm for sequence-sequence alignment). What smaller assessment can I give that is not completely bogus?
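For what it’s worth, the forward recursion itself fits in a few lines of Python; here is a toy two-state DNA example (all parameters invented for illustration, not a course assignment). The trouble is that an exercise like this tests only the mechanical ability to follow the recursion, not the understanding I actually care about:

```python
# Toy 2-state HMM over DNA: "H" = high-GC region, "L" = low-GC region.
# All probabilities are made up for the example.
states = ("H", "L")
start = {"H": 0.5, "L": 0.5}
trans = {"H": {"H": 0.9, "L": 0.1},
         "L": {"H": 0.2, "L": 0.8}}
emit = {"H": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
        "L": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}}

def forward(obs):
    """Return P(obs) under the HMM via the forward recursion."""
    f = {s: start[s] * emit[s][obs[0]] for s in states}  # initialization
    for x in obs[1:]:                                    # induction step
        f = {s: emit[s][x] * sum(f[r] * trans[r][s] for r in states)
             for s in states}
    return sum(f.values())                               # termination

print(forward("GGCACTG"))  # likelihood of a short DNA string
```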

One assumption of SBG is that assessment is cheap.  That is, that it does not take up too much student or teacher time, and that reassessment of a concept can be done easily.  For some subjects (like arithmetic and algebra), there are readily available test generators that can create new instances of closely related problems at the push of a button.  For those subjects, assessment is indeed cheap.  But for other subjects, it may take weeks to craft a reasonable assessment, making reassessing difficult.
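To make the contrast concrete, here is the sort of push-button instance generator that makes reassessment cheap in algebra (a hypothetical generator sketched in Python, not any particular product):

```python
import random

def linear_equation_instance():
    """Return (problem text, answer) for a fresh ax + b = c problem."""
    a = random.randint(2, 9)
    x = random.randint(-10, 10)   # the intended solution
    b = random.randint(1, 20)
    c = a * x + b                 # guarantees an integer solution
    return f"Solve for x: {a}x + {b} = {c}", x

problem, answer = linear_equation_instance()
print(problem)   # e.g. "Solve for x: 4x + 7 = -13"
print(answer)    # e.g. -5
```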

Just last fall, I redid the 6 programming assignments in my core bioinformatics class (each of which had been tweaked several times already).  The change was prompted by a change in the programming language the assignments were to be implemented in, from Perl to Python.  In redoing the assignments, I implemented solutions for each assignment, tweaked the assignment to make it better at probing student understanding of the concepts, and reimplemented the solutions.  I also standardized the I/O specifications, so that I could evaluate the student programs more easily by automatically comparing their output to the output of my programs on input that the students had not seen.  This effort, which did not involve creating new assignments, just tweaking 6 existing ones, took 2–3 weeks full time.  These assignments are expected to take students about 10 hours each—much of the tweaking was to try to keep the assignments from getting too big.  Creating a completely new assignment, which I do about every other year for this course, takes about 40 hours of effort.  So at 40 hours of teacher time and 10 hours of student time for a new assessment, reassessing in this class is not cheap.
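The payoff of the standardized I/O is that the grading check reduces to something like the following sketch (the file names are hypothetical):

```python
import subprocess

def outputs_match(student_prog, reference_prog, input_file):
    """True if both programs print identical stdout for input_file."""
    def run(prog):
        with open(input_file) as f:
            return subprocess.run(["python", prog], stdin=f,
                                  capture_output=True, text=True).stdout
    return run(student_prog) == run(reference_prog)

# Compare a student's program to the instructor's solution on input
# the students have not seen (all file names are hypothetical).
print(outputs_match("student_align.py", "solution_align.py", "hidden_test.fa"))
```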

Part of my problem is that I’m not interested in whether the students know factoids about biology, programming, statistics, or bioinformatics.  I don’t really care whether they can apply formulas to isolated problems or emulate a simple algorithm on a toy problem.  What I want them to do is to put together all the material they have learned in other classes and create programs that use statistics to answer biological questions.  I’m interested in their synthesis of the pieces, not a reductionist analysis of whether they have acquired the pieces separately.

This seems to me the biggest weakness of SBG: it is based on a reductionist approach to learning that is proper in the beginning stages of learning a subject, but which is not appropriate in later stages where the focus is on synthesis of the concepts.  Even in classes like Algebra 2, which lend themselves well to listing specific topics and techniques students must master, and for which assessments can be cheaply produced, how do teachers assess the ability of students to put the concepts together?  SBG seems pretty good for ensuring that students have all the tools in their toolbox, but how do we teach students how to choose the right tools? How do we assess their ability to solve a problem using multiple tools, choosing the best tools and applying them correctly?  (Note: although I don’t teach high-school math, I have coached middle-school math teams and may coach a high-school one next year.  For competition math, learning to choose among an array of tools and apply them in clever ways is the core pedagogic goal.)

I’ve yet to come up with any small assessments that tell me useful things about the students’ ability to synthesize concepts.  I’ve pretty much given up on giving tests in my courses, relying instead on programming assignments, week-long essay assignments, and quarter-long research projects.  Anything shorter just doesn’t seem to measure what I’m interested in.  But these big assignments invariably cover multiple skills, and don’t lend themselves to the easy diagnostics of  SBG.

2010 June 28

Standards-Based Grading

Filed under: Uncategorized — gasstationwithoutpumps @ 08:48

Several of the blogs I read have been discussing Standards-Based Grading (abbreviated SBG) lately.  Perhaps the strongest proponent of this approach has been Mr. Cornally in his blog Think Thank Thunk.

The key ideas of this pedagogic approach seem to be:

  • Break your course down into the small, testable ideas: the “standards” you will base your grading on.
  • Tie every assessment to one or two of the standards, so you know exactly what the assessment is measuring.
  • Give the students access to evaluation separately for each standard, so they can see where they need to improve.
  • Allow students to reassess on any standard where they are not satisfied with their performance.

The standards passed down from the state for school teachers are far too vague and far too broad to be usable for SBG, so teachers have to come up with their own.  This is probably the main strength and the main weakness of SBG: teachers structure their courses around what they want the students to be able to do, but they get almost no help from textbooks, curriculum committees, and colleagues in designing their courses.

SBG is based on assessments that test only one thing at a time, which is marginally possible in a reductionist view of math, but almost impossible in other subjects.  I’ve thought about applying the ideas to my college courses, which are mainly senior and graduate courses, but I’ve had no success in coming up with assessments.  For example, one topic I teach is Markov chains: I want students to understand them well enough to be able to write a computer program to train a Markov model from data, then use the model to analyze other data.  The assignment I’ve come up with to do this depends on several programming and writing skills (program design, familiarity with the language used, documentation skills, debugging skills, …), in addition to their understanding of Markov chains.  I know no way to separate these skills to test them independently, and weakness in any of the skills can make it difficult to determine ability in the others.  Many of the skills I want most for students to acquire are inherently intertwined with other skills.  Being able to program a computer requires algorithm design, data structure design, coding, documenting, and debugging, none of which is easily isolated from the rest.  And I’m not interested in the skills in isolation, but in their combination, since I’m teaching a class that assumes the programming skills are already there, rather than teaching them.
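To give a sense of scale: the statistical kernel of such an assignment, stripped of all the surrounding design, documentation, and debugging work, is only a sketch like the one below (my simplification, with Laplace pseudocounts assumed). The assignment’s real weight is in everything around it:

```python
import math
from collections import Counter

def train(seq, alphabet="ACGT", pseudocount=1.0):
    """P(next | current) from Laplace-smoothed transition counts."""
    counts = Counter(zip(seq, seq[1:]))   # count adjacent-letter pairs
    probs = {}
    for a in alphabet:
        total = (sum(counts[(a, b)] for b in alphabet)
                 + pseudocount * len(alphabet))
        for b in alphabet:
            probs[(a, b)] = (counts[(a, b)] + pseudocount) / total
    return probs

def log_likelihood(seq, probs):
    """Log probability of seq's transitions under the trained model."""
    return sum(math.log(probs[pair]) for pair in zip(seq, seq[1:]))

model = train("ACGTACGTGGGTACCA")      # made-up training sequence
print(log_likelihood("ACGTAC", model)) # score other data with the model
```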

SBG requires that students be frequently informed of their progress on each standard.  Since I have had a hard time creating assessments that look at only one thing, I have a hard time informing students of what they do and don’t understand.

Reassessment is the aspect of SBG that I have used for years—I allow students to redo any assignment if they are not satisfied with it.  I do hold students to a higher standard on a redo, though.  It is not enough to fix small bugs or grammatical problems that I have pointed out to them—they must show better understanding of the underlying concepts, generally improved writing, or improved programming skills.   This is not quite the reassessment that SBG expects, but SBG assumes that it is easy to generate multiple independent tests for any standard, and that an assessment requires very little time of either the teacher or the student, so that reassessment is cheap.  When the skills to be assessed are complex, a single assessment may take 10–30 hours of teacher time to create and 10–20 hours of student time to do, making reassessment a fairly costly endeavor.

In short, while Standards-Based Grading seems like a good way to structure a course, I’m not convinced that it is feasible for any of the courses I teach, and is probably only applicable to a small fraction of elementary and secondary-school courses.

