Gas station without pumps

2014 October 25

Grading based on a fixed “percent correct” scale is nonsense

Filed under: Uncategorized — gasstationwithoutpumps @ 10:12

On the hs2coll@yahoogroups.com mailing list for parents home-schooling high schoolers to prepare for college, parents occasionally discuss grading standards.  One parent commented that grading scales can vary a lot, with the example of an edX course in which 80% or higher was an A, while they were used to scales like those reported by Wikipedia, which gives

The most common grading scales for normal courses and honors/Advanced Placement courses are as follows:

        “Normal” courses          Honors/AP courses
Grade   Percentage  GPA           Percentage  GPA
A       90–100      3.67–4.00     93–100      4.5–5.0
B       80–89       2.67–3.33     85–92       3.5–4.49
C       70–79       1.67–2.33     77–84       2.5–3.49
D       60–69       1.0–1.33      70–76       2.0–2.49
E / F   0–59        0.0–0.99      0–69        0.0–1.99

Because exams, quizzes, and homework assignments can vary in difficulty, there is no reason to suppose that 85% on one assessment has any meaningful relationship to 85% on another assessment.  At one extreme we have driving exams, which are often set up so that 85% right is barely passing—people are expected to get close to 100%.  At the other extreme, we have math competitions: the AMC 12 math exams have a median score around 63 out of 150, and the AMC 10 exams have 58 out of 150.  Getting 85% of the total points on the AMC 12 puts you in better than the top 1% of test takers.  (AMC statistics from http://amc-reg.maa.org/reports/generalreports.aspx) The Putnam math prize exam is even tougher—the median score is often 0 or 1 out of 120, with top scores in the range 90 to 120. (Putnam statistics from http://www.d.umn.edu/~jgallian/putnam.pdf) The point of the math competitions is to make meaningful distinctions among the top 1–5% of test takers in a relatively short time, so questions that the majority of test takers can answer are just time wasters.
I’ve never seen the point of having a fixed percentage correct used institution-wide for setting grades—the only point of such a standard is to tell teachers how hard to make their test questions.  Saying that 90% or 95% should represent an A merely says that test questions must be easy enough that top students don’t have to work hard, and that distinctions among top students must be buried in the test-measurement noise.  Putting the pass level at 70% means that most of the test questions are being used to distinguish between different levels of failure, rather than different levels of success. My own quizzes and exams are intended to have a mean around 50% of possible points, with a wide spread to maximize the amount of information I get about student performance at all levels of performance, but I tend to err on the side of making the exams a little too tough (35% mean) rather than much too easy (85% mean), so I generally learn more about the top half of the class than the bottom half.
I’m ok with knowing more about the top half than the bottom half, but my exams also have a different problem: too often the distribution of results is bimodal, with a high correlation between the points earned on different questions. The questions are all measuring the same thing, which is good for measuring overall achievement, but which is not very useful for diagnosing what things individual students have learned or not learned.  This result is not very surprising, since I’m not interested in whether students know specific factoids, but in whether they can pull together the knowledge that they have to solve new problems.  Those who have developed that skill often can show it on many rather different problems, and those who haven’t struggle on any new problem.

Lior Pachter, in his blog post Time to end letter grades, points out that different faculty members have very different understandings of what letter grades mean, resulting in noticeably different distributions of grades for their classes. He looked at very large classes, where one would not expect enormous differences in the abilities of students from one class to another, so large differences in grading distributions are more likely due to differences in the meaning of the grades than in differences between the cohorts of students. He suggests that there be some sort of normalization applied, so that raw scores are translated in a professor- and course-specific way to a common scale that has a uniform meaning.  (That may be possible for large classes that are repeatedly taught, but is unlikely to work well in small courses, where year-to-year differences in student cohorts can be huge—I get large year-to-year variance in my intro grad class of about 20 students, with the top of the class some years being only at the performance level of  the median in other years.)  His approach at least recognizes that the raw scores themselves are meaningless out of context, unlike people who insist on “90% or better is an A”.
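
One simple way to picture a course-specific translation of raw scores to a common scale is a z-score: subtract the class mean and divide by the class standard deviation. The sketch below is only an illustration under that assumption; z-scores are not necessarily what Pachter proposes, and the function name `z_scores` and both score lists are invented for the example.

```python
from statistics import mean, pstdev


def z_scores(raw_scores):
    """Rescale one class's raw scores to mean 0 and standard deviation 1."""
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)
    if sigma == 0:
        # Everyone got the same score, so there is nothing to distinguish.
        return [0.0 for _ in raw_scores]
    return [(s - mu) / sigma for s in raw_scores]


# Hypothetical raw percentages from an easy exam and a hard exam.
easy_exam = [95, 90, 85, 80, 75]
hard_exam = [55, 50, 45, 40, 35]

easy_z = z_scores(easy_exam)
hard_z = z_scores(hard_exam)

for raw_easy, raw_hard, z in zip(easy_exam, hard_exam, easy_z):
    print(f"{raw_easy}% on the easy exam and {raw_hard}% on the hard exam both map to z = {z:+.2f}")
```

In this made-up case a 95% on the easy exam and a 55% on the hard exam land at the same point on the common scale. The small-class caveat in the paragraph above still applies: with about 20 students, the class mean and spread are themselves noisy, so the rescaled scores inherit that noise.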

 People who design large exams professionally generally have training in psychometrics (or should, anyway).  Currently, the most popular approach to designing exams that need to be taken by many people is item-response theory (IRT), in which each question gets a number of parameters expressing how difficult the question is and (for the most common 3-parameter model) how good it is at distinguishing high-scoring from low-scoring people and how much to correct for guessing.  Fitting the 3-parameter model for each question on a test requires a lot of data (certainly more than could be gathered in any of my classes), but provides a lot of information about the usefulness of a question for different purposes.  Exams for go/no-go decisions, like driving exams, should have questions that are concentrated in difficulty near the decision threshold, and that distinguish well between those above and below the threshold.  Exams for ranking large numbers of people with no single threshold (like SAT exams for college admissions in many different colleges) should have questions whose difficulty is spread out over the range of thresholds.  IRT can be used for tuning a test (discarding questions that are too difficult, too easy, or that don’t distinguish well between high-performing and low-performing students), as well as for normalizing results to be on a uniform scale despite differences in question difficulty.  With enough data, IRT can be used to get uniform scale results from tests in which individuals don’t all get presented the same questions (as long as there is enough overlap in questions that the difficulty of the questions can be calibrated fairly), which permits adaptive testing that takes less testing time to get to the same level of precision.  
Unfortunately, the model fitting for IRT is somewhat sensitive to outliers in the data, so very large sample sizes are needed for meaningful fitting, which means that IRT is not a particularly useful tool for classroom tests, though it is invaluable for large exams like the SAT and GRE.
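
For readers who have not seen it, the 3-parameter model mentioned above is usually written as a logistic curve giving the probability of a correct answer as a function of ability. The sketch below uses the conventional symbols (`theta` for ability, `a` for discrimination, `b` for difficulty, `c` for the guessing floor); those names and the example parameter values are assumptions for illustration, not taken from any particular exam.

```python
import math


def p_correct(theta, a, b, c):
    """3-parameter logistic item response function.

    theta: test taker's ability
    a: discrimination (how sharply the item separates high from low ability)
    b: difficulty (the ability at which the curve is halfway between c and 1)
    c: guessing floor (chance of a correct answer at very low ability)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical four-choice item: guessing floor 0.25, difficulty at theta = 0.
for theta in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"theta = {theta:+.1f}: P(correct) = {p_correct(theta, a=1.5, b=0.0, c=0.25):.3f}")
```

An item like this says most about test takers whose ability is near `b`, which is why a go/no-go exam wants its items clustered near the threshold, while a ranking exam wants the `b` values spread out.
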
The bottom line for me is that the conventional grading scales used in many schools (with 85% as a B, for example) are uninterpretable nonsense that conveys no useful information to teachers, students, parents, or anyone else.  Without a solid understanding of the difficulty of a given assessment, the scores on it mean almost nothing.

2014 April 11

Arthur Benjamin: Teach statistics before calculus!

I rarely have the patience to sit through a video of a TED talk—like advertisements, I rarely find them worth the time they consume. I can read a transcript of the talk in 1/4 the time, and not be distracted by the facial tics and awkward gestures of the speaker. I was pointed to one TED talk (with about 1.3 million views since Feb 2009) recently that has a message I agree with: Arthur Benjamin: Teach statistics before calculus!

The message is a simple one, though it takes him 3 minutes to make: calculus is the wrong summit for K–12 math to be aiming at.

Calculus is a great subject for scientists, engineers, and economists—one of the most fundamental branches of mathematics—but most people never use it. It would be far more valuable to have universal literacy in probability and statistics, and leave calculus to the 20% of the population who might actually use it someday.  I agree with Arthur Benjamin completely—and this is spoken as someone who was a math major and who learned calculus about 30 years before learning statistics.

Of course, to do probability and statistics well at an advanced level, one does need integral calculus, even measure theory, but the basics of probability and statistics can be taught with counting and summing in discrete spaces, and that is the level at which statistics should be taught in high schools.  (Arthur Benjamin alludes to this continuous vs. discrete math distinction in his talk, but he misleadingly implies that probability and statistics is a branch of discrete math, rather than that it can be learned in either discrete or continuous contexts.)

If I could overhaul math education at the high school level, I would make it go something like

  1. algebra
  2. logic, proofs, and combinatorics (as in applied discrete math)
  3. statistics
  4. geometry, trigonometry, and complex numbers
  5. calculus

The STEM students would get all 5 subjects, at least by the freshman year of college, and the non-STEM students would stop with statistics or trigonometry, depending on their level of interest in math.  I could even see an argument for putting statistics before logic and proof, though I think it is easier to reason about uncertainty after you have a firm foundation in reasoning without uncertainty.

I made a comment along these lines in response to the blog post by Jason Dyer that pointed me to the TED talk. In response, Robert Hansen suggested a different, more conventional order:

  1. algebra
  2. combinatorics and statistics
  3. logic, proofs and geometry
  4. advanced algebra, trigonometry
  5. calculus

It is common to put combinatorics and statistics together, but that results in confusion on students’ part, because too many of the probability examples are then uniform distribution counting problems. It is useful to have some combinatorics before statistics (so that counting problems are possible examples), but mixing the two makes it less likely that non-uniform probability (which is what the real world mainly has) will be properly developed. We don’t need more people thinking that if there are only two possibilities that they must be equally likely!
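
A small example of non-uniform probability handled with nothing more than counting and summing: three flips of a biased coin. The 0.7 chance of heads is a made-up value for illustration, and the variable names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

p_heads = Fraction(7, 10)  # hypothetical biased coin
p_flip = {"H": p_heads, "T": 1 - p_heads}

# Sum the probability of every sequence of three flips, grouped by number of heads.
heads_dist = defaultdict(Fraction)
for flips in product("HT", repeat=3):
    prob = Fraction(1)
    for face in flips:
        prob *= p_flip[face]
    heads_dist[flips.count("H")] += prob

for heads in sorted(heads_dist):
    print(f"P({heads} heads) = {heads_dist[heads]}")
```

There are eight possible sequences, but they are not equally likely, so counting sequences alone gives the wrong answer; each one has to be weighted by its own probability.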

I’ve also always felt that putting proofs together with geometry does damage to both. Analytic geometry is much more useful nowadays than Euclidean-style proofs, so I’d rather put geometry with trigonometry and complex numbers, and leave proof techniques and logic to an algebraic domain.

2014 January 28

High school senior workload

Filed under: home school — gasstationwithoutpumps @ 22:00

Shireen Dadmehr, in Math Teacher Mambo: Seniors, Workload, Responsibility, talks about high-school seniors dropping out of her CS course:

The flip side of this is that depending on what courses the students take their senior year, and how they approach their scholarship applications and college applications, and what sports they play, and how far their daily commute is, the amount of time they have for homework and sleep and life fluctuates. Needless to say, there are stressed out little bundles of walking sleepless zombies.

Her dilemma is whether to push the students to stick with what they signed up for or to let them drop courses that are either too ambitious for them or just lower priority in an over-loaded schedule. My advice to her is simple—talk to the students and make sure that they aren’t just scared of the material or not getting it due to things that could be changed in the teaching, but let them make their own decisions about whether the effort is bringing enough reward.

As a home-schooling parent of a high-school senior, I’ve also got to deal with helping my son balance his workload—I have high standards and high expectations for him, and I’d like to have him do everything he started this year, but some things are taking more time than anticipated, so it is time to re-evaluate the importance of different activities.

The second semester is starting now, so this is the ideal time to rebalance the workload.

His first semester was supposed to be econ, AP chem, writing (a mix of college essays and tech writing),  group theory, two computer science/computer engineering projects (the light gloves and upgrades to the Arduino data logger), and theater.  The mix of what was done was not quite what was envisioned—more of some things and less of others.

The AP chem and econ are pretty much where they should be (a few days behind in chem due to illness, about a week to go to finish econ) at the semester break.

The college essays turned out to take far more time and effort than originally anticipated (by me and him—his mother had a more realistic view), which pushed the tech writing out.  The college essays got done, but only at the expense of no winter break. Luckily this problem was seen early enough that the transcript was revised to describe his fall semester English class more accurately.

More acting got done than originally expected (and we originally expected a heavy acting load), as he ended up with 5 roles (rather than 1 or 2) in the school one-acts and he added a 3-day workshop during winter break.  This last month has been crazy, with performances almost every weekend, but things should wind down after the Dinosaur Prom Improv performance this Sunday, to just 2 theater classes a week (Dinosaur Prom and Much Ado About Nothing).  The two going away are the Page-to-Stage theater class (where he was the oldest student) to make room for younger kids on the waiting list for the spring production, and the school production which is once a year and had the performance last weekend.  With 2 theater groups instead of 4, he should have more time for other classes.

He’s been working diligently (often spending more time than he really has available) on the light gloves.  In addition to hours of programming or hardware design he’s been meeting weekly or more with the team and having Skype/Google Hangout meetings with investors and with new, remote engineers thinking of joining the team.  I think that they’re getting their second set of prototypes fabricated this month, and he’ll need to start intensive programming to get the Bluetooth LE chip working and communicating with a laptop or cell phone.  I believe that they are getting 10 prototype boards assembled commercially, rather than having to deal with the surface mount parts themselves—the price was low enough that the time savings (and probable quality) justified the extra price.  I think that they are still planning to have a Kickstarter campaign this spring and go into production over the summer, but that schedule may slip if the programming and debugging takes longer than they expect.  (Note: I’m not involved in this project, except once in a while as someone to bounce ideas off of, so all those “I think” and “I believe” statements are vague impressions not the result of detailed status reports.)

The Arduino Data Logger project was put on hold over the fall, but I really need him to get back to that very soon, as I’ll have to tell the staff whether to order Arduino boards or KL25Z (or KL26Z) boards for my circuits class soon.  The extra features that I requested are not critical, but we can’t use the KL25Z boards unless they are supported by the data logger software, and it would be nice to have the much higher resolution and sampling rate of the KL25Z boards.

Because group theory was always last on the priority list and had only Dad-imposed deadlines, it has lost out.  We’re still in Chapter 3, and I don’t see much hope of our catching up by the end of the year.  I see three choices:

  • Drop Group Theory from the transcript, treating it as a nice idea that there just wasn’t time for.
  • Push real hard to try to complete the book anyway—I don’t see this happening, as the group theory is just fun math, not something he “needs”, so I want it to be something we do together for fun, not under high stress.  It is the lowest priority of our courses in my mind, so I can’t see pushing him hard on it.
  • Reduce our ambition to only ½ or even ⅓ of the book, and reduce the credits for the course correspondingly.

He’s finished with econ (almost) and with college essays, but is picking up government/civics and dramatic literature.

The civics will probably be at about the same effort level as the econ, but it may be hard to find a good source at the right level—the MIT open courseware econ lectures made a nice, rather lightweight econ course, supplemented by a few popular-press econ books.  Government/civics doesn’t have the mathematical appeal of econ, and most high-school or freshman college books on the subject are rather dry and boring.  Oh, well, that course is my wife’s responsibility (as was the econ), and she can undoubtedly do a better job of finding materials for it than I could.

The dramatic literature course is preparation for the trip to the Oregon Shakespeare Festival and should not be a problem for him—certainly nothing like the torture of college-application essays!  He’s done two previous dramatic literature courses with the same teacher (different plays each year, based on the OSF schedule), and knows what to expect.

So this spring semester should see

  • replacement of econ by civics
  • replacement of college-application essays by dramatic literature
  • continuation of AP Chem
  • switch from mostly hardware and app/web programming to embedded software for light gloves
  • two-fold reduction in acting (down to a sane level)
  • addition of data logger project
  • low-level continuation or elimination of  group theory

I think that the spring is likely to be less stressful than the fall, but still a full load.

We’ll have to make a decision on the group theory soon, for the updated report through the Common App.

2013 September 23

Science Fair Workshop

Filed under: home school,Science fair — gasstationwithoutpumps @ 20:45

Suki Wessling, my son, and I ran a science fair workshop last week for middle school and high school home-schooled students.  Our attendance was meager (one student other than our two sons).  So that the effort we put into the handout will not be wasted, I’ll put it in this blog post.  The next time I do a handout for science fair, I’ll want to add a section on doing engineering projects also, since those have a somewhat different process than the simplified version of the “scientific method” that we described.

The remainder of this post is the handout:

 

Science Fair Workshop for Parents

Why science fairs?

The science fair is a lot of work. However, it is also a very rewarding project to do with your child. Benefits include

  • Helping your child do a project that has a beginning, middle, and end. This can be very useful for children who tend to be scattered and unfocused.

  • Completing a cross-discipline project, including science, math, language arts, and public speaking.

  • Supporting your child to approach more challenging work.

  • Meeting other families who love science.

The Scientific Method

The scientific method:

  • Is the basis of science

  • Is the opposite of having a belief and finding a justification for it

  • Is not weakened when hypotheses are disproven

The steps of the scientific method are

  1. Observe

  2. Form an investigative question

  3. Read what others have written, and make competing models that explain the observation

  4. Come up with a hypothesis (a prediction that is different in the competing models, not a guess)

  5. Conduct experiments

  6. Accept or reject hypothesis

An example of the scientific method in action:

  1. Observe that a plant in the shady part of your garden didn’t grow well.

  2. Why didn’t that plant grow as well as plants you put in at the same time in the sunny part of the garden?

  3. Read about what plants need to grow, noting that different plants need different amounts of sun and water.

  4. Hypothesis: This plant needs a certain amount of sunlight per day to grow well.

  5. Plant a good number of seedlings (6–8) and subject half of them to sunny conditions, half to shady conditions. Keep a notebook of the plants’ progress, with observations and measurements.

  6. Consider whether the data support the hypothesis.

How to find a project

There are many places to look to find a good project:

  • The best projects grow out of a child’s actual interest.

  • The best projects take advantage of what children like to do (e.g., messy projects, outdoor projects, math-based projects).

  • Try out examples on a science fair project website just for ideas, then try to expand on or change them based on your child’s interests.

  • Don’t just replicate the steps of a project outlined on the web!

Tips for getting through the process

  1. Plan early: Get all the dates on your calendar, and make sure your child has enough time to do all the steps (including writing the report).

  2. Don’t bite off too much: If your child’s idea is too BIG, help him whittle it down to size. Don’t be tempted to finish it off if the child resists finishing—this is also part of the learning process.

  3. Plan to be completely done well before your school’s science fair (if you’re taking part in one).

  4. There is nothing wrong with preparation: successful kids do actually practice their spiels. However, don’t overprep your child so that she seems to be reciting something you wrote. Make sure she understands what she’s talking about and only uses words she really understands.

What do judges look for?

See more details from Kevin: http://tinyurl.com/7n8r3yv

  • Multiple replication of the experiments—generally the more the better, but 3 is usually a minimum.  More replication is generally better than more different conditions.

  • Proper controls (both positive and negative, when possible)

  • Graphical display of the results with correctly labeled axes and no chart junk

  • Correct use of units of measurement

  • Proper (simple) statistics (averages, best fit straight lines, …). High school students may add standard deviation and significance tests (chi-square or Student’s t).

  • Measuring the right thing

  • Measuring and reporting inputs as well as outputs

  • Lab notebook with detailed information recorded as the experiment is done

  • Clever use of simple equipment

  • Careful thought about how the experiment could be improved if it were to be repeated
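
The simple statistics in the list above (averages, standard deviation, best-fit straight line) take only a few lines of Python. This is a minimal sketch with invented measurements, loosely modeled on the seedling example earlier in the handout; the helper `best_fit_line` and all the numbers are hypothetical.

```python
from statistics import mean, stdev


def best_fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical data: three replicate seedling heights (cm) per hours of sun per day.
heights_by_sun_hours = {
    2: [2.8, 3.0, 3.2],
    4: [4.8, 5.0, 5.2],
    6: [6.8, 7.0, 7.2],
    8: [8.8, 9.0, 9.2],
}

hours = sorted(heights_by_sun_hours)
averages = [mean(heights_by_sun_hours[h]) for h in hours]
spreads = [stdev(heights_by_sun_hours[h]) for h in hours]
slope, intercept = best_fit_line(hours, averages)

for h, avg, sd in zip(hours, averages, spreads):
    print(f"{h} h sun: mean {avg:.2f} cm, std dev {sd:.2f} cm")
print(f"best fit: height = {slope:.2f} * hours + {intercept:.2f}")
```

Note that the input (hours of sun) is recorded alongside the output (height), and that each condition has three replicates, matching the minimum suggested above.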

Homeschoolers and the SC Science Fair

  • Students doing projects involving invertebrate or vertebrate animals, human subjects, recombinant DNA, tissue, pathogenic agents, or controlled substances, need to get approval from their sponsoring teacher before they begin their research. A Certificate of Compliance Form must be signed by both student and sponsoring teacher, then submitted by the registration deadline. (The detailed rules have not been published yet for this year—they will be in the “Science Fair Guide”.)

  • Put the schedule on your calendar, including the awards night.

  • If your homeschool program takes part, make sure your teacher meets the school roster deadline.

  • If you are independent or your program doesn’t take part, fill out the registration form and choose your school if it’s in the list. If not, put your private school’s name in the Other box. Submit a school roster after you register.

Winning and losing

Although it’s a competition, the SC Science Fair does a great job of making all the kids feel like they have achieved something. It’s always good to focus more on the event itself—setting up the display, talking to judges, and looking at other kids’ work—than talking about the prizes.

Resources

2013 July 1

How can we get more programming taught in high schools?

In the comments on Mark Guzdial’s post Why AP CS:Principles is a good thing: Responding to Gas Station without Pumps (which is a response to my post Millions for a fairly useless new test), an interesting question has arisen: What should a CS teacher know?

I commented

I agree that figuring out what content an intro CS teacher needs to know is important, both in depth and in breadth. If we set the bar too high, there will be no CS teachers in public schools (essentially the current situation). If we set the bar too low, no CS will be taught and we’ll have to undo the damage once the students get to college.

CS as a field is still struggling with how to teach beginners (it is pretty clear that some students learn, but it is not clear to me how much this correlates with what teachers do—but that’s your [that is, Mark Guzdial's] area of expertise, not mine).

Defining the core competencies that a beginning instructor of beginning students needs seems to me quite difficult. I suppose it starts with deciding what the students need to learn, then figuring out what the teacher needs to be able to do to get them there. I further suppose that this is the intent of the CS Principles course—figuring out the minimal set of essential skills we want out of a first course.

Garth commented

In most schools the CS teacher will also be teaching something else; Math, Science, Art or whatever so the requirements have to be realistic. I think CS Ed would almost have to be a minor, there are just not enough jobs out there yet for a teacher with only a CS Ed major.

So Garth has been thinking of it in terms of new teachers only, it seems. I suspect that we’d get more CS teachers more quickly by summer training for existing math and physics teachers than by trying to train new teachers in ed schools.

Physics teachers could be attracted to programming by a computational modeling curriculum, like the one used in the Matter and Interactions textbook.  VPython provides a fairly simple entry point for physics teachers and students to write simulations of the sorts useful for AP Physics (both C: Mechanics and B).  I think that a computation-based text for Physics B still needs to be written, as Matter and Interactions definitely requires calculus after the first couple of weeks (or is there already an algebra-based physics book using something like VPython?).  Once physics teachers become proficient in VPython, it is not a big stretch for them to teach the CS Principles course (they could even continue to use VPython for it).

Math teachers could be attracted to programming by summer workshops based around Project Euler, which provides a series of math challenges to be solved by programming (currently 434 such challenges).  Providing them with instruction in a suitable programming language (Python is a good choice for Project Euler) so that they can tackle the math problems would give them the programming experience needed before they would consider teaching programming.  Teaching them to program in GeoGebra, free software for doing geometry and algebra presentations and apps, would also be valuable—both for improving their programming skills and for improving their current math courses.
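
To give a sense of the scale of these challenges, the first Project Euler problem asks for the sum of all the multiples of 3 or 5 below 1000. A minimal Python sketch, with a function name chosen only for this example:

```python
def sum_of_multiples(limit):
    """Sum of the natural numbers below `limit` that are multiples of 3 or 5."""
    return sum(n for n in range(1, limit) if n % 3 == 0 or n % 5 == 0)


# The problem statement's own small case: 3 + 5 + 6 + 9 = 23.
print(sum_of_multiples(10))
print(sum_of_multiples(1000))
```

A math teacher would probably notice that the same total follows from arithmetic series and inclusion-exclusion, which is the kind of back-and-forth between programming and mathematics that makes these problems attractive for a workshop.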

The key point of both these ideas is that we could attract physics and math teachers to programming in order to become better teachers in their current fields.  That they would also become competent to teach beginning CS courses is a bonus.  Even if this approach failed to produce any new CS courses, we would still have improved physics and math teaching.  Given how addictive programming is, I think that we would also find these teachers becoming a force within their schools for creating programming courses, avoiding the current Catch-22: that there are no CS courses because there are no CS teachers, and no CS teachers because there are no courses.

I’ve not addressed in this post the initial question from the comments: what do beginning CS teachers need to know?  One implication of my proposal is that CS teachers need to be able to program.  They don’t need to be fantastically good programmers, nor do they need to know many different programming languages, but they need to be able to program and debug in the language of instruction.  They need to be able to model debugging, and they need to be able to assist students who are stuck (without taking over for the student).  They need to have personally done every assignment they assign, to figure out any ambiguities in the wording of the assignment and to make sure that it is doable with the tools and techniques that the students have been given in their class.

I think that may be enough—I don’t think that beginning CS teachers need to know software development techniques and the intricacies of the development environments and libraries beyond what is essential for the assignments.  They may choose to learn more (some math teachers might enjoy asymptotic analysis of algorithms, for example, and some physics teachers might get into programming robots), but it isn’t necessary for teaching an intro course.

I don’t think we can set the bar any lower: a teacher who can’t program can’t teach programming effectively, and programming should be at the heart of any intro CS course.
