Gas station without pumps

2013 July 21

Automatically grading programming homework poorly

Filed under: Uncategorized — gasstationwithoutpumps @ 11:39

Mark Guzdial’s post Automatically grading programming homework: Echoes of Proust pointed me to an MIT press release, Automatically grading programming homework, which starts with the claim

Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), working with a colleague at Microsoft Research, have developed a new software system that can automatically identify errors in students’ programming assignments and recommend corrections.

While such a debugging aid may be useful for finding and correcting common errors in small programs (such as those found in beginning programming courses), it does not address what I see as one of the main goals of hand-grading student programs: evaluating how well their programs are structured and documented. I often spend as much time grading the comments, decomposition into procedures or methods, data structures chosen, error-handling, and variable names as the algorithmic details—none of which are addressed by this grading scheme.

In a comment on Guzdial’s post, Edward Bujak wrote,

Proust automatic grading caught 85% of the semantic errors since the domain (the specific program) was specified. In a short 6-month consultant position at ETS, we replicated this with the automated grading of the APCS free response questions. This was when the APCS test was in Pascal. The program prompt was known so an expert System (ES) was trained for that problem which would query an Abstract Syntax Tree (AST) dynamically constructed from the students submitted program. We graded on good and bad. The grading was reliable, consistent, and outperformed humans. It never saw the light of day.

Given that the AP CS problems are tiny coding problems that are not checked for the things I look for in hand grading anyway, it seems like an ideal application of automated grading of programs. The syntax parser would have to be very forgiving (as able as a human grader to recover from missing semicolons or mistyped variables) to grade fairly.

Of course, the AP exams are handwritten, not typed, and OCR is still too unreliable for grading hand-written exams. The cost and error rate of the data entry needed to enable automatic grading of AP CS exams probably exceed those of hand grading, so it is no wonder that the expert system Bujak worked on never saw the light of day. Perhaps someday the AP exams will be done with keyboard entry, but the extra opportunities that introduces for cheating make it unlikely to be adopted any time soon.

I suspect that an adequate automatic grader for CS1 problems is possible (if you ignore comments, programming style, variable names, and other important things that CS1 should teach), by combining the generic automatic debugging approaches MIT is using with the problem-specific expert systems of Proust and whatever Bujak worked on for ETS. The effort may be useful for making MOOCs a little less awful at grading, though it would not help with other problems with the pedagogic approaches of mass instruction.
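To make the hybrid idea concrete, here is a minimal sketch (not any of the actual systems mentioned above) of what such a grader might look like for a hypothetical CS1 prompt like "write an iterative factorial(n)": a problem-specific structural rule inspects the abstract syntax tree, in the spirit of the Proust/ETS expert systems, while a generic behavioral check runs the submission against reference cases. The prompt, function name, and test values are all invented for illustration.

```python
import ast

# Hypothetical student submission for the (invented) prompt:
# "Write factorial(n) using a loop, not recursion."
STUDENT_CODE = """
def factorial(n):
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result
"""

def check_structure(source, func_name="factorial"):
    """Problem-specific check: the named function must exist and,
    per the prompt, must not call itself recursively."""
    tree = ast.parse(source)
    funcs = [n for n in ast.walk(tree)
             if isinstance(n, ast.FunctionDef) and n.name == func_name]
    if not funcs:
        return [f"no function named {func_name!r}"]
    issues = []
    for node in ast.walk(funcs[0]):
        if isinstance(node, ast.Call) and getattr(node.func, "id", None) == func_name:
            issues.append("uses recursion; prompt asked for a loop")
    return issues

def check_behavior(source, func_name="factorial"):
    """Generic check: execute the submission and compare its output
    against reference test cases."""
    ns = {}
    exec(compile(source, "<student>", "exec"), ns)
    f = ns[func_name]
    expected = {0: 1, 1: 1, 5: 120, 7: 5040}
    return [f"factorial({k}) returned {f(k)}, expected {v}"
            for k, v in expected.items() if f(k) != v]

issues = check_structure(STUDENT_CODE) + check_behavior(STUDENT_CODE)
print("pass" if not issues else issues)  # prints "pass" for this submission
```

Even this toy version shows why the approach scales poorly to what I grade by hand: the structural rule had to be written for this one prompt, and nothing here looks at comments, names, or decomposition.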


1 Comment »

  1. Yeah, I can see your concern about automatic grading. However, as a math teacher, I find a lot of tools that use OCR work perfectly, especially for grading. For example, I'm now using a tool called ClassroomIQ, recommended by a colleague of mine. It's a very efficient grading tool. It helps me grade homework and exams more quickly and easily. It's a very handy and convenient product to have. Now that I can get the grades back the next day or even the next hour, my students are more active and engaged. Anyway, nice analysis.

    Comment by Joe — 2013 July 22 @ 09:34 | Reply




