# Gas station without pumps

## 2013 April 29

### AP tests as validation of courses

Filed under: home school — gasstationwithoutpumps @ 09:07

I’m on a couple of the Advanced Placement teacher mailing lists (Physics because I’ve been home-schooling my son in calculus-based physics, biology because I’ve been attempting to get bioinformatics into AP bio courses as a teaching tool).  On one of the lists, a fairly new teacher brought up a concern about grading—last year many of his or her top students got 1s on the corresponding AP exam.  This triggered an excellent discussion about the meaning of grades and the value of the AP test.  In this blog post I’m going to repeat my contributions to the discussion, lightly edited to remove any identifying info, with brief summaries of other views to show what I was responding to (often lumping several people’s ideas together).

Speaking as a college professor, one of the main values of the AP exams is providing a uniform external calibration for the level of high school classes.  Most high school teachers don’t have that much communication with other teachers, particularly not on matters like what level of performance should be expected of students at different levels.  The result has been an enormous grade inflation over the past few decades, so that “A” is the most common grade in many schools rather than a rare accolade.  (The problem is common in colleges also, perhaps even worse than in high schools, especially in the humanities.)

Having an external calibration (an A in this course is roughly the same as a 5 on the AP exam) is very useful for gauging the level of the course, for the teacher, for the students, for parents of the students, and for the colleges that might admit the student.  The AP exam scores, after all, are supposedly set to correlate with the grades the students would have gotten in a first-year course in college.  It used to be that 5, 4, and 3 correlated well with grades of A, B, and C, but grade inflation in the colleges appears to have advanced faster than on the APs, so perhaps 5 and high 4s correspond to an A, and low 4s and 3s to a B (depending on the college, of course, as grade inflation is far from uniform).

Of course, the AP test does not measure all the things that go on in an AP course, and a student can do well in the course and poorly on the exam or vice versa, but if A students in a course are consistently getting 1s on the test, it leads one to suspect that grade inflation has happened.  Similarly if C students are routinely getting 5s, one suspects that the course grading is ridiculously harsh.

Others pointed out the obvious thing, that the test is a 3-hour snapshot of how a student did on an arbitrary subset of the material on one day.  Seniors who have already been admitted to college (or decided not to attend) may have little incentive to do well, particularly if their college does not give credit for AP exams.  One teacher, who posts a lot of good stuff on the mailing list, asked me directly:

> I am curious to know what portion of the grades in the courses you teach are determined by one, largely multiple choice exam?  I’ve never taken a worthwhile course where “passing” was ever determined in such a way.

He called it right on that one.  I’ve never given a multiple-choice exam in 31 years of being a professor—multiple-choice exams are very difficult to write well, and they are really only appropriate when there are enormous numbers of test takers to reduce the cost and variance of grading and amortize the large cost of making the exam.  For that matter, I give very few exams—most of my courses are graded on the basis of week-long or quarter-long projects, papers, and programs.  I’m not particularly interested in the things testable by multiple-choice tests (mainly memory and simple reasoning tasks), but in what students can do with a sustained effort.  My most recent course (Applied Circuits for Bioengineers) was graded mainly on the students’ weekly design reports about their lab work (about 5 pages of writing a week, and any mistakes in the schematics or explanations meant that they had to redo the writing).

I’m not going to defend the AP exams as great ways to evaluate learning, but they are better than the exams that most teachers write and rely on for grading, and they do have the advantage of uniformity across a large number of classrooms.  A 5 on an AP exam may not tell me a lot about a student’s capabilities, but I believe it tells me more than an A from a teacher I’ve never met and who may have only taught a handful of AP students.

I agree that the goal of an AP course is not “college credit” or even “preparation for the AP exam,” but the learning that takes place.  But grades and exam scores are used for selecting kids for college admission (as being better than a simple lottery or selection based solely on money or race), so it is better if the exams and grades are as meaningful as they can be made (at reasonable cost—a lot of state testing is providing very little useful data at enormously high cost in both money and lost time).  Because teachers have so little opportunity to calibrate their own grading, an external test at the right level provides very useful information.

I agree that a single sample of a small number of students may not tell you much about the level of instruction in a class, but it may be a warning that recalibration is needed. There are many possible reasons for the discrepancy (difference in content between course and exam, difference in level of expectations, student test-taking ability or attitude, random noise, …).  For a course labeled “AP”, the teacher has a responsibility to make sure that the content of the exam is covered in the course.  As a parent, I would also want the level of expectations in the course to be as high as in a first-year college course.  If the students are uniformly doing worse on the test than what the teacher expects, then some reflection on why the expectations are wrong is needed.

Elsewhere in the discussion, another teacher asked about “skills that aren’t reportable by traditional methods” and whether admissions offices could take them into account.  I responded:

When admissions officers are trying to select <6% of the students from 38,828 applicants (as Stanford did this year), it is difficult to process voluminous communications from individual teachers.  They rely on summary statistics (GPA and SAT scores, for example) to do crude filtering, then concentrate on student essays and letters of recommendation.  The very selective schools sometimes try to correct the GPAs based on grade inflation at the high school (there are databases of information about each high school being sold—I’ve no idea how accurate the information in the databases is, but some admissions offices use them).

The college faculty, who might care about the “skills that aren’t reportable by traditional methods”, are rarely part of the admissions decisions.

Public universities are usually forced to have a simple formula for most of their admissions, to be able to show the public that they are being scrupulously fair.  If they took into account “skills that aren’t reportable by traditional methods”, parents of rejected students would scream to their legislators to cut off funding to the university.  (An exception is always made for athletics, which is a sacred cow in the US.)

Later in the discussion, after narrative transcripts had been proposed and the value college admissions officers put on letters of recommendation had been brought up, I wrote about my experience with narrative evaluations.

I had to review narrative transcripts for honors review of graduating seniors, and I often found it very difficult to interpret the narratives.  It took a long time to read a narrative transcript, and it often told me very little about the student. There was no controlled vocabulary, and the same word might be used by one instructor for a barely passing performance and by another for a truly excellent performance.  I could see why med schools could be frustrated by the difficulty of dealing with this format.  Nowadays the honors review in the school of engineering is based on GPA, with a well-defined grey region where student research projects can make a difference in the honors rating.  I understand that the honors review now takes only a fraction of the time it used to, despite large increases in the number of students reviewed, and the time is spent only on the cases where some thought is needed.

As a grad director, I read a lot of applicant files for admission to our program.  GRE scores and GPA do matter, but not very much, as the GRE tests essentially the same stuff as the SAT (nothing college level on the general GRE, and there isn’t a subject GRE in our field) and college GPAs are often highly inflated, depending on the college.  We also get a lot of foreign applicants, whose grades come on a bewildering variety of different scales that are essentially uninterpretable.  We expect high GRE scores of all applicants, but decide between them based mainly on their personal statements and letters of recommendation. What we are looking for is strong evidence that the students can do research (not just coursework), and the best evidence is that the student has already done substantial research.  Many of our grad students come to us with multiple publications already—more than I had when I got my first faculty position.  Narrative transcripts would not be a good substitute for letters of recommendation from faculty who had supervised research—the signal we’re looking for would be buried in the noise of irrelevant comments about coursework that is not that important to us.

As a homeschooling parent of a high school junior who is likely to fit in best at a super-selective college like Harvey Mudd, MIT, or Stanford, I worry a lot about how to put together his transcript, school profile, and counselor letter to show that he really would fit in. There are mailing lists with 1000s of readers dedicated to parents worrying about how to get their home-schooled children into appropriate colleges (hs2coll@yahoogroups.com, for example).  I’m having to rely heavily on external validation of his coursework, much of which is not even accredited.  SAT 2 and AP exams form part of that validation, while college and university courses form another part.  Science fairs and contest exams (AMC12 and AIME in math, F=ma in physics) form yet another part, though his contest scores are not stellar enough to make him a shoo-in at the colleges where he would fit, since he does all his exams without prep. We are trying to get letters of recommendation from the university faculty (he’s been at the top of the classes), since those will be particularly informative for admissions officers.

I’ve written blog posts about how homeschoolers can get into the University of California, which is highly bureaucratic, but fairly straightforward:

https://gasstationwithoutpumps.wordpress.com/2011/10/07/satisfying-ucs-a-g-requirements-with-home-school/

https://gasstationwithoutpumps.wordpress.com/2012/06/14/ways-in-to-university-of-california/