# Gas station without pumps

## 2015 June 19

### 2015 AP Exam Score Distributions

Filed under: Uncategorized — gasstationwithoutpumps @ 21:36

Once again this year, I’m posting a pointer to 2015 AP Exam Score Distributions:

Total Registration has compiled the following scores from Tweets that the College Board’s head of AP, Trevor Packer, has been making during June. These are preliminary breakdowns that may change slightly as late exams are scored.

I don’t know why I provide this free advertising for Total Registration, as I have no connection with the company, and do not endorse their services.  If the College Board would collect Trevor’s comments themselves, I’d point to that page.  The main interest in AP result distributions comes in May, when students are taking the tests, and July, when the students get the results.

The official score distributions from the College Board (still from 2014 as of this posting; new results don’t go up until the fall) are at https://apscore.collegeboard.org/scores/about-ap-scores/score-distributions, at least until the College Board scrambles their web site again, which they do every couple of years, breaking all external links.  They post a separate PDF file for each exam, which makes comparison between exams more difficult (deliberately, I believe, since inter-exam comparison is not really a meaningful thing to do).  It is also difficult to get good historical data on how the exam scores have changed over time.  The College Board probably has it on their website somewhere, but finding stuff in their morass is not easy.

My most popular post this year was once again How many AP courses are too many?, with about 19 views per day.  (It has also come in second over the lifetime of the blog, behind 2011 AP Exam Score Distribution.) The question of how many AP courses to take seems to come up both in the fall, when students are choosing their schedules, and in the spring, when students are overwhelmed by how many AP courses they took.

There aren’t many exams graded yet (only 11 on the Total Registration site), so I don’t have much to say about the results.  I probably won’t be looking at the exam scores much this year, since my son is no longer eligible to take AP exams, having graduated from high school. I might look at some of the statistics for the AP computer science exam, as I have some interest in seeing whether there are any changes in the number of test takers.  The interesting results (about gender and geography) won’t come out until the fall reports.

## 2014 June 21

### 2014 AP Exam Score Distributions

Once again this year, I’m posting a pointer to 2014 AP Exam Score Distributions:

Total Registration has compiled the following scores from Tweets that the College Board’s head of AP, Trevor Packer, has been making during June. These are preliminary breakdowns that may change slightly as late exams are scored.

Disclaimer: I have no connection with the company Total Registration, and do not endorse their services.  If the College Board would collect Trevor’s comments themselves, I’d point to that page.  The main interest in AP result distributions comes in May, when students are taking the tests, and July, when the students get the results.

The official score distributions from the College Board (still from 2013 as of this posting) are at https://apscore.collegeboard.org/scores/about-ap-scores/score-distributions, at least until the College Board scrambles their web site again, which they do every couple of years, breaking all external links.  They post a separate PDF file for each exam, which makes comparison between exams more difficult (deliberately, I believe, since inter-exam comparison is not really a meaningful thing to do).  It is also difficult to get good historical data on how the exam scores have changed over time.  The College Board probably has it on their website somewhere, but finding stuff in their morass is not easy.

Views for my 2011 AP distribution post show the May and July spikes. This has been my most-viewed blog post, which is a bit embarrassing, since it has little original content.

My 2013 AP distribution post has not been as popular, probably because of search engine placement at Google.

My most popular post this year was How many AP courses are too many?, with about 10 views per day.  (It has also come in third over the lifetime of the blog, behind 2011 AP Exam Score Distribution and Installing gnuplot—a nightmare.) The question of how many AP courses to take seems to come up both in the fall, when students are choosing their schedules, and in the spring, when students are overwhelmed by how many AP courses they took.

The one AP exam my son took this year was AP Chemistry, for which only 10.1% got a 5 this year and about 53% pass (3, 4, or 5). We won’t have his score for a while yet, so we’re keeping our fingers crossed for a 5.  He finished all the free-response questions, so he’s got a good shot at it.

The Computer Science A exam saw an increase of 33% in test takers, with about a 61% pass rate (3, 4, or 5). The exam scores were heavily bimodal, with peaks at scores of 4 and 1.  I wonder whether the new AP CS courses that Google funded contributed more to the 4s or to the 1s. I also wonder whether the scores clustered by school, with some schools doing a decent job of teaching Java syntax (most of what the AP CS exam covers, so far as I can tell) and some doing a terrible job, or whether the bimodal distribution is happening within classes also.  I suspect clustering by school is more prevalent. The bimodal distribution of scores was there in 2011, 2012, and 2013 also, so it is not a new phenomenon.  (Calculus BC has seen a similar bimodal distribution in past years; the 2014 distribution is not available yet.) Update 2014 July 13: all score distributions are now available, and Calculus BC is indeed very bimodal with 48.3% 5s, 16.8% 4s, 16.4% 3s, 5.2% 2s, then back up to 13.3% 1s. Calculus AB has a somewhat flatter distribution, but the same basic shape: 24.3% 5s, 16.7% 4s, 17.7% 3s, 10.8% 2s, and 30.5% 1s. Overall calculus scores are up this year.  The 30.5% 1s on Calculus AB indicate that a lot of unprepared students are taking that test.  Is this the “AP-for-everyone” meme’s fault?

Physics B scores were way down this year, and Physics C scores way up—maybe the good students are getting the message that if you want to go into physical sciences, calculus-based physics is much more valuable than algebra-based physics. I expect that the algebra-based physics scores will go up a bit next year when they roll out Physics 1 and Physics 2 in place of Physics B, but that the number of students taking the Physics 2 exam will drop a lot.  I don’t expect a big change in the number of Physics C exam takers—schools that are offering calculus-based physics will not be changing their offerings much just because the College Board wants to have more low-level exams.

AP Biology is still seeing the nearly normal distribution of scores, with 6.5% 5s and 8.8% 1s, so there hasn’t been a return to the flatter distribution of scores seen before the 2013 test change.

As always, the “easy” AP exams see much poorer average scores than the “hard” ones, showing that self-selection of who takes the exams is much more effective for the harder exams. When College Board and the high-school rating systems push schools to offer AP, the schools generally start by offering the “easy” courses, and push students who are not prepared to take the exams.  As long as we have stupid ratings that look only at how many students are taking the exams, rather than at how many are passing, we’ll see large numbers of failed exams.

## 2014 January 17

### CS commenters need to learn statistics

There was a recent report about how many students were taking AP CS exams, breaking out the information by gender, race, and state, which has been released in a few different forms.  Mark Guzdial’s blog post provides pointers to the data collected by Barbara Ericson.  Some of the comments provided on that post show an appalling lack of statistical reasoning (like comparing states by subtracting percentages of different things).

So what are the interesting questions to ask of the data and how should they be handled statistically?

Most of the “gee-whiz” statements are about how few people in some group or other took (or passed) the AP CS exam:

• No females took the exam in Mississippi, Montana, and Wyoming.
• 11 states had no Black students take the exam: Alaska, Idaho, Kansas, Maine, Mississippi, Montana, Nebraska, New Mexico, North Dakota, Utah, and Wyoming.

Some people pointed out that some of these numbers may not be more than a small sample effect (no one took the exam in Wyoming, so having zero female test takers is not surprising).  How can we best state that a number is interesting?

Generally, this is done by creating a null model: one that computes the probability of different outcomes based on everything except the hypothesis being tested.  Then you look at how surprising the observed outcome is given the null model.  Exactly how the null model is constructed is crucial, as all that the statistical tests tell you is how badly your null model fits the data.

What sort of mathematical model should we be using for assigning probabilities to numbers of test takers (or numbers passing the test)?  One convenient one is a binomial distribution.  The binomial distributions are a family of distributions over non-negative integers with two parameters N and p.  They are good for modeling the count of a number of independent events each of which occurs with some fixed probability.  If we think of each high school student in a state as having some (small) probability of taking the exam, then the number of exam takers can be modelled as a binomial distribution whose N value is the number of students and p the probability that each one takes the exam.  When N is large (as it would be for the number of high school students in a state) and Np is reasonably large, then the binomial distribution can be approximated by a normal distribution with mean Np and variance Np(1-p), but an even better approximation is to use the Poisson distribution with mean Np, which is what I’ll use here. The probability of zero test takers: $P(0)= \binom{N}{0} p^0 (1-p)^{N-0} = (1-p)^N \approx e^{-Np}$.
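
To get a feel for how close the Poisson approximation is, here is a minimal sketch that compares the exact binomial probability of zero test takers with $e^{-Np}$. The values of `n_students` and `p_take` are hypothetical, chosen only to give an expected count of 2; they are not taken from the AP data.

```python
import math

def prob_zero_binomial(n_students: int, p_take: float) -> float:
    """Exact binomial probability that none of n_students takes the exam."""
    return (1.0 - p_take) ** n_students

def prob_zero_poisson(n_students: int, p_take: float) -> float:
    """Poisson approximation e^(-Np) to the same probability."""
    return math.exp(-n_students * p_take)

# Hypothetical numbers for illustration only (not from the AP data).
n_students = 50_000
p_take = 0.00004  # expected number of test takers Np = 2

exact = prob_zero_binomial(n_students, p_take)
approx = prob_zero_poisson(n_students, p_take)
print(f"binomial: {exact:.6f}   Poisson: {approx:.6f}")
```

With a large N and a small p, the two numbers agree to several decimal places, which is why the rest of the analysis can work with just the expected count Np.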

So all we need to set the parameters of our null model is an expected number of test takers based on everything except what we wanted to test.  For example, if we wanted to test whether black test takers were under-represented in Maine, we would need a model that predicted how many black students would take the test, perhaps using the probability that students in Maine would take the test independent of race and the fraction of students in Maine that are black.  For Maine, there were 161 test takers, and 0 black test takers.  I don’t know the racial mix of high school students in Maine, but Wikipedia gives the black fraction of the whole state population as 1.03%.  Thus the expected number of black test takers is 1.658, and we can use $e^{-1.658}$ as the probability of seeing zero black test takers by chance.

UPDATE: 2014 Feb 1.  Some values in the following table corrected, due to clerical errors in copying from spreadsheet (I’m not sure which I hate worse, spreadsheets or HTML tables—they’re both awful formats).

| State | # test takers | State % black | Expected black test takers | Under-rep p< |
|---|---|---|---|---|
| Idaho | 47 | 0.95% | 0.447 | 0.64 |
| Kansas | 47 | 6.15% | 2.891 | 0.056 |
| Maine | 161 | 1.03% | 1.658 | 0.19 |
| Mississippi | 1 | 37.3% | 0.373 | 0.69 |
| Montana | 11 | 0.67% | 0.074 | 0.93 |
| Nebraska | 46 | 4.50% | 2.070 | 0.126 |
| New Mexico | 57 | 2.97% | 1.693 | 0.184 |
| North Dakota | 9 | 1.08% | 0.097 | 0.91 |
| Utah | 103 | 1.27% | 1.308 | 0.27 |
| Wyoming | 0 | 1.29% | 0 | 1 |
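
The last two columns of the table can be recomputed from the first two. The following is a rough sketch using the corrected test-taker counts; the names `TABLE` and `zero_black_test_takers` are mine, and the null model is the one described above (expected count = test takers × black fraction of the state population, probability of zero = $e^{-\text{expected}}$).

```python
import math

# (state, AP CS test takers, black share of the whole state population),
# using the corrected values from the table above.
TABLE = [
    ("Idaho", 47, 0.0095),
    ("Kansas", 47, 0.0615),
    ("Maine", 161, 0.0103),
    ("Mississippi", 1, 0.373),
    ("Montana", 11, 0.0067),
    ("Nebraska", 46, 0.0450),
    ("New Mexico", 57, 0.0297),
    ("North Dakota", 9, 0.0108),
    ("Utah", 103, 0.0127),
    ("Wyoming", 0, 0.0129),
]

def zero_black_test_takers(test_takers, black_fraction):
    """Return (expected black test takers, Poisson probability of seeing none)."""
    expected = test_takers * black_fraction
    return expected, math.exp(-expected)

results = {state: zero_black_test_takers(n, f) for state, n, f in TABLE}
for state, (expected, p_zero) in results.items():
    print(f"{state:13s} expected = {expected:5.3f}   P(0 black test takers) = {p_zero:.3f}")
```

A probability near 1 means that zero black test takers is exactly what the null model predicts, so the zero says nothing about under-representation.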

Even before we do a correction for having 51 hypotheses (50 states plus District of Columbia), none of these “no black students” states shows significant under-representation of black students. None of the states had so few students that a black test taker would have been surprising (except Wyoming).

One can do similar computations to show that the lack of women in Mississippi, Montana, and Wyoming is not surprising.  Montana looks surprising if treated as a single hypothesis (p<0.004), but not after multiple-hypothesis correction (E-value=0.21). Even combining all three states (which increases the number of hypotheses enormously and would call for a stronger multiple-hypothesis correction), the under-representation of women in those states is not statistically significant.
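
As a sketch of the Montana calculation: the 11 test takers come from the table above, and the 50% expected share of women is my assumption about the null model (it matches the expected 51.5 out of 103 used for Utah). The E-value is simply the single-hypothesis p-value multiplied by the number of hypotheses.

```python
import math

N_HYPOTHESES = 51  # 50 states plus the District of Columbia

def e_value(p_value: float, n_hypotheses: int = N_HYPOTHESES) -> float:
    """Expected number of hypotheses that would look this extreme by chance."""
    return p_value * n_hypotheses

montana_test_takers = 11        # from the table above
assumed_female_fraction = 0.5   # assumption: null model expects half to be women

expected_women = montana_test_takers * assumed_female_fraction
p_zero_women = math.exp(-expected_women)  # Poisson probability of zero women
montana_e = e_value(p_zero_women)

print(f"expected women = {expected_women}, p = {p_zero_women:.4f}, E-value = {montana_e:.2f}")
```

An E-value around 0.2 means that, across 51 states, a result this extreme would turn up by chance about one time in five, which is not rare enough to call significant.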

There are states that do have significant under-representation of women: for example, Utah had 103 test takers, only 4 of whom were women. With an expected number of about 51.5, this is p<1.4E-16. Even with 51× multiple hypothesis correction, this under-representation is hugely significant.  Looking nationwide, total counts were 5485 female test takers out of 29555 total test takers.  That’s p< 1.4E-1677. The highest percentage of female test takers was in Tennessee, with 73 out of 251, which is  p< 2.6E-7, again highly significant.

Tennessee also had a high proportion of black test takers with 25 out of 251.  With an expected number of 42.12, this is p<0.003 (still significantly under-represented).  To see if black students were under-represented nationwide, one would have to add up the expected numbers for each state and see how the actual number compared with the expected number.  (I’m certain that the under-representation is hugely significant since even the states with high numbers of black test takers are under-represented,  but I’m too lazy to do the multiplication and addition needed.)
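
When the observed count is not zero, the relevant probability is the whole lower tail of the Poisson distribution, P(X ≤ observed). Here is a minimal sketch of that computation, done in log space because the nationwide number is far too small for ordinary floating point. The function name is mine, and the 50% expected share of women is an assumption carried over from the Utah example; the expected 42.12 black test takers for Tennessee is taken from the text.

```python
import math

def log10_poisson_lower_tail(observed: int, expected: float) -> float:
    """log10 of P(X <= observed) for X ~ Poisson(expected), computed in log space."""
    # Natural log of each Poisson term: -expected + k*ln(expected) - ln(k!)
    log_terms = [
        -expected + k * math.log(expected) - math.lgamma(k + 1)
        for k in range(observed + 1)
    ]
    biggest = max(log_terms)
    # log-sum-exp: factor out the largest term so nothing underflows to zero
    total = sum(math.exp(t - biggest) for t in log_terms)
    return (biggest + math.log(total)) / math.log(10)

cases = {
    "Utah, women": (4, 103 / 2),
    "Nationwide, women": (5485, 29555 / 2),
    "Tennessee, women": (73, 251 / 2),
    "Tennessee, black students": (25, 42.12),
}

log10_p = {name: log10_poisson_lower_tail(obs, exp) for name, (obs, exp) in cases.items()}
for name, value in log10_p.items():
    print(f"{name:27s} log10(p) = {value:.2f}")
```

Reporting log10(p) rather than p itself is a design choice forced by the nationwide case, where p is on the order of 10 to the power of −1677.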

The case can clearly be made for female and black students being under-represented, though pointing to the states with 0 female or 0 black test takers is not the way to do it. (From a marketing standpoint, rather than a statistical one, shouting “no black test takers in these states” or “no female test takers in these other states” may be exactly the right way to get attention, even though the real story about blacks and females is in the states where there were enough test takers to say something about them after dividing them into subgroups.)

A case could also be made for some states having far fewer CS AP test takers than others.  One would need to come up with an expected number of test takers from some model (for example, by state population as a share of national population, or by number of total AP test takers in state as share of national total AP test takers).  The second model would correct for state-to-state differences in age distribution or in popularity of AP exam taking in general.  One could also base predictions on some other STEM test, such as AP Calculus, if one wanted to control for different amounts of STEM instruction in different states.

Let’s look at the states with no black test takers again, to see if they are significantly under-represented in CS.  There were 29555 AP CS tests taken nationwide and 3,824,691 AP tests nationwide total, so we would expect the CS tests taken in a state to be 0.77% of the total for the state.

| State | # CS test takers | # all AP test takers | Expected CS test takers | p < | E-value |
|---|---|---|---|---|---|
| Alaska | 21 | 4570 | 35.31 | 0.0066 | 0.34 |
| Idaho | 47 | 9723 | 75.13 | 3.3E-4 | 1.7E-2 |
| Kansas | 47 | 15339 | 118.53 | 6.25E-14 | 3.2E-12 |
| Maine | 161 | 14051 | 108.58 | 0.9999 | |
| Mississippi | 1 | 9032 | 69.79 | 3.5E-29 | 1.8E-27 |
| Montana | 11 | 4868 | 37.62 | 3.4E-7 | 1.7E-5 |
| Nebraska | 46 | 11117 | 85.91 | 1.7E-6 | 8.8E-5 |
| New Mexico | 57 | 13365 | 103.28 | 4.7E-7 | 2.4E-5 |
| North Dakota | 9 | 2295 | 17.73 | 0.018 | 0.91 |
| Utah | 103 | 35721 | 276.03 | 5.6E-23 | 2.8E-21 |
| Wyoming | 0 | 2050 | 15.84 | 1.3E-7 | 6.7E-6 |
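
The columns of this table follow from the nationwide totals given above. As a rough sketch (the function names are mine, and only a few rows are shown), the expected number of CS test takers is the state’s total AP tests times the nationwide CS share, the p-value is the Poisson lower tail, and the E-value is the p-value times 51:

```python
import math

CS_SHARE = 29555 / 3824691   # nationwide AP CS tests / all AP tests (about 0.77%)
N_HYPOTHESES = 51            # 50 states plus the District of Columbia

def poisson_cdf(observed: int, expected: float) -> float:
    """P(X <= observed) for X ~ Poisson(expected), summed term by term."""
    term = math.exp(-expected)   # P(X = 0)
    total = term
    for k in range(1, observed + 1):
        term *= expected / k     # P(X = k) from P(X = k - 1)
        total += term
    return total

def cs_under_representation(cs_tests: int, all_tests: int):
    """Return (expected CS test takers, p-value, E-value) for one state."""
    expected = all_tests * CS_SHARE
    p = poisson_cdf(cs_tests, expected)
    return expected, p, p * N_HYPOTHESES

# (state, AP CS test takers, all AP tests) for a few rows of the table above
states = [("Alaska", 21, 4570), ("Maine", 161, 14051),
          ("Mississippi", 1, 9032), ("Wyoming", 0, 2050)]

rows = {name: cs_under_representation(cs, total) for name, cs, total in states}
for name, (expected, p, e) in rows.items():
    print(f"{name:12s} expected = {expected:7.2f}   p = {p:.2e}   E-value = {e:.2e}")
```

The direct sum works here because the expected counts are at most a few hundred; for much larger means the first term would underflow and the sum would need to be done in log space.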

Of these eleven states, eight appear to be under-represented in CS test takers (Maine is significantly over-represented in CS test takers).  When I do the multiple-hypothesis correction for having 51 different “states” (including the District of Columbia), the mild under-representation in Alaska and North Dakota is no longer significant, but the other eight are.

So the zero black AP CS test takers for the nine states can be fairly confidently attributed to the lack of AP CS test takers, and in Maine to the shortage of black students.  For Alaska, the lack of black AP CS test takers is probably due to the shortage of AP CS test takers in the state.

One can generalize the techniques here to any method of predicting the mean number of students in some category, to see whether the observed number is significantly smaller than the predicted number.  When the predicted number is small, even 0 students may not be statistically significant under-representation.

## 2013 June 19

### Millions for a fairly useless new test

According to the College Board Press Release: The National Science Foundation Provides \$5.2 Million Grant to Create New Advanced Placement® Computer Science Course and Exam.

Innovative College-Level AP® Course Created to Increase Interest in Computing Degrees and Careers, Particularly Among Female and Minority Students

To help ensure that more high school students are prepared to pursue postsecondary education in computer science, the National Science Foundation (NSF) is making a four-year, \$5.2 million grant to the College Board’s Advanced Placement Program® (AP®) to fund the creation of AP Computer Science Principles (AP CSP).

The college-level AP CSP course will be introduced into thousands of high schools nationwide in fall 2016, with the first AP CSP Exam set to be administered in May 2017. Unlike computer science courses that focus on programming, AP CSP has been designed to help students explore the creative aspects of computing while also providing a solid academic foundation for understanding the intellectual concepts and practical contributions of computing. AP CSP includes a curriculum framework designed to promote learning with understanding, a digital portfolio to promote student participation throughout the year, and a course and assessment that is independent of programming language.

Successful implementation of the AP CSP course will hinge on the ability to recruit and train qualified teachers with computer science backgrounds to teach the course. Through its CS 10K Project (10,000 computer science teachers in 10,000 high schools by 2016), NSF has been laying the foundation for an unprecedented, national effort to prepare educators to teach this new material using hands-on, inclusive curricula.

I know that Mark Guzdial is fond of the Computer Science Principles (CSP) course that has been prototyped for a few years now at colleges, but I’m not convinced that it represents college-level course work (the supposed intent of AP courses and exams). I don’t know much more about the course than when I blogged about it in 2011, so my opinions in this article may reflect my own lack of knowledge about the course more than anything else.

I’m not saying that CSP is a bad course, or even that it is a bad introduction to computer science, but it seems to me to be at best high-school level. I know that many colleges disagree; the press release says:

In a recent survey of 103 of the nation’s top colleges and universities, 87 percent confirmed that AP CSP requires the same content knowledge and skills as the related introductory college course, and 86 percent indicated a willingness to award college credit for qualifying scores on future AP CSP Exams.

There are also colleges that teach high school algebra and precalculus, but we don’t offer AP exams in them.

My own campus has several intro programming courses, some at the level of the AP CSP course.  I suspect that our campus would offer credit in these low-level courses for the AP CSP exam. These lowest-level courses do not count towards any major, though—they provide elective credit for what should be high-school level courses.  The intent (as is apparently the intent for AP CSP) is to provide an extremely low barrier to entry into the field.

I don’t know how well the low barrier to entry works, though.  I’ve not seen much evidence on our campus that the lowest level courses produce many students who continue to take higher level CS courses. Of course, I’ve not tried to get reports on that from the campus academic planning office, as I have enough to do without meddling in the affairs of other departments.  We still have appallingly low numbers of women finishing in CS (and the new game-design major within CS is even more heavily male), so I can’t say that the lower-level intro courses have done much to address the gender imbalance.

The success of CSP also depends on thousands of high schools suddenly deciding to teach the course and getting training for their teachers to do this. I (along with many others) have grave doubts that the schools have the desire or the ability to do this. It is true that the CSP course should be a bit easier to train people for than the current AP CS A course (if only because Java syntax, the core of CS A, is so deadly dull).

Even if, by some miracle, the NSF manages to train 10,000 teachers to program well enough to teach programming, the result is likely to be underwhelming.  I suspect that many will leave teaching—many of the math teacher bloggers I’ve followed who learned to program have moved out of teaching into being full-time programmers.    The result of the 10K project may not be a huge increase in high school CS teachers, but a loss of some of the better math teachers and the production of a core of under-trained programmers.

The justification for the new AP CSP course is that it will drive many more students into computing fields. The College Board continues to confuse correlation with causation:

Research shows that students who took college-level AP math or science exams during high school were more likely than non-AP students to earn degrees in physical science, engineering and life science disciplines — the fields leading to careers essential for the nation’s future prosperity.

Students wanting to do STEM fields in college often chose that path in high school, and took as many STEM courses as they could in order to get into good colleges.  Quite likely, wanting to do STEM in college caused them to take AP exams, not the other way around.

I’m not defending the current AP CS exam—from what I’ve heard about the AP CS A course and exam, it is mainly about Java syntax.  Personally, I think that Java is a poor pedagogical choice for a first programming language (I still favor the sequence Scratch, Python, C, Java), and using it as the language for the AP CS exam forces high schools into poor pedagogy.  The new CSP exam is not supposed to be so language-dependent, which may allow for better pedagogy.

Of course, I’m curious how the exam will be written to be language-independent, and whether it will be able to make any meaningful measurements of what the students have learned.  I’ve never been convinced that exams do an adequate job of measuring programming skills, and I’m not sure what the new exam will measure since the new course is “unlike computer science courses that focus on programming”.

I suspect that the easier AP CSP will replace AP CS A at many high schools, and that CS A will disappear the way that CS AB did in May 2009 (Gresham’s Law for pedagogy: easier courses drive out harder ones).  Whether this is a good or bad outcome depends on how good the AP CSP course turns out to be.

Overall, I’m simply not convinced that the College Board needs federal funding of \$5.2 million to develop a new exam.  They are going to make enough money off the new exam that they should be able to fund it without subsidies.

## 2013 June 11

### 2013 AP Exam Score Distribution

Filed under: Uncategorized — gasstationwithoutpumps @ 22:50

As they’ve done for the past couple of years, Total Registration is once again collecting Trevor Packer’s tweets about the AP score distributions as they come out: 2013 AP Exam Score Distributions.

I don’t know why the College Board waits until December to release the statistics officially, letting Total Registration get a lot of web hits for information that College Board could easily host themselves.  Even my third-hand posts, 2012 and 2011, which are just links to the Total Registration site, get a lot of views, thanks to good placement on the Google page.  In fact, the 2011 post is one of my most-viewed posts:

Stats summarizing the 17,584 views for my 2011 post pointing to the AP stats. Notice how the results peak during the tests and when the test results are released.

Incidentally, Total Registration has an ulterior motive for posting the AP scores—it drives teachers to their site, and they sell an online registration service to schools, since College Board has never bothered to set up their own online registration.  College Board doesn’t do it (though they do online registration for SAT and SAT2), because they don’t require testing centers to offer all the AP exams and they can’t be bothered to figure out which schools will offer which AP exams.

So far only the AP Computer Science results are posted, and the results will change once the late exams are posted (my son took the late exam, so I’m sure there is at least one more 5 there!).
