Gas station without pumps

2010 November 25

Comments on “The Shadow Scholar”

Filed under: Uncategorized — gasstationwithoutpumps @ 00:09

The Chronicle of Higher Education recently published an article titled “The Shadow Scholar”, purportedly by a writer who makes his living selling papers for students to turn in as their own. This sort of cheating has a long history, but the Internet has made it easier by making communication and search easier. It is now very easy for a lazy student (with money) to find someone to do their work for them. Because they can draw on unethical writers from all over the world, it is unlikely that they will accidentally purchase a document that the professor will recognize as having seen before. (I suspect that most term paper services are even less ethical than the Shadow Scholar, who claims to have created custom documents for each client, and that they will resell the same paper repeatedly for any assignment that it comes close to matching.)

In my classes, I’ve never graded a paper that I thought had been purchased. Most of the papers I read seem to be in the voice of the student submitting them, and usually have a direct connection to things done in class, making them hard to fake from a distance. I do have occasional problems with students plagiarizing from the Web, but even there the problem is more often one of inadequate citation than of a deliberate attempt to deceive about authorship.

Many edubloggers have commented on the shadow-scholar article. For example, Mark Guzdial is concerned that

It’s even easier to cheat with code, since there are fewer degrees of freedom.  My guess is that cheating as he describes [it] is even more prevalent in computer science.

Cheating in beginning programming courses is certainly a common problem, but there are some good programs available for detecting submissions that are suspiciously similar to other submissions (from the same or previous years).  I know that the computer science department at our university routinely checks the submissions in the beginning programming classes, and flunks some students for cheating every year.  No matter how many times students are warned both of the high probability of being caught and of the serious consequences of getting caught, there are always idiots who think they are immune.  (They are usually the stupidest, most ego-centric students, so the University is better off catching them early and throwing them out.)

I found Katrin Becker’s comments interesting also:

There are also people I know personally who conveniently look the other way when students (especially their own) produce work that is suspiciously good. One colleague of mine has theorized that fully 1 in 5 faculty members got to where they are now through some form of plagiarism.
1 in 5.
The corporatization of Higher Ed is a significant influence. When everything becomes about money then everything acquires a price tag.

Personally, I doubt that there are that many plagiarists among the faculty, but I do agree that the attempt to make a college education into a commodity has resulted in loss of integrity among students. Many feel that the large fees they pay for their education entitle them to good grades and a degree, independent of what they actually manage to do. Many politicians support this view, rewarding universities for having high retention and large percentages of entering students graduating within 4 years.

Some students should not be retained, some should not graduate. Getting a degree from college should not just be a matter of putting in seat time and paying tuition for 4 years.  (Of course, graduating from high school or elementary school should also not just be a matter of putting in seat time, but I fear that battle was lost decades ago.)

2010 November 24

New video camera

Filed under: Uncategorized — gasstationwithoutpumps @ 00:07

I bought my first video camera recently.  I wanted it for two purposes: to record my son’s theatrical performances and to record student presentations in class (both practice talks in the how-to-be-a-graduate-student course and first-year grad student lab rotations).

Of the two main tasks, the more demanding one is recording stage plays.  The lighting is often quite low (partly to establish mood, but partly because theatrical lights are expensive, and the children’s theater group my son performs with does not have that much money).  I set out to find the best low-light camcorder I could afford.

It turns out that finding reviews of cameras on the web is easy, but finding really useful information is difficult.  The most informative reviews that I could find were at

Introduction Product Tour Color & Noise Motion & Sharpness
Low Light Performance Compression & Media Manual Controls Still Features
Handling & Use Playback & Connectivity Audio & Other Features Comparison (with competitor)
Comparison (with competitor) Comparison (with competitor) Conclusion Specs and Ratings

Based primarily on their reviews (checking with a few other sources to see if there were disagreements), I reduced my choices to two cameras: JVC Everio HD620 and Panasonic HDC-TM700.  These were the two cameras that had the best low-light performance in their tests.  The Panasonic appeared to be the better camera in almost all respects, but it was twice the price, and I am still not certain I will do much video recording.

I also talked with my brother, who has been making a series of video recordings (under better lighting conditions) for a web project that he is working on.  He spoke highly of the Panasonic camera and warned me to look at other considerations as well as just the low-light resolution, noise, and color rendition reported in the reviews.

The price was the deciding factor for me, and I got the JVC camera.  If I decide I like to record video, then in a year or two I can get a much better camera at a lower price than I could get one now.

I used the camera first to record students doing presentations in a normally lit classroom. The camera worked fine for this purpose, though I found my tripod a little harder to use for smooth panning and tilting than I had expected, and the zoom control on the camera is awkward: the speed of the zoom is difficult to regulate. Part of the problem with the tripod is that the tripod mount for the camera is not below the center of gravity for the camera, so the camera will tilt upward unless there is enough friction to resist it, but then the stick-slip friction results in jerky motion. Putting a bigger battery on the back of the camera for longer recording time makes the problem even worse. Panning was less of a problem than tilting, since there is no tendency for the camera to rotate.

This past weekend, my son had 4 play performances with his WEST Performing Arts theater class.  There were 2 performances for each of 2 casts. All the kids were in both casts, but with different parts.  This is a common trick used at WEST to give more kids a chance at big roles and to develop some experience with repertory theater, even in a single class.  I decided to record all 4 performances: 2 using a still camera (a Canon G10), as I have done for many previous performances, and 2 using the new video camera.  I took still photos for the first play by each cast, and videos of the second play by each cast.  Because there were different casts for each performance (yes, all 4 were different, due to kids getting ill during the weekend), I can’t edit different video recordings together.

So how well did my first attempt work?  Did the camera live up to my expectations?

Well, yes and no.  The video camera was much better at low-light recording than my Canon G10 (even though I set the shutter speed on that camera to 1/40 second and used an ASA 800 sensitivity on the Canon).  The corner of the stage that always came out very dark in the still photos looked quite normal in the videos.  But the camera has some serious problems for recording stage plays.

First, its automatic gain control on lighting is very, very slow to respond to increases in light.  When they turn on the lights at the beginning of a scene, the image whites out and becomes unusable for several seconds. I had expected this problem, but it was worse than I had expected.

Second, and more problematic, the autofocus does not work well in low lighting, particularly not bluish low lighting, taking as long as 15 seconds to focus, and being seriously out of focus (to the point where people are just blobs) for most of that time. Manual focus is not really an option on the JVC camera, since it can only be done through the awkward touch screen, which does not allow adequate control for anything which requires refocusing while filming. Again, I had expected this problem (since the still camera also has autofocus problems in low light), but it was much worse than I expected.

Third, the 30x zoom on the camera, which I thought would be very useful, turns out to be nearly useless, because the camera does not have a sufficiently wide angle in its widest setting. I could zoom in on someone’s fingernail if I wanted to (and could track it), but I can’t zoom out enough to get any of the ensemble scenes, and even some of the dialog required me to pan back and forth between the speakers. I ended up at 1x or 2x for the entire filming, and wishing I were in the center of the zoom range, instead of stuck at one end. I had not expected this problem. It had not occurred to me that they would short the wide-angle capability that much, especially in a camera that is designed to be mainly handheld, so the extreme telephoto would be useless.

Another annoyance is that it takes about 45 minutes to transfer an hour of video recording from the camera to my laptop, and it will take me even longer (5 hours?) to convert that to a format that can be put on a flash drive or DVD to give back to the teacher and the other parents.

I have thought of something I can do to fix up the bad behavior of the camera (light bloom and out of focus) for the beginning of each scene, though it will be a lot of work.  Assuming I have a still photo of the beginning of the scene from my photographing the earlier production, I may be able to use the Ken Burns technique to replace the bad visual recording with some panning and zooming on the still photo, keeping the audio from the video recording.  It’ll take me some time to figure out how to do that in iMovie, which is the only video-editing software I have.  I’ll also have to figure out how to do scrolling credits with the cast list at the end and get the correct names of the kids: I know that one of the names on the program is misspelled, but I don’t know the correct spelling.

The post-production work, even for a very crude recording of the plays, will probably take much longer than I usually take to crop and clean up a dozen or so still photos out of the 200–300 I usually take during a performance.

2010 November 23

Few taking AP CS A exam

Filed under: Uncategorized — gasstationwithoutpumps @ 00:05

I recently looked at the statistics on number of students taking AP exams, by state. This was prompted by a post on Mark Guzdial’s blog, which provided a summary of the state-by-state statistics for the AP Computer Science A exam.

The discussion on Mark’s blog indicates the difficulty in interpreting statistics: the post started out congratulating Georgia high-school teachers for the increase in the number of students taking the AP CS A exam and improvement in scores, but commenters on the blog pointed out that the demise of the CS AB exam meant that the total number of students taking AP CS exams had dropped and that the rise in scores probably reflected the students who in the past would have taken the more difficult CS AB exam taking the CS A exam instead. The bottom line seems to be that there has been no improvement in Georgia high-school teaching of CS, and probably some loss of quantity or quality.

Of course, I’m not in Georgia, so I was more interested in California, and how it was doing. In the state-by-state figures, California had the second highest number of AP CS A test takers in 2010 (after Texas).  Of course, both are big states, so per capita rates are more interesting. Texas drops to number 4 and California to number 9.  One state stands out as having had a high number of AP CS A test takers: Maryland at 241 test takers per million population.  The next highest was Virginia at  150.  California had 76 test takers per million.  The lowest state (as in most measures of educational attainment) was Mississippi at 1.7 test takers per million.  I am a bit worried about how California plans to retain anything in Silicon Valley if so few Californians are learning to program.  Somehow I’m having trouble imagining Baltimore taking over as the center of the computer industry, but stranger things have happened. Maryland’s average score on the AP CS A exam was 3.03,  somewhat lower than California’s 3.34 and the national 3.14, but not so low that Maryland can be accused of packing the exam with unprepared test takers. Maryland, California, and the nation as a whole have bimodal distributions, with scores of 5 and 1 being the most common.

I looked over the national statistics for 2010, and saw that many of the exams that require math had this sort of bimodal distribution. Fields with multiple exams (like Calculus or Physics) tended not to have the bimodal distribution on the harder exams, with scores peaking at the high end on the hard exams. This is not too surprising, as students who fail the easy exam are unlikely to go on and fail the hard one as well. The humanities fields tend to have unimodal distributions centered around 3, rather than with peaks at the ends like the math-based exams. Looking over all fields, the lowest mean scores were on Human Geography (2.46) and the highest on Chinese Language and Culture (4.56, but only taken by 4832 students, probably mostly native speakers). The test with the most 5s is Calculus AB (48752), and the test with the most 1s is also Calculus AB (79457). That’s not because it is the most common test; that would be US History, with 384566 test takers, but the most frequent score there is a 2, rather than a 1 or 5.

The Computer Science AP A exam had only 19390 test takers (versus 236502 for Calculus AB and 75132 for Calculus BC), so there are about 16 times as many high school students getting college-level calculus classes as college-level computer programming classes. This ratio looks wrong to me, as there are far more jobs that require programming than there are that require calculus. (OK, job preparation is not the main purpose of high school education, but I could argue for the greater improvement in cognitive skills that comes from programming rather than calculus also.) My own son’s high school doesn’t offer any computer programming, though they do have Calculus AB and BC. Perhaps the problem is that there are not enough unemployed programmers retraining to be underemployed teachers. It may be easier to convince math teachers to learn programming than to convince programmers to become teachers (of course, once the math teachers have learned enough programming to teach it competently, many will drift off to industry to get the higher pay).
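The "about 16 times" figure can be checked with a few lines of Python. This is only a sketch of the arithmetic, using the three test-taker counts quoted above. It assumes that adding the Calculus AB and BC counts is a fair stand-in for the number of calculus students, which counts exams rather than distinct people.

```python
# Test-taker counts for 2010, as quoted in the post.
CS_A = 19_390
CALC_AB = 236_502
CALC_BC = 75_132

# Assumption: AB + BC exams approximate the number of calculus students.
calculus_total = CALC_AB + CALC_BC
ratio = calculus_total / CS_A

print(f"calculus exams: {calculus_total:,}")
print(f"calculus exams per CS A exam: {ratio:.1f}")
```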


2010 November 22

Computer Science Education Week

Filed under: Uncategorized — gasstationwithoutpumps @ 00:04

December 5–11, 2010 is Computer Science Education Week. According to the web site,

CSEdWeek aims to:

  • Eliminate misperceptions about computer science and computing careers
  • Communicate the endless opportunities for which computer science education prepares students within K-12, and into their higher education and careers
  • Provide information and activities for students, educators, parents, and IT professionals to advocate for computer science education at all levels

I guess that misperceptions about punctuation are not included in their agenda, as they have incorrectly placed a colon between the word “to” and the infinitive verbs of the list, and omitted the punctuation between the list elements.  If they made those sorts of punctuation errors in their programming, they’d never get anything through their compilers.

Their grasp of logic and rhetoric is also weak in their justification:

Why is Computer Science Education Important?

  • It exposes students to critical thinking
  • It is essential for success in the digital age
  • Too few students are exposed to opportunities presented by computer science

The list is not parallel, and the last sentence provides no explanation for why CS education is important.  It expresses a shortage of CS education, which may be a consequence of CS education being important and rare, but is not a reason for CS education being important.  If their programs involved this sort of circular reasoning, they’d have infinite recursion all over the place.

ACM and its partners in CSEdWeek are asking people to take the CSEdWeek pledge:

I pledge to participate in and/or support (no donation required) Computer Science Education Week (CSEdWeek), December 5-11, 2010, to raise awareness of the role computing plays in all our lives and to promote computer science education for all students.

Sounds good, except that there don’t seem to be any activities to participate in or support.

Mark Guzdial, who is on the steering committee for CSEdWeek, says on his blog that there will be a new web site up by Nov 29 (lots of lead time there for a December 5 event!), and asks people to

sign up to do something, from simply blogging on CSed Week, to speaking to a group of high school students about CS Ed, to working with groups of undergraduates to visit a bunch of elementary schools and give demos of Alice and Kodu.  Invite others to pledge, as well, to show that the CS education community is active and cares about promoting computing education.

Well, I’ve blogged now about CSEd Week, but I doubt that they’ll get much traction this year.  Their advertising for the (non)event is too little and too late to have any noticeable effect outside their own committee meetings.  If the event continues to exist, perhaps future years will have something that is marginally visible to the outside community (sort of like National Engineering Week, 20–26 Feb 2011).

Incidentally, if someone is planning to give demos in elementary schools, I’d recommend Scratch rather than Alice or Kodu.

2010 November 21

What are the odds of killing 2 bicyclists?

Filed under: Uncategorized — gasstationwithoutpumps @ 00:05

A truck driver was recently involved in a crash, with no witnesses, that killed a bicyclist. It turns out that he had also been involved in a crash a few years ago that killed a bicyclist. What are the odds that he is just unlucky (as opposed to the alternative hypothesis that he is a bad truck driver)?

Here is a news story on the second crash:

Truck driver in Alpine Road collision also involved in fatal 2007 crash in Santa Cruz
By Jesse Dungan

The driver of a big rig that killed a female bicyclist near Portola Valley earlier this month was involved in a similar crash in Santa Cruz three years ago that left a popular high school teacher dead, a California Highway Patrol spokesman confirmed Monday.

The driver, Gabriel Manzur Vera, was determined not to be at fault in the Santa Cruz collision, said CHP Officer Art Montiel. Vera also has not been charged in the Nov. 4 crash on Alpine Road, he said. …

The truck driver’s employer recently settled a wrongful-death suit for the first death, paying $1.5 million.

How can we estimate the odds and determine whether this can reasonably be expected as a chance event?

As always when we want to compute the probability of a chance event, we need to define our null model carefully, and make sure that we use appropriate corrections for multiple hypotheses.  We probably want our null models to slightly overestimate the probability of killing bicyclists, so that our models will err on the side of saying that the event of one driver killing 2 bicyclists is not strange enough to be statistically significant.

Let’s look at two simple null models, one based on number of drivers, one based on miles driven.

First, how likely is any driver to kill a bicyclist in a given year? In their lifetime? What about the probability of killing two bicyclists in separate incidents in their lifetime?  How many such double-death motorists should we expect to find in the US?

Second, how likely is a driver to kill a bicyclist for each mile driven? How many miles does a truck driver drive in a year?  How likely does that make it for a given truck driver to kill a bicyclist in a year?

If the probability that a driver kills a cyclist in a year is p, and the events are independent events, unrelated to the driver’s skill, then the chance that a driver will ever kill a cyclist in a 60-year driving career is 1-(1-p)^{60} and the chance that a driver will kill 2 or more cyclists in separate years is about 1 - (1-p)^{60} - 60 p (1-p)^{59}, which we can rewrite as 1- (1-p)^{59}(1+59p). I picked a 60-year career as an over-estimate for most truck drivers, as we are trying to overestimate the number of deaths.
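The rewriting step can be checked by factoring (1-p)^{59} out of the last two terms. The sketch below spells that out and adds a small-p approximation that is not in the original text but follows from keeping only the leading term. It assumes the amsmath package for the align* environment.

```latex
\begin{align*}
P(\text{2 or more})
  &= 1 - (1-p)^{60} - 60\,p\,(1-p)^{59} \\
  &= 1 - (1-p)^{59}\bigl[(1-p) + 60p\bigr] \\
  &= 1 - (1-p)^{59}\,(1 + 59p) \\
  &\approx \binom{60}{2}\,p^{2} = 1770\,p^{2}
     \quad \text{(leading term, for small } p\text{)}
\end{align*}
```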

There were about 716 bicyclists killed in motor vehicle crashes in 2008. The number has fluctuated between about 630 and 830 over the past two decades, so 716 seems fairly typical. But since we want to overestimate the probability of killing bicyclists, let’s round up to 990 a year, more than BTS has recorded in any year since about 1975.

In 2006 (the latest year I could find numbers for) there were 202,810,438 licensed drivers. The number was probably higher in 2008, but we are trying to overestimate the probability of killing a bicyclist, so we should underestimate the number of drivers. With this number and the overestimate of bicyclist deaths, we can estimate that one bicyclist is killed each year per 205,000 drivers. The chance that a driver ever kills a cyclist in a 60-year career is about 0.0003. (Double-checking, that comes to 60,843 cyclists killed in 60 years, a high number, but consistent with our over-estimates.) The chance that a driver kills two cyclists in separate incidents over a 60-year career is about 1 in 23,700,000, so the expected number of people who would do that is about 8.5. Using this model, we would expect about 8 or 9 double-killers in the US in 60 years, so the event is rare, but not so rare that we can immediately claim that killing 2 cyclists is proof of incompetent driving. (It certainly raises that suspicion very strongly, though.)
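The driver-count model can be reproduced numerically. The following is a minimal sketch, not the original calculation: it plugs the post's figures (990 deaths a year, 202,810,438 drivers, a 60-year career) into the formulas given earlier, and the variable names are made up for illustration.

```python
# Null model 1: every licensed driver has the same yearly chance of
# killing a bicyclist, independent of skill and of other years.
DEATHS_PER_YEAR = 990      # deliberately high figure used in the post
DRIVERS = 202_810_438      # licensed drivers in 2006, from the post
CAREER_YEARS = 60          # deliberately long driving career

p = DEATHS_PER_YEAR / DRIVERS          # yearly probability per driver

# Chance of at least one fatal crash during the career: 1 - (1-p)^60
p_ever = 1 - (1 - p) ** CAREER_YEARS

# Chance of fatal crashes in two or more separate years:
# 1 - (1-p)^59 * (1 + 59p)
p_two = 1 - (1 - p) ** (CAREER_YEARS - 1) * (1 + (CAREER_YEARS - 1) * p)

# Expected number of such drivers among all licensed drivers.
expected_double = DRIVERS * p_two

print(f"one death per year per {1 / p:,.0f} drivers")
print(f"P(ever kills a cyclist) = {p_ever:.6f}")
print(f"P(two or more) = 1 in {1 / p_two:,.0f}")
print(f"expected double-death drivers in 60 years: {expected_double:.1f}")
```

Because the sketch uses the unrounded ratio 990/202,810,438 rather than 1 in 205,000, its results may differ slightly in the last digits from the rounded figures in the text.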

There were about 2,973,509 million miles driven in 2008 (about 3 trillion), so, with our overestimate of 990 deaths a year, there is a bicyclist killed about every 3 billion miles driven.

We’re trying to overestimate the probability of a truck driver killing a bicyclist, so we need to overestimate (somewhat) the number of miles a truck driver drives. Let’s assume he drives 10 hours a day, 364 days a year at 55 mph. That would be about 200,000 miles a year, or 12 million miles over 60 years (this is a somewhat large overestimate). There are about 3.5 million truck drivers driving 400 billion miles a year, or about 114,000 miles a year each, so the lifetime driving of a truck driver is overestimated only by a factor of 2–4.

The probability of a particular truck driver ever killing a bicyclist is about 1 - (1-1/3000000000)^{12000000} (treating each mile driven as independent), which we can estimate by using the approximation (1-x/N)^N \approx e^{-x} for large N, here with x = 12000000/3000000000 = 0.004. So the probability of a truck driver ever killing a bicyclist is about 1 in 250. Checking, we would expect 3.5 million/250 or 14000 bicyclists killed by trucks over 60 years in the US. I don’t have numbers for bicyclists killed by trucks, but 233 a year seems reasonable, given the number killed by all motor vehicles combined.
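The per-mile estimate can be checked with a short script. This is a sketch under the post's assumptions (one death per 3 billion miles, a 12-million-mile career, 3.5 million truck drivers). Using `math.log1p` and `math.expm1` is my own choice, made to avoid rounding loss when p is this small, and is not something the original calculation specifies.

```python
import math

MILES_PER_DEATH = 3_000_000_000   # about one bicyclist death per 3 billion miles
CAREER_MILES = 12_000_000         # overestimated 60-year trucking career
TRUCK_DRIVERS = 3_500_000         # truck drivers in the US, from the post

p_mile = 1 / MILES_PER_DEATH      # chance of a fatal crash in any one mile

# Exact form 1 - (1-p)^N, computed through logarithms for accuracy.
log_no_death = CAREER_MILES * math.log1p(-p_mile)
p_ever = -math.expm1(log_no_death)

# The approximation (1 - x/N)^N ~ e^{-x}, with x = N * p.
x = CAREER_MILES * p_mile
p_ever_approx = 1 - math.exp(-x)

# Sanity check: bicyclists killed by trucks over 60 years, and per year.
deaths_60_years = TRUCK_DRIVERS * p_ever

print(f"x = {x:.4f}")
print(f"P(ever) = 1 in {1 / p_ever:.1f} (approximation: 1 in {1 / p_ever_approx:.1f})")
print(f"deaths in 60 years: {deaths_60_years:,.0f}, per year: {deaths_60_years / 60:.0f}")
```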

The probability of killing 2 or more bicyclists in separate miles is 1 - (1-p)^{N} - N p (1-p)^{N-1}, where p is the probability of killing in any given mile, and N is the number of miles. We can rewrite that as 1 - (1-p)^N (1+Np/(1-p)), which for our truck-driver estimates is about 1 in 125,000. With 3.5 million truck drivers around, we would expect to find about 28 double-bicyclist-death truck drivers by chance in 60 years.
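A minimal Python sketch of this last step, again under the post's assumed figures. The Poisson cross-check is an addition of mine: with N large and p tiny, the number of fatal crashes is nearly Poisson with mean Np, so the two results should agree closely.

```python
import math

p = 1 / 3_000_000_000      # chance of killing a bicyclist in any given mile
N = 12_000_000             # miles in the overestimated career
TRUCK_DRIVERS = 3_500_000  # truck drivers in the US, from the post

# (1-p)^N, computed through log1p to limit rounding error.
p_none = math.exp(N * math.log1p(-p))

# Two or more deaths: 1 - (1-p)^N * (1 + N p / (1-p))
p_two = 1 - p_none * (1 + N * p / (1 - p))

# Cross-check with a Poisson model of mean lam = N p (my assumption).
lam = N * p
p_two_poisson = 1 - math.exp(-lam) * (1 + lam)

expected_double = TRUCK_DRIVERS * p_two

print(f"P(two or more) = 1 in {1 / p_two:,.0f}")
print(f"Poisson check  = 1 in {1 / p_two_poisson:,.0f}")
print(f"expected double-death truck drivers in 60 years: {expected_double:.1f}")
```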

With both models, using overestimates of how likely bicyclists are to be killed by chance, we get an expected number of chance double-death truck drivers of between one such driver every 2 years in the country (28 in 60 years) and one every 7 years (8.5 in 60 years). This means that we can’t completely reject the null model (that the driver was just unlucky enough to have 2 chance encounters), but our suspicions about the driver should certainly be raised, and other evidence checked to see whether the driver is really incompetent to be driving a truck.

Late-breaking news: It seems that Gabriel Vera was involved in 3 fatal accidents, not just two, so there is no question in my mind that he is not just an unlucky driver and that he should never be allowed to drive again.
