Gas station without pumps

2013 October 10

xkcd: Null Hypothesis

Filed under: Uncategorized — gasstationwithoutpumps @ 22:10

I have a number of my favorite xkcd comics on the bulletin board outside my office, including this one:

(Click on the image to go to the website to get the mouseover—I don’t have it on my bulletin board, but it is worth the effort of clicking.)

Today I had a student ask me to explain the joke, or more precisely, to explain what “the null hypothesis” was.  I did so, of course, explaining that a p-value, which measures how likely a result is “by chance”, needs a formal definition of “chance”: the null model or null hypothesis.  I even explained that all statistical tests do is allow you to reject (or not reject) the null hypothesis; they tell you nothing about the hypothesis you actually want to test.

Given the casual nature of the question, I did not go into detail about how important it is to choose or construct good null models, ones that contain all the explanations other than the hypothesis you hope to test.  I normally spend a full lecture on that in my bioinformatics course, as well as one of the weekly homework assignments, having the students program different null models for an open reading frame that result in very different p-values for a protein-coding gene detector.
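To make the point about null models concrete, here is a minimal sketch. It is not the homework assignment itself, just an illustration under stated assumptions. It treats a reading frame as independent bases drawn from fixed base frequencies and asks how likely it is that a given frame of 300 codons contains no stop codon. The function names, the 300-codon length, and the 70% GC composition are all hypothetical choices made for the example, and the calculation ignores the multiple-testing problem of scanning many positions and frames.

```python
STOP_CODONS = ("TAA", "TAG", "TGA")

def stop_probability(base_freqs):
    """Probability that three independently drawn bases form a stop codon."""
    total = 0.0
    for codon in STOP_CODONS:
        p = 1.0
        for base in codon:
            p *= base_freqs[base]
        total += p
    return total

def orf_p_value(n_codons, base_freqs):
    """P(a given frame runs n_codons codons with no stop) under the null model."""
    return (1.0 - stop_probability(base_freqs)) ** n_codons

# Two null models that differ only in base composition (both are assumptions).
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
gc_rich = {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15}  # hypothetical 70% GC genome

for name, freqs in (("uniform", uniform), ("GC-rich", gc_rich)):
    print(name, orf_p_value(300, freqs))
```

Under these assumptions the GC-rich null makes a 300-codon open reading frame thousands of times more probable than the uniform null does, so the same observation can look overwhelming under one null model and unremarkable under another.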

Normally, I enjoy this sort of conversation with students—I like students who are curious and who are unafraid to ask questions to clear up things that confuse them.  Today I was a little disturbed by the question, as the student had been in my office to get a signature on an approval form for a senior thesis in bioengineering.  How had the student gotten that far in our program without having learned what a null hypothesis is?  Where is the hole in our curriculum that allows that, and how can we fix it in the curriculum redesign this year?

I realize that no curriculum design can completely cure the cram-and-forget disease that infects many college students, but I did not get the impression that this was a student who had known it once but forgotten.  Rather, I had the impression that the concept was a new one, though the name might have appeared before.

On looking over the bioengineering curriculum I see that students can take a probability course without any statistics course; perhaps that is what happened here.  Unfortunately, biomolecular experimentalists have to be very familiar with null hypotheses and statistical tests, so I think we have to patch the curriculum to make sure that all the students get statistics.

2012 September 30

Bad question for math classes

Filed under: Uncategorized — gasstationwithoutpumps @ 11:02

Mr. K, a math teacher, in his blog post Math Stories : Quick check for understanding, suggested asking students the following question:

Pick two numbers, randomly. What is the probability that the product will be even?

This question illustrates the standard problem with the way high schools teach (or rather don’t teach) probability, a problem that still plagues scientists even after they are in grad school. The problem is that “picking randomly” is not an adequate description of a process, as no distribution is given. For example, I can pick two numbers randomly and always get an odd product, if the distribution I pick from has zero probability on the evens.

There are other problems with the question, like what Mr. K meant by “numbers”, but we’ll be generous and assume that in his classes “number” means “integer”, since the odd/even question otherwise makes no sense. It is also possible that Mr. K meant “decimal digit”, in which case the complaint I’m making below does not apply, and I’d have to change my critique to the sloppiness of using “number” to mean “digit”, but in my experience math teachers are fairly careful about the number/digit distinction, but sloppy about the word “randomly”, and so I will continue as if the problem is with the use of “randomly”. (Aside: that last, overly long sentence is an informal version of Bayesian reasoning.)

Often, when no distribution is given, people mean that they used the uniform distribution.  More specifically, they usually mean an independent, identically distributed (i.i.d.) process drawn from a uniform distribution, that is, one where each item is chosen independently from the same underlying distribution, which happens to be a uniform distribution.  This is a mathematically easy choice to analyze, though it is often a poor choice for a null model. Bayesian statistics was panned for centuries because of the poor choice of uniform priors in the initial development of the methods.

But for a countably infinite set like the integers, a uniform distribution can’t be defined—some numbers must be more probable than others or you can’t get the probabilities to sum to one over all integers. (You have the same problem with integrals if you choose the real or complex numbers.)  That means the standard meaning of “pick randomly” doesn’t exist, and the question is badly formed.
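The argument fits in one line. Suppose, purely for the sake of contradiction, that every integer n had the same probability c (the symbols P and c are introduced here only for illustration):

```latex
\sum_{n \in \mathbb{Z}} P(n) \;=\; \sum_{n \in \mathbb{Z}} c \;=\;
\begin{cases}
0 & \text{if } c = 0,\\
\infty & \text{if } c > 0,
\end{cases}
\qquad \text{so the total can never equal } 1 .
```

Any valid distribution on the integers therefore has to give some integers more probability than others.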

Mr. K probably meant for students to pick integers randomly from a distribution in which odd and even numbers were equally likely, in which case his question becomes trivial, with the probability of an even product being ¾.  We can even go a step further and say that if the probability of choosing an odd number in the underlying distribution is p, then the probability that the product is odd is p², so the probability that it is even is 1 − p².
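The dependence on the distribution is easy to check with exact arithmetic. This is a small sketch, assuming two independent draws from the same distribution; the function name `prob_even_product` is made up for the example.

```python
from fractions import Fraction

def prob_even_product(p_odd):
    """P(product of two independent draws is even), given P(one draw is odd)."""
    p_odd = Fraction(p_odd)
    # The product is odd only when both draws are odd.
    return 1 - p_odd ** 2

print(prob_even_product(Fraction(1, 2)))  # 3/4
print(prob_even_product(Fraction(1, 3)))  # 8/9
print(prob_even_product(1))               # 0
```

The last case is the distribution with zero probability on the evens: the product is never even, so the answer to the original question depends entirely on the distribution that was left unstated.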

The important part of the question is not the properties of odd and even numbers or of multiplying probabilities of independent events, which I believe are the understandings that Mr. K was interested in, but the importance of knowing what distribution you are drawing from, which I don’t believe Mr. K gave any thought to.

The same problem comes up all the time when biologists (and, presumably, other scientists) try to use p-values to establish statistical significance.  The p-value is thought of as the probability that the observed data could have arisen “by chance”, which is as bad a phrase as “picking randomly”. The crucial question is what that “chance” process is—for what distribution are you computing the probability?  This is the “null model” or “null hypothesis” and for the p-value to be at all interpretable, the null model must include every phenomenon other than the hypothesis you are testing.  That is, your null hypothesis is as complicated as your hypothesis, but leaves out one (hopefully crucial) idea.

Most scientists choose a mathematically tractable model that is unlikely to fit any real data (like an i.i.d. process with a uniform distribution).  They then show that this ridiculously simple model can’t explain the data, and jump from there to the conclusion that their hypothesis must be correct (a leap of faith more suitable for theology than science).  All that they have really shown is that the stupid model is wrong, which everyone already knew.  (Except sometimes the data is so meager that even the stupid, obviously wrong model can’t be ruled out by frequentist techniques, but those papers rarely get published.)

A large fraction of the papers I read or referee have badly designed null models, often invalidating their conclusions, so I believe that this sloppiness about the use of the word “randomly” is a serious problem in the education of future generations of scientists.  And it isn’t hard to rewrite questions to avoid the sloppiness by being explicit about the random process:

  • Pick two integers from a distribution in which odd and even numbers are equally likely.
  • Pick two numbers uniformly with replacement from the integers from 1 to 10.
  • Roll a fair 20-sided die twice.
  • Draw a card from a deck containing just ace through 10 cards (no face cards).  Replace the card, shuffle the deck, and draw again.

For the age group that Mr. K teaches, the dice or card processes are probably the best to use, as the vocabulary used by probability theorists is more likely to confuse than to clarify.
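Once the process is explicit, the answer can be checked by brute-force enumeration. The sketch below assumes each process is a uniform draw, with replacement, from a finite set of integers, and that an ace counts as 1. The helper name and the 9-sided die are hypothetical additions; the die is included only to show a process that gives a different answer.

```python
from fractions import Fraction
from itertools import product

def prob_even_product_uniform(outcomes):
    """Exact P(product is even) for two independent uniform draws from outcomes."""
    outcomes = list(outcomes)
    pairs = list(product(outcomes, repeat=2))
    even = sum(1 for a, b in pairs if (a * b) % 2 == 0)
    return Fraction(even, len(pairs))

processes = {
    "integers 1 to 10": range(1, 11),
    "fair 20-sided die": range(1, 21),
    "cards ace through 10 (ace counted as 1)": range(1, 11),
    "hypothetical 9-sided die": range(1, 10),  # not one of the listed processes
}
for name, outcomes in processes.items():
    print(name, prob_even_product_uniform(outcomes))
```

The three listed processes each have as many odd outcomes as even ones, so all give 3/4. The 9-sided die has five odd faces out of nine and gives 56/81 instead.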
