# Gas station without pumps

## 2014 April 11

### Arthur Benjamin: Teach statistics before calculus!

I rarely have the patience to sit through a video of a TED talk—like advertisements, I rarely find them worth the time they consume. I can read a transcript of the talk in 1/4 the time, and not be distracted by the facial tics and awkward gestures of the speaker. I was pointed to one TED talk (with about 1.3 million views since Feb 2009) recently that has a message I agree with: Arthur Benjamin: Teach statistics before calculus!

The message is a simple one, though it takes him 3 minutes to make: calculus is the wrong summit for k–12 math to be aiming at.

Calculus is a great subject for scientists, engineers, and economists—one of the most fundamental branches of mathematics—but most people never use it. It would be far more valuable to have universal literacy in probability and statistics, and leave calculus to the 20% of the population who might actually use it someday.  I agree with Arthur Benjamin completely—and this is spoken as someone who was a math major and who learned calculus about 30 years before learning statistics.

Of course, to do probability and statistics well at an advanced level, one does need integral calculus, even measure theory, but the basics of probability and statistics can be taught with counting and summing in discrete spaces, and that is the level at which statistics should be taught in high schools.  (Arthur Benjamin alludes to this continuous vs. discrete math distinction in his talk, but he misleadingly implies that probability and statistics is a branch of discrete math, rather than that it can be learned in either discrete or continuous contexts.)

If I could overhaul math education at the high school level, I would make it go something like

1. algebra
2. logic, proofs, and combinatorics (as in applied discrete math)
3. statistics
4. geometry, trigonometry, and complex numbers
5. calculus

The STEM students would get all 5 subjects, at least by the freshman year of college, and the non-STEM students would stop with statistics or trigonometry, depending on their level of interest in math.  I could even see an argument for putting statistics before logic and proof, though I think it is easier to reason about uncertainty after you have a firm foundation in reasoning without uncertainty.

I made a comment along these lines in response to the blog post by Jason Dyer that pointed me to the TED talk. In response, Robert Hansen suggested a different, more conventional order:

1. algebra
2. combinatorics and statistics
3. logic, proofs and geometry
4. calculus

It is common to put combinatorics and statistics together, but that results in confusion on students’ part, because too many of the probability examples are then uniform distribution counting problems. It is useful to have some combinatorics before statistics (so that counting problems are possible examples), but mixing the two makes it less likely that non-uniform probability (which is what the real world mainly has) will be properly developed. We don’t need more people thinking that if there are only two possibilities that they must be equally likely!
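A minimal Python sketch of that last point, using two dice as an assumed example (the dice are not from the discussion above): the event "the sum is 7" has only two possible outcomes, yes or no, but counting in the discrete space shows they are far from equally likely.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

# Count the outcomes whose sum is 7.
sevens = sum(1 for a, b in outcomes if a + b == 7)

# Two possibilities ("sum is 7" vs. "sum is not 7"), but not equally likely.
p_seven = Fraction(sevens, len(outcomes))
p_not_seven = 1 - p_seven

print(p_seven, p_not_seven)  # 1/6 5/6
```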

I’ve also always felt that putting proofs together with geometry does damage to both. Analytic geometry is much more useful nowadays than Euclidean-style proofs, so I’d rather put geometry with trigonometry and complex numbers, and leave proof techniques and logic to an algebraic domain.

## 2012 November 10

### A probability question

Filed under: Uncategorized — gasstationwithoutpumps @ 21:59

Sam Shah, one of the math teacher bloggers that I read, posted a bioinformatics-related question in "A biology question that is actually a probability question" on his blog Continuous Everywhere but Differentiable Nowhere:

Let’s say you have a sequence of 3 billion nucleotides. What is the probability that there is a sequence of 20 nucleotides that repeats somewhere in the sequence? You may assume that there are 4 nucleotides (A, C, T, G) and when coming up with the 3 billion nucleotide sequence, they are all equally likely to appear.

This is the sort of combinatorics question that comes up a lot in building null models for bioinformatics, when we want to know just how weird something we’ve found really is.

Of course, we usually end up asking for the expected number of occurrences of a particular event, rather than the probability of the event, since expected values are additive even when the events aren’t independent.  So let me change the problem to

In a sequence of N bases (independent, uniformly distributed), what is the expected number of k-mers (k≪N)? Plug in N=3E9 and k=20.

The probability that any particular k-mer occurs in a particular position is $4^{-k}$, so the expected number of occurrences of that k-mer is $N/4^k$, or about 2.7E-3 for the values of N and k given. Oops, we should count both strands, so double that to 5.46E-3.

When the expected number is that small, we can use it equally well as the probability of there being one or more such k-mers. (Note: this assumes $4^k \gg N$.)
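The arithmetic can be checked with a short Python sketch. The function name `expected_occurrences` is made up for illustration, and the step from expected count to probability uses a Poisson approximation, which is an added assumption rather than something derived here.

```python
import math


def expected_occurrences(n_bases: int, k: int, both_strands: bool = True) -> float:
    """Expected count of one specific k-mer in a uniform random sequence."""
    positions = n_bases - k + 1          # k-mer start positions; about N when k << N
    strands = 2 if both_strands else 1   # count the reverse strand as well
    return strands * positions / 4**k


lam = expected_occurrences(3_000_000_000, 20)

# Poisson approximation (an assumption): P(at least one) = 1 - exp(-lam).
# For small lam this is nearly equal to lam itself.
p_at_least_one = -math.expm1(-lam)

print(f"expected occurrences:   {lam:.3e}")
print(f"P(at least one), approx: {p_at_least_one:.3e}")
```

The two printed values agree to about two significant figures, which is why the expected number can stand in for the probability when it is this small.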

Now let’s look at each k-mer that actually occurs (all 2N of them), and estimate how many other k-mers match. There are roughly $2N/4^k$ for each (we can ignore little differences like N vs. N-1), so there are $4N^2/4^k$ total pairs. But we’ve counted each pair twice, so the expected number of pairs is only $2N^2/4^k$, which is 16E6 for N=3E9 and k=20.
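The same pair-counting argument as a small Python sketch (the function name `expected_matching_pairs` is hypothetical, and it makes the same simplification of ignoring the difference between N and N-k+1):

```python
def expected_matching_pairs(n_bases: int, k: int) -> float:
    """Expected number of matching k-mer pairs, counting both strands."""
    kmers = 2 * n_bases                # k-mers on both strands, ignoring edge effects
    matches_per_kmer = kmers / 4**k    # expected matches for any one k-mer
    return kmers * matches_per_kmer / 2  # each pair was counted twice


pairs = expected_matching_pairs(3_000_000_000, 20)
print(f"expected matching pairs: {pairs:.2e}")
```

For N=3E9 and k=20 this comes out a little over 16 million, so repeated 20-mers are the norm in a uniform random sequence of that length.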

We have to take k up to about 32 before we get expected numbers below 1, and up to about 36 before having a repetition is surprising in a uniform random stream.
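A rough way to find those thresholds in Python, assuming the expected pair count is 2N²/4^k. The cutoff of 0.01 for "surprising" is an illustrative choice, not a value stated above, and `smallest_k_below` is a hypothetical helper name.

```python
def smallest_k_below(n_bases: int, threshold: float) -> int:
    """Smallest k for which the expected pair count 2*N^2/4^k is below threshold."""
    k = 1
    while 2 * n_bases**2 / 4**k >= threshold:
        k += 1
    return k


n = 3_000_000_000
print(smallest_k_below(n, 1.0))   # expected number of repeated pairs drops below 1
print(smallest_k_below(n, 0.01))  # assumed cutoff for a repeat being "surprising"
```

With these inputs the first call gives 32 and the second gives 36; a stricter or looser cutoff for "surprising" would shift the second value by a k-mer length or two.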

## 2011 May 24

### Why Discrete Math Is Important and The Calculus Trap

Filed under: Uncategorized — gasstationwithoutpumps @ 20:19

My son is nearing the end of his Art of Problem Solving precalculus class, and it is still going well, as I reported earlier.  We are now looking at what to do next.  Should he take Calculus BC at his high school in the fall? Should he take the AoPS Calculus class this fall?  Or should he detour into a different branch of math?

The Art of Problem Solving people have written a couple of essays about pre-college math preparation:

• The Calculus Trap. The basic premise of this piece is that the lock-step march through arithmetic, algebra, geometry, algebra, trigonometry, precalculus, and calculus is not the only or best way to study math.  Elementary and secondary math education is not a race to see who gets to the “finish line” (calculus) soonest.  Indeed, from a mathematician’s viewpoint, calculus is just one of many starting points for interesting math.  A lot of what gets tossed aside in a race to calculus is more interesting.
Even more important: “the standard curriculum is not designed for the top students.”  They point out that being the top student in your class is not the way to make progress in your learning—you are better off getting to a level of challenge that really makes you exercise your problem-solving skills.  Racing through courses full of drill problems is not going to do that the way working on harder problems will.  There are plenty of hard problems that do not need a lot of mathematical machinery to solve, so students can start on them before having learned all the machinery.
• Why Discrete Math Is Important. As a computer scientist and computer engineer who taught applied discrete math a few times, and as a bioinformatician who has to teach grad students all over again how to count (that is, how to do simple combinatorics) and how to do simple Bayesian probability, I am certainly in agreement that discrete math is important.  In fact, a big part of the reason I ended up in computer science rather than pure math was that I liked discrete math (graph theory and combinatorics) better than real or complex analysis, and I had made the mistake of starting my graduate math education in a department that had no discrete math. Luckily there were four or five great people doing combinatorics, graph theory, and graph algorithms in the computer science department, and I was able to switch departments.  That turned out very well for me, so perhaps it was a good thing that I’d chosen the wrong math department.

Most likely we’ll continue the standard progression, doing either Calculus BC or AoPS calculus next year.  After that, there will be time for applied discrete math and for probability and statistics before he goes off to college.