Gas station without pumps

2013 March 21

Why Python first?

Filed under: home school,Uncategorized — gasstationwithoutpumps @ 11:21

On one of the mailing lists I subscribe to, I advocated for teaching Python after Scratch to kids (as I’ve done on this blog: Computer languages for kids), and one parent wanted to know why, and whether they should have used Python rather than Java in the home-school course they were teaching.  Here is my off-the-cuff reply:

Python has many advantages over Java as a first text-based language, but it is hard for me to articulate precisely which differences are the important ones.

One big difference is that Python does not require any declaration of variables. Objects are strongly typed, but names can be attached to any type of object—there is no static typing of variables. Python follows the Smalltalk tradition of “duck typing” (“If it walks like a duck and quacks like a duck, then it is a duck”). That means that operations and functions can be performed on any object that supports the necessary calls—there is no need for a complex class inheritance hierarchy.
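A minimal sketch of duck typing follows. The class and function names (Duck, Robot, make_it_quack) are made up for illustration. The function accepts any object that supplies a quack() method, with no shared base class and no declared types.

```python
class Duck(object):
    def quack(self):
        return "Quack!"

class Robot(object):
    # Not related to Duck by inheritance; it just supports the same call.
    def quack(self):
        return "Beep (pretending to quack)"

def make_it_quack(thing):
    # No type declaration: any object with a quack() method is accepted.
    return thing.quack()

for thing in (Duck(), Robot()):
    print(make_it_quack(thing))

# The same name can be attached to objects of different types.
x = 3
x = "three"
```

Passing an object without a quack() method raises AttributeError, but only when the call is actually executed.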

Java has a lot of machinery that is really only useful in very large projects (where it may be essential), and this machinery interferes with the initial learning of programming concepts.

Python provides machinery that is particularly useful in small, rapid prototyping projects, which is much closer to the sorts of programming that beginners should start with. Python is in several ways much cleaner than Java (no distinction between primitive types and objects, for example), but there is a price to pay: Python can’t do much compile-time optimization or error checking, because the types of objects are not known until the statements are executed. There is no enforcement of information hiding, just programmer conventions, so partitioning a large project into independent modules written by different programmers is more difficult to achieve than in statically typed languages with specified interfaces like Java.
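Both costs can be seen in a few lines. This is only a sketch, with hypothetical names (double, half, Account): the type error in half() is reported only when the offending call runs, and the leading underscore on _balance is a convention that nothing enforces.

```python
def double(value):
    return value + value   # works for numbers, strings, lists, ...

def half(value):
    return value / 2       # fails for strings, but only when executed

print(double(21))
print(double("ab"))

try:
    half("ab")
except TypeError as err:
    print("caught only at run time: {}".format(err))

class Account(object):
    def __init__(self, balance):
        self._balance = balance   # leading underscore: "private" by convention only

acct = Account(100)
acct._balance = -5     # nothing stops outside code from doing this
print(acct._balance)
```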

As an example of the support for rapid prototyping, I find the “yield” statement in Python, which permits the easy creation of generator functions, particularly useful. It makes it easy to separate input parsing from processing without having to load everything into memory at once (loading everything first is what is usually taught in early Java courses). Callbacks in Java are far more complicated to program.

Here is a simple example of breaking a file into space-separated words, putting the words into a hash table that counts how often they appear, and then printing a list of words sorted by decreasing counts:

def readword(file_object):
    '''This generator yields one word at a time from a file-like object,
    using the white-space separation defined by split() to define the words.
    '''
    for line in file_object:
        words = line.split()
        for word in words:
            yield word

import sys
count = dict()   # maps each word to the number of times it has been seen
for word in readword(sys.stdin):
    count[word] = count.get(word, 0) + 1
# sort the words by decreasing count
word_list = sorted(count.keys(), key=lambda w: count[w], reverse=True)
for word in word_list:
    print("{:5d} {}".format(count[word], word))

Note: there is a slightly better way using Counter instead of dict, and there are slightly more efficient ways to do the sorting. This example was chosen for minimal explanation, not because it was the most Pythonic way to write the code. Note: I typed this directly into the e-mail without testing it, but I then cut-and-pasted it into a file, and it seems to work correctly, though I might prefer it if the sort function used count and then alphabetic ordering to break ties. That can be done with one change:

word_list = sorted(count.keys(), key=lambda w:(-count[w],w))
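For comparison, the Counter variant mentioned in the note might look roughly like the sketch below. The helper name sorted_counts and the in-memory sample text are assumptions added for illustration; passing sys.stdin instead of the StringIO object would read real input.

```python
import io
from collections import Counter

def readword(file_object):
    """Yield one whitespace-separated word at a time from a file-like object."""
    for line in file_object:
        for word in line.split():
            yield word

def sorted_counts(file_object):
    """Return (word, count) pairs, highest count first, ties in alphabetical order."""
    count = Counter(readword(file_object))   # Counter does the tallying
    return sorted(count.items(), key=lambda item: (-item[1], item[0]))

# Made-up sample input, held in memory so the sketch is self-contained.
sample = io.StringIO(u"the cat and the hat and the bat\n")
for word, n in sorted_counts(sample):
    print("{:5d} {}".format(n, word))
```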

Doing the same task in Java is certainly possible, but requires more setup, and changing the sort key is probably more effort.

Caveat: my main programming languages are Python and C++, so my knowledge of Java is a bit limited.

Bottom line: I recommend starting kids with Scratch, then moving to Python when Scratch gets too limiting, and moving to Java only once they need to transition to an environment that requires Java (university courses that assume it, large multi-programmer projects, job, … ). It might be better for a student to learn C before picking up Java, as the need for compile-time type checking is more obvious in C, which is very close to the machine. Most of the objects-first approach to teaching programming can be better taught in Python than in either C or Java. For that matter, it might be better to include a radically different language (like Scheme) before teaching Java.

The approach I used with my son was more haphazard, and he started with various Logo and Lego languages, added Scratch and C before Scheme and then Python.  He’s been programming for about 6 years now, and has only picked up Java this year, through the Art of Problem Solving Java course, which is the only Java-after-Python course I could find for him—most Java courses would have been far too slow-paced for him.  It was still a bit low-level for him, but he found ways to challenge himself by stretching the assigned problems into more complicated ones.  His recreational programming is mostly in Python, but he does some JavaScript for web pages, and he has done a little C++ for Arduino programming (mostly the interrupt routines for the Data Logger code he wrote for me).  I think that his next steps should be more CS theory (he’s just finished an Applied Discrete Math course, and the AoPS programming course covers the basics of data structures, so he’s ready for some serious algorithm analysis), computer architecture (he’s started learning about interrupts on the Arduino, but has not had assembly language yet), and parallel programming (he’s done a little multi-threaded programming with queues for communication for the Data Logger, but has not had much parallel processing theory—Python relies pretty heavily on the global interpreter lock to avoid a lot of race conditions).

2012 December 19

Tested Python and Arduino installation

Filed under: Circuits course,Data acquisition — gasstationwithoutpumps @ 18:22

My son and I bicycled up to campus today to test out his data logger code (that will be used in the circuits course) on the Windows computers in the lab.  The lab support staff had told me yesterday that they had installed Python 2.7.3 and the Arduino 1.0.3 development environment, as well as the PySerial module that the data logger code requires.

My son has been working pretty intensely on the code lately, doing a complete refactoring of the code to use Tkinter (instead of PyGUI, which is difficult to install and quite slow) and to have a more user-friendly GUI.  He also wanted to make the code platform independent (Windows, Mac, and Linux), though he’d only tested on our Macs at home before today.  He’s also trying to make the Python part of the code (the user interface) work in both Python 2.7.3 and Python 3.3, though the languages are not precisely compatible.
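One common way to keep a single source file working under both Python 2.7 and Python 3.x is to try the Python 3 module name first and fall back to the Python 2 name. The sketch below is not taken from the data logger code; the drain helper and the readings are made up for illustration.

```python
try:
    import queue             # Python 3 name
except ImportError:
    import Queue as queue    # Python 2 name for the same module

def drain(q):
    """Return everything currently in the queue as a list."""
    items = []
    while not q.empty():
        items.append(q.get())
    return items

samples = queue.Queue()
for reading in (512, 498, 530):   # made-up sensor readings
    samples.put(reading)

# print(x) with a single parenthesized argument behaves the same in 2.7 and 3.x
print(drain(samples))
```

The same try/except pattern applies to other modules that were renamed between the two versions.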

We found three problems today:

  1. Python 2.7.3 was not quite completely installed.  They’d forgotten to update the path to include C:\Python27\  (The path should also include where the Arduino software was installed—I forget where that was now.)
  2. The device drivers for the Arduinos were not installed.  On Macs, there are no drivers to install, but on Windows, you need different drivers for different Arduino boards, and it seems you need to have the board plugged into the USB port in order to install the drivers. (Instructions at
  3. My son had forgotten to include one of the dependencies in his list of what needed to be installed—the Arduino Timer1 module from .

Because they had given me administrator privileges on the machines, my son was able to fix one of the machines to run the data logger code (though he had a couple of minor changes to make in his code for Windows compatibility also).

For the data logger debugging, about all I did was type in my password for him, and once do a search to see where the Arduino code was hidden.

When he has finished debugging and documenting the code, my son will be releasing it with a permissive license on bitbucket, and I’ll be putting links to it here and on the course web page.


2012 October 12

Rapid attrition in grad course

Filed under: Uncategorized — gasstationwithoutpumps @ 21:14

I’m teaching two graduate courses this quarter.  One is our “how to be a grad student” course, which all the first-year students take. It contains a lot of TA training, group advising, and “soft skills” (LaTeX, BibTeX, preparing posters, preparing transparencies, oral presentations, voice projection, …).  It has been going fairly well, but I have the first assignment (a LaTeX assignment) to grade this weekend. The other course is the core bioinformatics course for our grad program and is also required of undergrads in bioinformatics (a very small group, since they are required to take 2–3 grad courses).

I’ve had a fair amount of attrition in the bioinformatics core course.  On the first day of class, I had 25 students.  One week later, 21 students turned in the first assignment.  After that was returned, three more students dropped before the second assignment and one has not yet turned in the second assignment, so I have only 17 to grade.  I don’t think I’ve ever had a 32% attrition rate before.  I don’t know much about the four students who dropped before turning anything in—they may have just been shopping for classes and decided that this one would not meet their needs.

The three who dropped and the one who didn’t turn in the second assignment all were suffering from inadequate prior training in programming—some of them didn’t even have the concept of procedures in their repertoire. The prerequisite for the course is about a year of programming courses in a block-structured or object-oriented programming language, but some of the new grads and grad students from other departments had not had that. A couple of them plan to take lower-level programming classes this year, and come back next year to take the bioinformatics course, after they have the fundamentals.  (One has asked to be allowed to sit in on lectures this year, without doing the assignments, so that he gets at least some exposure to the material—I have no trouble with that.)

Last year, while I was on sabbatical, a good friend of mine who had previously taken and TAed the core course taught it, and he introduced a new first assignment, replacing the rather easy warmup assignment I used in the past (a simple FASTA parser) with a much more challenging one that required parsing several different variants of FASTQ.  He provided a scaffold for the students to work from, but it still took two weeks for the students to do the first assignment, and the students and the TA both thought it was too challenging for a first assignment for students who had never programmed in Python before.

I consulted with all the former students (at least all the ones who responded to my request for advice on the compbio mailing list).  Students were divided about the value of the scaffold—some saw it as essential to learning good Pythonic style, while others felt like a simpler first assignment would be a better way to start.  One student recommended that students build their own scaffolds, which struck me as an excellent idea.  So this year, I came up with two new assignments to replace the old first assignment.  The first assignment was to build a scaffold for future programs, and the second was to build a pair of FASTA and FASTQ parsers and conversion programs to convert between fasta+qual file pairs and fastq.  The second program is somewhat more difficult than the first program I used to use, and about the same difficulty as the one used last year.

The scaffolding assignment had a very simple task (read a text file and output the unique words with counts), but I required a particular structure to the program: a generator function that yielded words, an argument parsing function that used argparse, an output function that would print the words in a two-column format with three different sorting options, and a main program (which I provided) that called the other functions.  Most of the students came up with pretty decent implementations (there were some little bits of fluff that I commented on in the feedback), but a few struggled mightily.
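The structure described above might be sketched roughly as follows. This is not the assignment's actual solution or the main program that was provided. The function names, the --sort option, and its three choices are assumptions made for illustration, since the post does not say which three sort orders were required.

```python
import argparse
import io
import sys
from collections import Counter

def read_words(file_object):
    """Generator: yield one whitespace-separated word at a time."""
    for line in file_object:
        for word in line.split():
            yield word

def parse_args(argv):
    """Parse command-line options (the option names are hypothetical)."""
    parser = argparse.ArgumentParser(description="Count unique words in a text file.")
    parser.add_argument("infile", nargs="?", type=argparse.FileType("r"),
                        default=sys.stdin, help="text file to read (default: stdin)")
    parser.add_argument("--sort", choices=("alpha", "ascend", "descend"),
                        default="descend", help="output order")
    return parser.parse_args(argv)

def sorted_items(counts, order):
    """Return (word, count) pairs in one of three orders."""
    if order == "alpha":
        return sorted(counts.items())
    if order == "ascend":
        return sorted(counts.items(), key=lambda item: (item[1], item[0]))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

def print_output(counts, order, out=sys.stdout):
    """Write the words and counts in two tab-separated columns."""
    for word, n in sorted_items(counts, order):
        out.write("{}\t{}\n".format(word, n))

def main(argv):
    """Tie the pieces together; call as main(sys.argv[1:]) from a script."""
    options = parse_args(argv)
    counts = Counter(read_words(options.infile))
    print_output(counts, options.sort)

# Small in-memory demonstration, so the sketch runs without a file.
demo = Counter(read_words(io.StringIO(u"b a b\nc a b\n")))
print_output(demo, "descend")
```

Keeping the parsing, counting, and output in separate functions is what lets the same skeleton be reused for later programs.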

I’ll be grading the second assignment this weekend, and I’m looking forward to seeing whether the students have benefitted from the scaffolding assignment—I’m hoping that these programs will be well structured and easy to read.  If they are, I’ll be happy with the addition of the scaffolding assignment.  If not, I’ll have to rethink those assignments for next year.

2012 September 10

A different intro to programming class

Filed under: Uncategorized — gasstationwithoutpumps @ 16:03

Michael Ernst describes an intro programming class at the University of Washington that seems ideal for bioinformatics freshmen (and other non-CS majors): PATPAT: Program analysis, the practice and theory: Teaching intro CS and programming by way of scientific data analysis.

It looks better designed than UCSC’s CMPS 5P and BME 60 courses, which are aimed at roughly the same audience.  I like that they managed to get a variety of different types of data analysis done while covering the basics of programming (using Python, which is an excellent choice of language for such a course).

2012 September 8

On undergrad TAs

Filed under: Uncategorized — gasstationwithoutpumps @ 12:59

In Ways of Knowing, Brian Frank discusses his use of undergrad TAs in an intro physics course.  In particular, he has noticed that some of the undergrads have a difficult time with thinking conceptually about physics, though they can choose and solve the appropriate quadratic equations.  For a TA in an algebra-based physics class, the mathematical skill is less useful than the ability to apply and explain the concepts.

The particular example he gives is

A bowling ball is dropped from a height of 45m, taking 3 seconds to hit the ground. How fast is it moving the very moment before it hits the ground? The problem is intended to draw out the following answers and arguments, which we hash out.

10 m/s, because all objects fall at the same rate

15 m/s because you can calculate the velocity as 45m/3s = 15 m/s

30 m/s because it gained 10 m/s in each of the 3 seconds

Other more idiosyncratic answers come up as well, but not with high frequency.

He talks about a TA who solved the problem by applying the appropriate formula for the motion of a falling object and solving the quadratic equation.  This would be a perfectly acceptable approach on a quiz or exam, but does not provide the TA with a good way to explain the concept to students.  (The problem is set up to be amenable to mental math, by rounding g from 9.8 m s⁻² to 10 m s⁻².)
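The arithmetic behind the candidate answers can be checked in a few lines. This is a sketch that uses the rounded g = 10 m/s² from the problem and standard constant-acceleration relations; the variable names are chosen here for illustration.

```python
import math

g = 10.0   # m/s^2, rounded from 9.8 as in the problem
t = 3.0    # s, time of fall
h = 45.0   # m, drop height

# Consistency check: distance fallen from rest is h = (1/2) g t^2
assert 0.5 * g * t**2 == h

average_speed = h / t      # 15 m/s: the average over the fall, not the final speed
final_speed = g * t        # 30 m/s: gains 10 m/s in each of the 3 seconds
final_speed_from_height = math.sqrt(2 * g * h)   # v^2 = 2 g h, the formula-based route

print(average_speed, final_speed, final_speed_from_height)
```

Because the ball starts from rest and speeds up uniformly, the final speed is twice the average speed, which is one conceptual way to connect the 15 m/s and 30 m/s answers.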

He found that students who had taken physics from him, rather than from his colleagues, had an easier time grasping the conceptual approaches he was teaching, even if they were matched in their ability to do the computational problems on standard physics exams.

I pointed this post out to my son, since he is planning to volunteer this year as a teaching assistant for a Python class (not for me, but for a different instructor).  The example above is not directly relevant to what he’ll be doing as a TA (Python and physics differ in many ways), but we discussed the general principles.

We came up with the following relevant ideas:

  • Avoid too much detail.  There are often general, powerful methods that seem straightforward to those familiar with them, but which are mystifying to beginners.  Some of the nicest features of Python are not the first things taught, for good reasons.
  • Look for misunderstandings of simple models. It isn’t enough just to be good at the subject—one also has to look for the simple models that can be used to get a beginner past misconceptions.
    Physics education research has been looking at student misconceptions and mental models for decades now, and instructional techniques have been developed and tested specifically for their ability to break down misconceptions and help students build correct mental models.
    Research in programming instruction is still in its infancy, and tends to concentrate more on teaching techniques (like peer instruction and pair programming), order of topics (objects first vs. objects last, recursion before iteration, …), or programming language choice (Pascal, C, Java, Scratch, Python, …) rather than on the mental models and misunderstandings that block individual learning.
  • Pay attention to how the teacher teaches. There will probably not be any Python taught in this intro course for middle and high school students that my son is not already very familiar with.  But that doesn’t mean that there is nothing for him to learn.  Since he grasps programming concepts quickly and intuitively, he has not had to struggle with learning them and may not have an appreciation for the difficult task of getting beginners to grasp concepts that he sees as obvious. Watching how the teacher tries to construct useful (but possibly simplified) models for the students will help him improve his work as a TA.
  • Pay attention to the students’ mistakes. A lot of the work of a teacher or TA is debugging student mental models.  As teachers, we don’t have direct access to the models inside the students’ heads, and so have to infer them from the programs students write.  It is very tempting to debug the student programs for them, showing them how to correct the mistakes in the program, but it is not our goal to produce correct programs for these easy exercises, but to produce students who can generate correct programs.  When students make a mistake, we need to find out why and figure out how to correct the mistake in their mental model, so that they can find and correct the mistake in their program themselves.
    Note: some mistakes are not mental model mistakes, but just typos that are hard to spot, because we see what we intended to put, rather than what is actually there—for that sort of mistake, it is fine for a TA to point to the typo.  I’m talking more about the sort of mistake that results from an incompletely understood concept.

I think that being a TA as a junior in high school will be valuable experience for my son, and we’ll probably have several late-night discussions this year about teaching programming.  I’ll try to record the substance of those discussions in this blog, so that others can chime in and correct any mistaken notions that I inflict on my son.

