Gas station without pumps

2013 March 21

Why Python first?

Filed under: home school,Uncategorized — gasstationwithoutpumps @ 11:21

On one of the mailing lists I subscribe to, I advocated for teaching Python after Scratch to kids (as I’ve done on this blog: Computer languages for kids), and one parent wanted to know why, and whether they should have used Python rather than Java in the home-school course they were teaching.  Here is my off-the-cuff reply:

Python has many advantages over Java as a first text-based language, but it is hard for me to articulate precisely which differences are the important ones.

One big difference is that Python does not require any declaration of variables. Objects are strongly typed, but names can be attached to any type of object—there is no static typing of variables. Python follows the Smalltalk tradition of “duck typing” (“If it walks like a duck and quacks like a duck, then it is a duck”). That means that operations and functions can be performed on any object that supports the necessary calls—there is no need for a complex class inheritance hierarchy.
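
To make the duck-typing point concrete, here is a minimal sketch. The class and function names (Duck, Robot, chorus) are invented for illustration and are not from any library. The two classes share no base class, yet the same function accepts both because each supplies the one method it calls:

```python
class Duck:
    def speak(self):
        return "Quack"


class Robot:
    # Not related to Duck by inheritance; it just has a speak() method too.
    def speak(self):
        return "Beep"


def chorus(things):
    # Works for any objects that support a speak() call.
    return [thing.speak() for thing in things]


# A name can be rebound to objects of different types without any declaration.
x = 3
x = "three"

print(chorus([Duck(), Robot()]))
```

Passing chorus() an object without a speak() method fails only when that call is reached, which is the price discussed below.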

Java has a lot of machinery that is really only useful in very large projects (where it may be essential), and this machinery interferes with the initial learning of programming concepts.

Python provides machinery that is particularly useful in small, rapid prototyping projects, which is much closer to the sorts of programming that beginners should start with. Python is in several ways much cleaner than Java (no distinction between primitive types and objects, for example), but there is a price to pay—Python can’t do much compile time optimization or error checking, because the types of objects are not known until the statements are executed. There is no enforcement of information hiding, just programmer conventions, so partitioning a large project into independent modules written by different programmers is more difficult to achieve than in statically typed languages with specified interfaces like Java.
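
A small sketch of the run-time checking described above (the function name average is made up for this example). Nothing complains about the second call until the statement is actually executed, whereas a statically typed language would normally reject the equivalent call at compile time:

```python
def average(values):
    # No declared parameter type: anything that works with sum() and len() is accepted.
    return sum(values) / len(values)


print(average([1, 2, 3]))

try:
    # A list of strings is only rejected when sum() tries to add an int and a str.
    average(["1", "2", "3"])
except TypeError as err:
    print("caught at run time:", err)
```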

As an example of the support for rapid prototyping, I find the “yield” statement in Python, which permits the easy creation of generator functions, a particularly useful feature for separating input parsing from processing, without having to load everything into memory at once, as is usually taught in early Java courses. Callbacks in Java are far more complicated to program.

Here is a simple example of breaking a file into space-separated words, putting the words into a hash table that counts how often they appear, and then printing a list of words sorted by decreasing counts:

import sys

def readword(file_object):
    '''This generator yields one word at a time from a file-like object,
    using the white-space separation defined by split() to define the words.
    '''
    for line in file_object:
        words = line.strip().split()
        for word in words:
            yield word

# Count how often each word appears on standard input.
count = dict()
for word in readword(sys.stdin):
    count[word] = count.get(word, 0) + 1

# Print the words in order of decreasing count.
word_list = sorted(count.keys(), key=lambda w: count[w], reverse=True)
for word in word_list:
    print("{:5d} {}".format(count[word], word))

Note: there is a slightly better way using Counter instead of dict, and there are slightly more efficient ways to do the sorting; this example was chosen for minimal explanation, not because it was the most Pythonic way to write the code. Note: I typed this directly into the e-mail without testing it, but I then cut-and-pasted it into a file, and it seems to work correctly, though I might prefer it if the sort function used count and then alphabetic ordering to break ties. That can be done with one change:

word_list = sorted(count.keys(), key=lambda w:(-count[w],w))
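
For comparison, the Counter variant mentioned in the note might look roughly like the sketch below. The helper name word_counts and the sample text are made up for illustration, and the input is an in-memory io.StringIO rather than sys.stdin so that the example is self-contained:

```python
import io
from collections import Counter


def readword(file_object):
    # Same generator idea as above: yield one whitespace-separated word at a time.
    for line in file_object:
        for word in line.split():
            yield word


def word_counts(file_object):
    # Counter consumes the generator directly and does the tallying.
    count = Counter(readword(file_object))
    # Sort by decreasing count, breaking ties alphabetically.
    return sorted(count.items(), key=lambda item: (-item[1], item[0]))


sample = io.StringIO("the cat saw the dog\nthe dog ran\n")
for word, n in word_counts(sample):
    print("{:5d} {}".format(n, word))
```

Counter also provides a most_common() method that orders by count, but the explicit sort key keeps the alphabetical tie-breaking visible.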

Doing the same task in Java is certainly possible, but requires more setup, and changing the sort key is probably more effort.

Caveat: my main programming languages are Python and C++, so my knowledge of Java is a bit limited.

Bottom line: I recommend starting kids with Scratch, then moving to Python when Scratch gets too limiting, and moving to Java only once they need to transition to an environment that requires Java (university courses that assume it, large multi-programmer projects, a job, …). It might be better for a student to learn C before picking up Java, as the need for compile-time type checking is more obvious in C, which is very close to the machine. Most of the objects-first approach to teaching programming can be better taught in Python than in either C or Java. For that matter, it might be better to include a radically different language (like Scheme) before teaching Java.

The approach I used with my son was more haphazard, and he started with various Logo and Lego languages, added Scratch and C before Scheme and then Python.  He’s been programming for about 6 years now, and has only picked up Java this year, through the Art of Problem Solving Java course, which is the only Java-after-Python course I could find for him—most Java courses would have been far too slow-paced for him.  It was still a bit low-level for him, but he found ways to challenge himself by stretching the assigned problems into more complicated ones.  His recreational programming is mostly in Python, but he does some JavaScript for web pages, and he has done a little C++ for Arduino programming (mostly the interrupt routines for the Data Logger code he wrote for me).  I think that his next steps should be more CS theory (he’s just finished an Applied Discrete Math course, and the AoPS programming course covers the basics of data structures, so he’s ready for some serious algorithm analysis), computer architecture (he’s started learning about interrupts on the Arduino, but has not had assembly language yet), and parallel programming (he’s done a little multi-threaded programming with queues for communication for the Data Logger, but has not had much parallel processing theory—Python relies pretty heavily on the global interpreter lock to avoid a lot of race conditions).

2012 May 17

Embedded programming gap

Filed under: Uncategorized — gasstationwithoutpumps @ 20:30

It seems that there is a shortage of programmers who can do embedded systems (which is what computer engineering is mostly about these days).

Critics lay much of the blame for the embedded programming gap at the doorstep of university computer science departments that have tended to migrate curricula toward trendy programming languages like Java at the expense of unglamorous tasks such as how to design and analyze algorithms and data structures.
Struggle continues to plug embedded programming gap | EE Times (by George Leopold)

I’m not so sure that Java is at fault here. It seems to me to be a perfectly fine second programming language (after a simpler one like Python that does not require all data structures to be declared before any code is written).  The problem is more that the instruction focuses entirely on designing huge complex data structures and using large libraries of complex software components, rather than on fundamentals:

The problems start early in the curriculum. Too often an introductory Computer Science course will fall into one of two extremes: either focusing on the syntactic details of the programming language that is being used—“where the semicolons go”—or else treating programming as a matter of choosing components and plugging them into a GUI.

The basics—how to design and analyze algorithms and data structures—are covered summarily if at all. Software methodologies such as Object Orientation that are best approached in the context of the development life cycle for large applications, are introduced through trivial examples before their benefits can be appreciated: the proverbial cannon that is used to shoot a fly.

The education of embedded systems software engineers: failures and fixes | EE Times (by Robert Dewar)

I’m not so sure that I believe in Robert Dewar’s proposed solution, though, as he suggests having students do more high-level design (software architecture, rather than nuts-and-bolts programming), which is in direct opposition to his claim that students should be getting more training in low-level languages like C.

Robert Dewar also makes an argument for group work at the university level—something that is common in computer engineering programs, but apparently rare in computer science programs.  At UCSC, I know that all computer engineers, electrical engineers, and game design majors are expected to do group senior projects, and some of their other classes (such as mechatronics) are also group projects.

I think that the lack of group projects in many CS courses is not so much tied to Dewar’s idea “a perhaps regrettable staple of the educational process is the need to assess a student’s progress through a grade, and a team project makes this difficult” as it is to the problem of scale: a group project is only reasonable when the project is too big to be done more efficiently by a single person.  Creating and managing such big projects in lower-level classes would be a major undertaking, particularly in the short time frame of a quarter or semester, when a lot of things other than group dynamics need to be learned. Pasting a group structure onto a tiny project would make things worse, not better, in terms of training students to be effective in groups (see Group work).

Some entrepreneurs have addressed the problem by starting up “initiatives like Barr’s week-long, hands-on Embedded Software Boot Camp.”  The idea is to take programmers who already have degrees and supposedly know C and train them specifically in the skills needed to do real-time programming. The cost is not small ($3000 for 4.5 days, if you register early).

Some computer scientists have been pointing out problems in the standard CS curriculum for a while:

I started saying this over a decade ago. I even did embedded stuff in my 3rd year data architecture course—my department was uninterested, and the students had a real hard time wrapping their heads around the thought that there are places where resources are limited.

The department fought me when I said that students needed to learn more than one language (Java). The department disagreed when I said that students should learn how to program for environments where bloated OO methods might not work (… But, there ARE no places where efficiency matters!!! It’s all about “Software Engineering”!).

The students had NO idea what it meant to program for a machine that had no disk, only memory.

Part of the reason CS departments are seen as being so out of touch is BECAUSE THEY ARE!!!

University should not be about job training, BUT it is also NOT about teaching only those things the faculty find interesting.

Struggle continues to plug embedded programming gap | The Becker Blog.

I know that there have been struggles between the computer science and computer engineering departments at UCSC about what programming language to teach first, with the computer scientists arguing for Java and the computer engineers arguing for C and assembly language.  Eventually they reached a compromise (which has been stable for about a decade), with the computer science students taught Java first and the computer engineering students taught C first, then both making transitions to the other language.

I think that both approaches work, but the strengths of the resulting programmers are somewhat different.  For device drivers and small embedded systems, I’d bet on the computer engineers, who have a better grasp of low-level details, interrupts, and hardware/software interactions.  For more complicated projects, I’d want one of the computer scientists doing the high-level programming, teamed with computer engineers to do the detail work.

I actually don’t like either C or Java as a first programming language.  I think that students would be better off starting with Scratch, which gets them quickly into multi-threaded code, real time operation, and race conditions, without burdening them with a lot of syntax.  Then they can switch to Python to learn about code syntax, types, and objects, without the burden (or support) of compile-time type checking.  After that they can switch to Java to learn about declaring all their data structures and having very rigid type rules (useful for bigger projects to make interfaces between different design groups more explicit).  In parallel with learning Python and Java, they should learn C and C++ in the context of the Arduino microcontroller (or a similar small microcontroller) to control real-time physical systems.

The computer engineers could even skip Java, and learn their rigid type systems strictly in C++ (which they are more likely to use later), though Java is cleaner and makes the learning somewhat easier.

Computer scientists should not stop with this handful of languages, though, as they are essentially all Algol derivatives.  The CS students should also learn a couple of languages that come from different lineages (LISP or Haskell, for example).

2011 October 22

Physics Lab 4: spring constants continued

Filed under: home school — gasstationwithoutpumps @ 11:38

In Physics Lab 4: spring constants, I assigned a lab exercise to be done while I was away in Washington, DC: measuring a dozen different springs, getting spring constants for each, and trying to relate the properties of the springs to their geometric properties.  I expected the lab to take up most of the 2-hour time slot, and I was looking forward to doing my own analysis of the data in parallel with the students.

The students did not get the lab completed, because the first hour of the time was spent on one student giving a VPython tutorial to the student who had not programmed before, which was more urgent than the lab measurements.  In the remaining hour, they got all the geometric measurements made, but force measurements for only 2 of the springs. It turned out that only 11 of the 12 extension springs in the set are measurable with the simple setup of a hook in the wall and a hand-held force gauge with a hook, as one of the extension springs does not have loops on the end for the hooks.

So we’ll finish up the spring lab next week.  If we get the geometric data all properly entered into the computer this week, we may be able to type in the new data as it is measured and do the model building in the same session.

On a different physics-related topic, I’ve had further communication from Doug Brown about the velocity and acceleration computations in Tracker.  He even sent me the routines used and invited me to provide an implementation of my ideas for better velocity and acceleration estimation.  The code is in Java (which I have not learned yet, though it is close enough to C++ to be easy for me to read), but looks like it was written by a Fortran programmer, with all the input parameters packed into an array and manually unpacked at the beginning of the routine and all the outputs similarly packed into an array.  My Mac has Java already installed, but I don’t know whether it is the right version nor whether there are a lot of libraries I’ll have to install (the Tracker download installs everything needed to run Tracker, but I don’t know whether more installations are needed to compile Tracker).

The code has to handle missing position data, which is a bit messy to do well with the algorithm I was considering using, so I’m afraid I’ll need to get the full code from Doug Brown in order to do debugging—I’m unlikely to produce working code on my first attempt at a new language without being able to test it. He also computes the velocity and acceleration independently in different routines directly from the position data, which is not a good decomposition for the algorithm I was thinking of. My algorithm needs to do some decision making about where there are acceleration spikes between samples, and I don’t want to duplicate that code.  It would be simpler to have the 2 derivatives computed in one pass over the data, or to compute the accelerations from the velocities, rather than from the positions.  Either of these approaches would require some refactoring of the Tracker code, and I don’t know how messy that would be.
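
As a rough illustration of the "accelerations from the velocities" decomposition, here is a minimal Python sketch. It is neither Tracker's Java code nor the spike-detecting algorithm discussed above: it is a plain central difference, and it assumes evenly spaced samples with None marking a missing position. The function names are invented for the example.

```python
def central_difference(samples, dt):
    """Estimate the derivative of evenly spaced samples; None marks missing data."""
    n = len(samples)
    result = [None] * n
    for i in range(1, n - 1):
        before, after = samples[i - 1], samples[i + 1]
        # Only estimate where both neighbours are available.
        if before is not None and after is not None:
            result[i] = (after - before) / (2 * dt)
    return result


def velocity_and_acceleration(positions, dt):
    # The same routine is reused, so the missing-data handling is written once.
    velocities = central_difference(positions, dt)
    accelerations = central_difference(velocities, dt)
    return velocities, accelerations


# Positions following x = t**2 with one missing sample, dt = 1.
positions = [0.0, 1.0, 4.0, 9.0, 16.0, None, 36.0]
v, a = velocity_and_acceleration(positions, 1.0)
print(v)
print(a)
```

Reusing one differencing routine is what keeps any decision-making code in a single place; a one-pass version would need the same bookkeeping for gaps.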

Learning (a little of) another programming language in order to implement the changes I proposed might delay my starting on the project, and I don’t think I can count on the existing code being a good example of Java programming style. If substantial refactoring would be required as well, I don’t think I’d want to take on this side project.