Gas station without pumps

2011 September 30

Home schooling week 4

Filed under: home school — gasstationwithoutpumps @ 19:11

Following up on Home Schooling week 3, this post describes our fourth week of home schooling.  We had a meeting mid-week with the consultant teacher to touch base and check on progress.

Nothing new here.  My son is still unhappy with the poor implementation of the WileyOnline exercises, but has figured out a workaround for the crummy user interface.  (He figured out that sometimes breaking things into syllables requires spaces around the hyphens and sometimes prohibits it—the problem of doing exact string matching when approximate string matching is needed.)  He had his first test in class and reported that it was easy, but he likes the instructor, so the class is going well.
English reading
 He finished his short essay on Brave New World and has re-read Left Hand of Darkness. His consultant teacher has approved the length of his BNW essay as appropriate for 10th-grade English, but had some feedback on places it can be improved. It is not clear whether he will do an edit or not.
 He added a few more events to the timeline, but he had a misplaced bracket, and so the web page did not work when he tried to show it to his consultant teacher.  He’ll probably have to put the timeline on-line so that he can share it with the teacher (and with other home-school students).
He got the lead villain role in the play he is doing, which pleases him.  He doesn’t like the script or the character, but he is pleased to have a lot of lines and is going to have fun being the villain in a melodrama. He will have to put a lot of time into running lines, since there are so many this time.  Improv has been fun also.
 We had the first real lab meeting with both students present.  We spent the first half of the time getting the new student up to speed on VPython (he’s never programmed before) and the second half of the time trying to calibrate the two ultrasonic rangefinders.  They are somewhat touchier to work with than I expected.  (I might do a separate post just on the physics class.)
 They made the top-of-tether box and got it mostly soldered.  We had some discussion of the waterproof connector design.  After a phone conversation with my Dad (a long-retired engineer), I decided that they should probably add strain-relief to the cables entering the box.  I’ll discuss that with them this week.
Machine Learning (Science Fair)
He got an input parser written for the data format, and found that part of the description he was given was incorrect (all the data was positive, not in the range [-5..5]).  He has started coding the first machine-learning algorithm, but has not gotten very far yet.
Starts next Tuesday.
Physical Education
The usual 4 hours of bicycling (plus nightly sit-ups and leg lifts), but we had one extra this week.  He got a flat 7.5 miles from home, and couldn’t get the tire off, so rode home on a flat.  We spent some time the next day practicing flat fixing.


2011 September 27

Free Breakfast for Bike-To-Work/School

Filed under: Uncategorized — gasstationwithoutpumps @ 18:21

Twice a year, free breakfasts are provided to bicycle commuters in Santa Cruz County at Bike-to-Work/School events.  The fall event is coming up soon: Thursday 6 October 2011.  There are 16 public sites (3 on the UCSC campus, 4 if you count Long Marine Lab as well) and 41 school sites listed on Bike to Work – Santa Cruz Free Breakfast Sites.  Breakfast will be served to bicycle commuters 6:30 a.m.–9:30 a.m. at the public sites (the school sites are usually shorter hours, most often the half hour or hour immediately before classes start).

Some of the sites have extra features (like free bicycle maintenance, free massages, free acupuncture, …)—see the list and map.

For those who want to get started a day earlier, October 5 is “International Walk to School in the USA” day.  So far as I know, nothing special is being done on Oct. 5 in Santa Cruz County, but I think that several of the Bike-to-School sites will give breakfast on Oct. 6 to students who walk to school as well as those who bike—check with your school.

Incidentally, I have no idea why “International Walk to School in the USA” is not “International Walk to School” or “Walk to School in the USA”. Perhaps they are promoting crossing the border on foot? Or they think that only international students will walk?

… sticklers for the $50 multiple-image surcharge put a hold on my second order from them, asking for an extra $50 (bringing the cost up to $100.30) for the new board, because there were multiple images.

I asked them on the phone what the rule was now for defining multiple images.  Although they did not give an exact algorithm, basically they are looking for separate pieces unconnected by copper traces.  I suspect that they either do a quick look at the board manually or have some very crude program to detect probable repeats.

They are not looking just for exact repeats (which is the rule that Gabriel Elkaim told me they used), but anything that looks like multiple boards. Now I have to decide whether to pay the $50 surcharge, find another vendor, or redesign the board.

For the tiny breakout boards I may be better off with a order, at $2.50/sq in + $10 setup. I had 3 (slightly different) copies of the hexmotor 2.3 board, plus 2 each of 3 different breakout board designs for the pressure sensor.

The areas and prices for the boards (not including shipping) are

board          dimensions          area           price
pressure 1.3   0.7″ × 1.45″        1.015 sq in    $15.08 for 2
pressure 1.4   1.5″ × 1.0″         1.5 sq in      $17.50 for 2
pressure 1.5   0.475″ × 0.8″       0.38 sq in     $15.00 for 2
hexmotor 2.3   3.95″ × 3.15″       12.44 sq in    $41.11 for 1, $103.32 for 3
combined       7.537″ × 6.737″     50.78 sq in    $136.94 for 1

So if I keep the original order, I’m better off with pricing, but if I get only 1 motor controller board, I could reduce the price to about $75+shipping (with a new combined board, to avoid the repeated $10 setup).  If I just want the breakout boards, I could do all of them without the hexmotor board as 2 copies of  a combined 1″ × 3.825″ board  for $29.13+shipping.
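
The arithmetic behind most of these prices can be sketched in a few lines, assuming the price is copies × area × $2.50 + $10 setup, rounded to the nearest cent.  The function name `order_price` and the `minimum` parameter are made up for illustration.  The $15.00 floor is only a guess that would explain the pressure 1.5 row (the plain formula gives $11.90), not a rule stated anywhere above.

```python
from decimal import Decimal, ROUND_HALF_UP

RATE = Decimal("2.50")   # dollars per square inch, as quoted in the post
SETUP = Decimal("10")    # setup fee per order, as quoted in the post


def order_price(width_in, height_in, copies=1, minimum=Decimal("0")):
    """Price of one order: copies * area * RATE + SETUP, rounded to cents.

    width_in and height_in are decimal strings such as "0.7".
    minimum is a hypothetical price floor (an assumption, not a known rule).
    """
    area = Decimal(width_in) * Decimal(height_in)
    raw = area * copies * RATE + SETUP
    return max(raw, minimum).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    print(order_price("0.7", "1.45", copies=2))      # pressure 1.3
    print(order_price("3.95", "3.15", copies=3))     # hexmotor 2.3, 3 copies
    print(order_price("7.537", "6.737"))             # combined board
    print(order_price("1", "3.825", copies=2))       # breakout-only combined board
    # Guessed floor that would reproduce the pressure 1.5 row:
    print(order_price("0.475", "0.8", copies=2, minimum=Decimal("15")))
```

Using `Decimal` rather than floats keeps half-cent cases such as $15.075 rounding up to $15.08 instead of depending on binary floating-point representation.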

UPDATE: I decided I want the new features of the hexmotor 2.3 board enough to spend the extra $60 rather than just using the old hexmotor 1.3 board.  When I want tiny boards, though, I’ll try, which will be cheaper for boards smaller than about 13 square inches (depending on shipping and how many are ordered).

2011 September 26

Texas planning to shut down physics departments

Filed under: Uncategorized — gasstationwithoutpumps @ 11:41

In a Nature News post, Texas holds firm on physics closures, I found out that Texas is planning to shut down physics departments that don’t graduate lots of undergrads.  This seems a bit demented to me, since physics departments serve two primary roles: as research groups and as service education for students in engineering and science majors.  The physics undergrads they produce are a by-product, and they should only be producing enough to keep the grad pipeline full.

Who will teach physics to all the other science and engineering majors if the physics departments are shut down?  Or do they plan to keep the departments but just remove the ability of students at those schools to get degrees in physics?  Who does that help?

Given that the programs that are intended to be shut down are in the schools that serve more minority students, this sounds like racist politics rather than any sort of sound academic planning.  It may be advisable at some schools to eliminate upper-division physics (and hence the physics BS), but the monetary savings is low and the effect on faculty morale and both faculty and student retention high.  This sort of planning needs to be done on a case-by-case basis at each school, not mandated statewide by people who have only a hazy idea what is involved in academic planning.

2011 September 24

On Theoretical/Computational Sciences

GMP (which is short for GeekMommyProf, not Gay Male Prof, as I initially thought), put out a call for blog entries in her blog: Academic Jungle: Call for Entries: Carnival on Theoretical/Computational Sciences, asking computational and theoretical scientists to write “a little bit about what your work entails, what you enjoy/dislike, what types of problems you tackle, what made you chose your specialization, etc.” (Deadline Sunday Sept 25)

As a bioinformatician, I certainly do computational work, but probably not in the sense that GMP means.  My computational work is not theoretical work, nor modeling physical processes using computational models.  Bioinformatics is not a model-driven field the way physics is.  (In fact, the very lumping together of theoretical and computational work implies to me that GMP is a physicist or chemist, though her blogger profile is deliberately vague about the field.)

To give an example of data-driven computational work that is not model-based, consider the example of predicting the secondary structure of a protein.  In the simplest form, we are trying to predict for each amino acid in a protein whether it is part of an α-helix, a β-strand, or neither.  The best prediction methods are not mechanistic models based on physics or chemistry (those have been terribly unsuccessful—not much better than chance performance).  Instead, machine learning methods based on neural nets or random forests are trained on thousands of proteins.  These classifiers get fairly high accuracy, without having anything in them remotely resembling an explanation for how they work. (Actually, almost all the different machine-learning methods have been applied to this classification problem, but neural nets and random forests have had the best performance, perhaps because those methods rely least on any prior understanding of the underlying phenomena.)

It is rare in bioinformatics that we get to build models that explain how things work.  Instead we rely on measuring the predictive accuracy of “black-box” predictors, where we can control the inputs and look at the outputs, but the workings inside are not accessible.

We judge the models based on what fraction of the answers they get right on real data, but we have to be very careful to keep training data and testing data disjoint. It is easy to get perfect accuracy on examples you’ve already seen, but that tells us nothing about how the method would perform on genuinely new examples.  Physicists routinely have such confidence in the correctness of their models that they set their parameters using all the data—something that would be regarded as tantamount to fraud in bioinformatics.

Biological data is not randomly sampled (the usual assumption made in almost all machine-learning theory).  Instead, we have huge sample biases due to biologists researching what interests the community or due to limitations of experimental techniques.  As an example of intensity of interest, of 15 million protein sequences in the “non-redundant” protein database, 490,000 (3%) are from HIV virus strains, though there are only a dozen HIV proteins.  As an example of limitations of experiments, there are fewer than 900 structures of membrane proteins, out of the 70,000 structures in PDB, even though membrane proteins make up 25–35% of proteins in most genomes.  Of those membrane protein structures, there are only about 300 distinct proteins (PDB has a  lot of “redundancy”—different structures of the same protein under slightly different conditions).

One of the clearest signs of a “newbie” paper in bioinformatics is insufficient attention to making sure that the cross-validation tests are clean.  It is necessary to remove duplicate and near-duplicate data, or at least ensure that the training data and the testing data have no close pairs.  Otherwise the results of the cross-validation experiment do not provide any information about how well the method would work on data that has not been seen before, which is the whole point of doing cross-validation testing.  Whenever I see a paper that gets astonishingly good results, the first thing I do is to check how they created their testing data—almost always I find that they have neglected to take the standard precautions against fooling themselves.
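
One way to keep close pairs from straddling the train/test split is to cluster the data first and assign whole clusters to folds.  The sketch below is only an illustration under loud assumptions: `difflib.SequenceMatcher` is a crude stand-in for a real sequence-similarity search, the 0.8 threshold is arbitrary, and the function names are invented for this example.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.8):
    """Crude similarity test (assumption: stands in for a real alignment tool)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio() >= threshold


def cluster_sequences(seqs, threshold=0.8):
    """Single-linkage clustering with union-find.

    Any two sequences that pass similar() end up in the same cluster,
    directly or through a chain of similar neighbors.
    """
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if similar(seqs[i], seqs[j], threshold):
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(seqs)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


def grouped_folds(seqs, n_folds=3, threshold=0.8):
    """Return n_folds lists of indices; each cluster goes wholly into one fold."""
    folds = [[] for _ in range(n_folds)]
    # Place the largest clusters first, each into the currently smallest fold.
    for cluster in sorted(cluster_sequences(seqs, threshold), key=len, reverse=True):
        min(folds, key=len).extend(cluster)
    return folds


if __name__ == "__main__":
    seqs = [
        "MKTAYIAKQRQISFVKSHFSRQ",
        "MKTAYIAKQRQISFVKSHFSRA",   # near-duplicate of the first
        "GSHMLEDPVAGGWTTNQ",
        "GSHMLEDPVAGGWTTNK",        # near-duplicate of the third
        "PPPPLLLLKKKKDDDD",
        "AAAACCCCGGGGTTTT",
    ]
    for k, fold in enumerate(grouped_folds(seqs)):
        print(k, [seqs[i] for i in fold])
```

The all-pairs comparison is quadratic, so this only works for toy-sized data; the point is the grouping step, not the similarity measure.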

I came into bioinformatics through a rather indirect route: B.S. and M.S. in mathematics, then Ph.D. in computer science, then 10 years as a computer engineering professor, teaching VLSI design and doing logic minimization.  I left the field of logic minimization, because it was dominated by one research group, and all the papers in the field were edited and reviewed by members of that research group.  I thought I could compete on quality of research, but when they started rejecting my papers and later publishing parts of them under their own names, I knew I had to leave the field (which, thanks largely to their driving out other research groups, has made almost no progress in the 15 years since).  While I was looking for a new field, I happened to have an office next to a computer scientist who was just moving from machine learning to bioinformatics.  I joined him in that switch and we founded a new department.

I had to learn a lot of new material to work in my new field (Bayesian statistics, machine learning, basic biology, biochemistry, protein structure, … ).  I ended up taking courses (some undergrad, but mostly grad courses) in all those fields.  Unlike some other faculty trying to switch fields, I had no grant funding for time off to learn the new field, nor any reduction in my teaching load.  Since I have hit a dry spell in funding, I am on sabbatical this year, looking at switching fields again (still within bioinformatics, though).

If I were to do my training and career over again, I would not change much—I would switch fields out of logic minimization sooner perhaps (I hung on until I got tenure, despite the clear message that the dominant group felt that the field wasn’t big enough for competitors), and I would take more science classes as an undergrad (as a math major, I didn’t need to take science, and I didn’t).  I would also take more engineering classes, both as an undergrad and grad student. I’d also take statistics much sooner (neither math nor computer science students routinely take statistics, and both should).  I’d also up the number of courses I took as a professor, to a steady one a year.  (I was doing that for a while, but for the past few years I’ve stopped, for no good reason.)

