Gas station without pumps

2015 September 1

Pedagogy for bioinformatics teaching

Filed under: Circuits course — gasstationwithoutpumps @ 10:48

I was complaining recently about the dearth of teaching blogs in my field(s), and, serendipitously, almost immediately afterwards I read a post by lexnederbragt, “Active learning strategies for bioinformatics teaching”:

The more I read about how active learning techniques improve student learning, the more I am inclined to try out such techniques in my own teaching and training.

I attended the third week of Titus Brown’s “NGS Analysis Workshop”. This third week entailed, as one of the participants put it, ‘the bleeding edge of bioinformatics analysis taught by Software Carpentry instructors’ and was a unique opportunity to both learn different analysis techniques, try out new instruction material, as well as experience different instructors and their way of teaching. …

I demonstrated some of my teaching and was asked by one of the students for references for the different active learning approaches I used. Rather than just emailing her, I decided to put these in this blog post.

It is good to see someone blogging about teaching bioinformatics: there aren’t many of us doing it, and most of us are more focused on research than on our pedagogical techniques.  For that matter, in my bioinformatics courses, I’ve only been making minor tweaks to my teaching techniques, such as increasing wait time after asking questions, randomizing cold calls better, and being more aware of the buildup of clutter on the whiteboard.  Where I’ve been focusing my pedagogic attention is on my applied electronics course and (to a lesser extent) the freshman design seminar.

I’ll be starting my main bioinformatics course in just over 3 weeks, a first-quarter graduate course that is also taken by seniors doing a BS in bioinformatics.  This will be the 14th time I’ve taught the course (every year since 2001, except for one year when I took a full-year sabbatical).  Although the course has evolved somewhat over that time, it is difficult for me to make major changes to something I’ve taught so often—I’ve already knocked off most of the rough edges, so major changes will always seem inferior, even if they would end up being better after a year or two of tweaking.  I think that major changes in the course would require a change of instructor—something that will have to be planned for, as I’ll be retiring in a few years.

My main goals in this core bioinformatics course are to teach some stochastic modeling (particularly the importance of good null models), dynamic programming (via Smith-Waterman alignment), hidden Markov models, and some Python programming.  The course is pretty intense (the Python programming assignments take up a lot of time), but I think it sets the students up well for the subsequent course in computational genomics (which I do not teach) and for general bioinformatics programming in their research labs. I don’t cover de Bruijn graphs or assembly in this course—those are covered in subsequent courses, though both the exercises Lex mentions seem useful for a course that covers genome assembly.

The live-coding approach that Lex mentions in his blog seems more appropriate for an undergrad course than for a grad course.  I do use that approach for teaching gnuplot in my applied electronics course, though I’ve had trouble getting students to bring their data sets and laptops to class to work on their own plots for the gnuplot classes—I’ll have to emphasize that expectation next spring.

It might be possible to use a live-coding approach near the beginning of the quarter in the bioinformatics course, on the first assignment, when I’m trying to get students to learn the “yield” statement for making generators for input parsing. I’ve been thinking that a partial worked example would help students get started on the first program, so I could try live coding half the assignment, and having them finish it for their first homework.

One of the really nice things about Python is how easily one can create input handlers that spit out one item at a time and how cleanly one can interface them to one-pass algorithms. Way too many of the students have only done programming in a paradigm that reads all input, does all processing, and prints all output.  Although there are some bioinformatics programs that need to work that way, most bioinformatics tasks involve too much data for that paradigm, and programs need to process data on the fly, without storing it all.  Getting students to cleanly separate I/O from processing while processing only one item at a time is the primary goal of the first two “warmup” Python programs in the course.
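To make the pattern concrete, here is a minimal sketch of a generator-based input handler feeding a one-pass algorithm. It is an illustration only, not the actual course assignment: the choice of FASTA input and the names `read_fasta` and `longest_record` are assumptions made for the example.

```python
import io


def read_fasta(lines):
    """Yield (name, sequence) pairs, one record at a time.

    `lines` is any iterable of text lines (an open file, sys.stdin, ...).
    Only the record currently being read is held in memory.
    FASTA format is an assumption made for this sketch.
    """
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)   # emit the finished record
            header = line[1:].split()
            name = header[0] if header else ""
            chunks = []
        elif name is not None:
            chunks.append(line)           # sequence line of current record
    if name is not None:
        yield name, "".join(chunks)       # emit the last record


def longest_record(records):
    """Return (name, length) of the longest sequence, in a single pass.

    Keeps only the best candidate seen so far, never the whole input.
    """
    best_name, best_len = None, -1
    for name, seq in records:
        if len(seq) > best_len:
            best_name, best_len = name, len(seq)
    return best_name, best_len


# The parser knows nothing about the analysis, and vice versa.
example = io.StringIO(">seq1 first\nACGT\nAC\n>seq2\nGATTACAGG\n")
print(longest_record(read_fasta(example)))  # ('seq2', 9)
```

Because `read_fasta` takes any iterable of lines, the same call works unchanged on an open file or on `sys.stdin`, and the I/O stays cleanly separated from the processing.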

One thing I will have to demonstrate in doing the live coding is writing the docstring before writing any of the code for a routine.  Students (and professional programmers) have a tendency to code first and document later, which often turns into code-first-think-later, resulting in unreadable, undebuggable code. I should probably make a bigger point of document-first coding in the gnuplot instruction also, though the level of commenting needed in gnuplot is not huge (plot scripts tend to be fairly simple programs).
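As a rough illustration of document-first coding, the sketch below shows a routine whose docstring (contract, edge case, and examples) could be written in full before any of the body. The function `count_kmers` is a hypothetical example chosen for this sketch, not something taken from the course.

```python
def count_kmers(seq, k):
    """Return a dict mapping each length-k substring of seq to its count.

    seq is a string and k is a positive integer.  Overlapping k-mers
    are counted.  If k is longer than seq, the result is empty.

    >>> count_kmers("ACAC", 2) == {"AC": 2, "CA": 1}
    True
    >>> count_kmers("AC", 3)
    {}
    """
    # Everything above was decided first; the body only has to
    # satisfy the contract the docstring already states.
    counts = {}
    for start in range(len(seq) - k + 1):
        kmer = seq[start:start + k]
        counts[kmer] = counts.get(kmer, 0) + 1
    return counts
```

The examples in the docstring are in the form used by Python's standard `doctest` module, so they can double as a first test once the body is filled in.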
