I’m up for a review for a salary increase this year (it has been 4 years since the last time), and I have to get all the paperwork filed by Friday. I spent most of today getting my c.v. updated and getting the files properly uploaded to the DivData site (I don’t think that the bureaucrats have any idea how long it takes to get everything formatted the way they want it, copying information from a variety of sources).
I was going to spend today writing my “personal statement” about teaching, research, and service, but the bureaucratic part of the paperwork ate up all my time. I also found out that I apparently did not file a sabbatical leave report after my last sabbatical (I thought I had, but I can’t find one on any of my computers, not even after half an hour of searching through thousands of e-mail messages), so I have to write that leave report now, trying to remember what I did over 3 years ago. Luckily I had already started blogging then, so I could go through the year’s blog posts and pick out some of the more salient activities.
Here then is a draft of my sabbatical leave report for September 2011–June 2012:
Although I started the sabbatical with a short trip to Italy to give some invited talks, I decided to spend most of my time in Santa Cruz, picking a new direction for my research and my career as a professor. I had been beating my head against the wall for several years, trying to get funding for my protein-structure prediction research, but I never seemed to hit the sweet spot.
My protein-structure prediction methods were doing well in the CASP contests, but my funding proposals were always turned down. I saw protein-structure prediction as a field where the low-hanging fruit had been picked, and what remained was hard work to get small improvements—but that is not what NIH wanted to hear. They preferred giving funding to people who promised silver bullets and delivered nothing.
The protein-structure field was getting crowded, with many more groups chasing funds than could possibly be funded, and while I was willing to work on some of the hard problems that could lead to incremental improvements, I wasn’t willing to do it without funding—the problems were not that exciting any more.
The funding situation in bioinformatics is generally fairly bleak for individual investigators, as most of the funding agencies prefer to issue huge grants to large teams of researchers backed by professional grant writers. That is not my style of working, and I’m very slow at writing grant proposals, so I decided that I would be at least as productive if I didn’t write any more research proposals, but put the time into doing research instead. I wouldn’t be able to have grad students working with me any more, unless I was collaborating with someone else who was paying them, but I’ve never relied that much on grad student labor for my research.
Fairly early in my sabbatical I decided to leave both protein-structure prediction and grant writing, but I spent the rest of my sabbatical figuring out what I was moving towards, rather than what I was moving away from.
I spent a lot of my sabbatical learning new things and playing with ideas that were too small for funding agencies to be interested in.
For example, just before my sabbatical, I had taught a Banana Slug Genomics course, where we had attempted to assemble the genome for Ariolimax dolichophallus, though we did not have enough data to do much with it. I decided to try to assemble the mitochondrial genome, since there was more coverage of the mitochondrion in the sequence data that we had and the mitochondrial genome is short enough that hand effort on it could be successful. I spent several weeks teasing out as much as I could from the data we had, eventually creating an assembly that closed the genome into a circle, though there was one region that I was very uncertain about: I put in several repeats in that part of the assembly, though I regarded population variation in the mitochondria as being as likely an explanation for the data we had. I kept all my notes on the assembly process on the class Wiki, now on the web page https://banana-slug.soe.ucsc.edu/assemblies:2011:mitochondrion_assembly
In 2015, we did another Banana Slug Genomics class with much more data, and got a mitochondrial genome in two contigs that pretty much agreed with the assembly that I had pieced together out of much less data. The region where I was most dubious about my assembly was at the boundary of one of the contigs, so it could not be resolved in the new assembly. One possible explanation is that there are indeed repeats there, and that the shotgun assembly can’t resolve the repeats. A student is now trying to do PCR to close the gaps and see what is really there. Preliminary evidence suggests that there might even be something interesting: the banana slug mitochondrial genome might be in two chromosomes, rather than one, though that conjecture is based on a single PCR experiment that resulted in a PCR product that couldn’t be Sanger sequenced, raising some questions about the correctness of the PCR.
Another thing I learned during my sabbatical was to do printed-circuit board design using Eagle layout software. Initially I was just doing this as a hobby, but I have since used custom PC boards of my own design as prototyping boards for the students in BME 101/L, Applied Electronics for Bioengineers (that’s the current number and title for the course—it has had different numbers and titles in different quarters).
I also taught myself (and my son) calculus-based physics during my sabbatical; I’d only had algebra-based physics previously, which I saw as a hole in my education. We used the textbook Matter and Interactions, which has the students writing simple Python programs to simulate physical systems, rather than relying only on simple situations for which closed-form solutions are easily derived. The programming approach to learning physics was a good fit for both my son and me, as was the introduction of momentum and special relativity first, with simplification to Newtonian mechanics introduced as a modeling approximation.
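To give a flavor of that style of program, here is a minimal sketch of a momentum-update simulation loop for a mass on a spring. It is not taken from the textbook: the function name, the parameters, and the choice of a spring as the example system are all assumptions made for illustration. Each time step applies the force to update the momentum, then uses the new momentum to update the position.

```python
def simulate_spring(mass, k, x0, p0, dt, steps):
    """Step a 1-D mass on a spring forward in time.

    Hypothetical helper for illustration: mass in kg, spring constant k
    in N/m, initial position x0 in m, initial momentum p0 in kg*m/s,
    time step dt in s.  Returns the final (position, momentum).
    """
    x, p = x0, p0
    for _ in range(steps):
        force = -k * x          # Hooke's law for the spring
        p += force * dt         # momentum update: dp = F dt
        x += (p / mass) * dt    # position update using the new momentum
    return x, p


if __name__ == "__main__":
    # One kilogram on a 4 N/m spring has a period of pi seconds,
    # so about 3142 steps of 1 ms cover roughly one full oscillation.
    x, p = simulate_spring(mass=1.0, k=4.0, x0=1.0, p0=0.0, dt=0.001, steps=3142)
    print(x, p)
```

Changing the force line is all it takes to model a different system, which is what makes this approach work for situations that have no easy closed-form solution.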
In the process of teaching/learning physics, I developed several low-cost physics lab experiments (some of which were successful, some complete failures), and posted about them on my blog: https://gasstationwithoutpumps.wordpress.com/physics-posts-in-forward-order/ Some of the experiments involved using an Arduino microcontroller board to gather data, which my son later developed into an Arduino data logger, which I used in the first offering of BME 101, my Applied Electronics course. That has since developed further into the open-source PteroDAQ data acquisition system, maintained by my son and me at https://bitbucket.org/abe_k/pterodaq/. I have been using PteroDAQ extensively in the labs for BME 101.
Also as a result of the self-taught physics course, I made improvements to Tracker, a tool widely used in physics classes for analyzing video data, to make the tracking handle bounces better. I posted information about the changes on my blog https://gasstationwithoutpumps.wordpress.com/2011/11/08/tracker-video-analysis-tool-fixes/, and I provided the changes to the developers of the tool (who happen to be at Cabrillo College), who accepted the changes into the code base.
I spent one day working on a Python program to fit a sphere to a number of data points (to determine the offset for a magnetometer that my son was using in a robotics project), and posted the method I used on my blog: https://gasstationwithoutpumps.wordpress.com/2012/04/21/fitting-a-sphere/ The code was released as open-source http://users.soe.ucsc.edu/~karplus/FitHypersphere, and was later used by at least one other user, who commented on the blog post.
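For readers who want the general idea without following the link, here is a minimal sketch of one common way to fit a sphere: an algebraic least-squares fit. It is not necessarily the method in the blog post or in FitHypersphere, and the function names are made up for this example. It relies on the fact that |x - c|^2 = r^2 can be rewritten as 2 x·c + (r^2 - |c|^2) = |x|^2, which is linear in the unknowns c and d = r^2 - |c|^2.

```python
import math


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("points are degenerate; cannot fit a sphere")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for j in range(col, n + 1):
                m[r][j] -= factor * m[col][j]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(m[row][j] * x[j] for j in range(row + 1, n))
        x[row] = (m[row][n] - s) / m[row][row]
    return x


def fit_sphere(points):
    """Return (center, radius) of an algebraic least-squares sphere fit.

    points is a list of equal-length coordinate tuples (any dimension).
    """
    dim = len(points[0])
    # Each point contributes a row [2*x_1, ..., 2*x_dim, 1] with target |x|^2.
    rows = [[2.0 * v for v in pt] + [1.0] for pt in points]
    targets = [sum(v * v for v in pt) for pt in points]
    n = dim + 1
    # Normal equations (A^T A) u = A^T b for the unknowns u = (c, d).
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(n)]
    u = solve_linear(ata, atb)
    center = u[:dim]
    radius = math.sqrt(u[dim] + sum(c * c for c in center))
    return center, radius


if __name__ == "__main__":
    # Six points on a sphere of radius 5 centered at (1, -2, 3).
    pts = [(6, -2, 3), (-4, -2, 3), (1, 3, 3), (1, -7, 3), (1, -2, 8), (1, -2, -2)]
    print(fit_sphere(pts))
```

For the magnetometer use, the fitted center would be the offset to subtract from each reading. This algebraic fit minimizes a different error than the geometric distance to the sphere, so with noisy data the two approaches can give slightly different answers.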
In toying with the idea of going back to doing research on digital music synthesis, which I had not done much in since about 1984, I posted on my blog about how to create minimal WAV-format files in C, which has since become one of my most searched-for posts, as the simple code is good for incorporating into programming courses using digital media as a theme: https://gasstationwithoutpumps.wordpress.com/2011/10/08/making-wav-files-from-c-programs/
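The blog post does this in C. As a rough illustration of how little is needed, here is a sketch in Python (not the code from the post) that builds the standard 44-byte header for 16-bit mono PCM by hand and appends the samples. The function names and default values are assumptions chosen for this example.

```python
import math
import struct


def make_wav_bytes(samples, sample_rate=44100):
    """Return a complete WAV file as bytes for 16-bit mono PCM samples.

    samples is a sequence of ints in the range -32768..32767.
    """
    data = struct.pack("<%dh" % len(samples), *samples)
    channels = 1
    bits_per_sample = 16
    block_align = channels * bits_per_sample // 8
    header = struct.pack(
        "<4sI4s4sIHHIIHH4sI",
        b"RIFF", 36 + len(data),       # RIFF chunk: size of everything after this field
        b"WAVE",
        b"fmt ", 16,                   # format chunk is 16 bytes for plain PCM
        1,                             # audio format 1 = uncompressed PCM
        channels,
        sample_rate,
        sample_rate * block_align,     # bytes per second
        block_align,                   # bytes per sample frame
        bits_per_sample,
        b"data", len(data),            # data chunk: size of the sample bytes
    )
    return header + data


def sine_samples(freq, seconds, sample_rate=44100, amplitude=0.5):
    """Generate a sine tone as 16-bit integer samples (amplitude from 0 to 1)."""
    n = int(seconds * sample_rate)
    return [
        int(amplitude * 32767 * math.sin(2 * math.pi * freq * i / sample_rate))
        for i in range(n)
    ]


if __name__ == "__main__":
    # Write one second of a 440 Hz tone; the file name is arbitrary.
    with open("tone.wav", "wb") as f:
        f.write(make_wav_bytes(sine_samples(440, 1.0)))
```

Python's standard library also has a wave module that writes the header for you; building it by hand here is only to mirror the minimal-header idea.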
Of course, a big chunk of my sabbatical time was spent on bioinformatics work—mainly on trying to reconstruct the cagY gene from a strain of Helicobacter pylori that had been sequenced at UCSC. The cagY gene is almost impossible to assemble from shotgun sequencing data, because it has long internal tandem repeats that exceed the read length of the sequencers. We got early access to PacBio long reads and I found that I could assemble the gene from the noisy long reads, even with only modest coverage. I blogged about my techniques: https://gasstationwithoutpumps.wordpress.com/2012/04/13/working-with-other-peoples-data/ and https://gasstationwithoutpumps.wordpress.com/2012/04/15/reconstructing-genes-from-pacbio-reads/ Although I did not publish the work formally, I shared it with the collaborators at PacBio, who have since developed their own tools based in part on these ideas. The use of PacBio long reads for assembling prokaryotic genomes is now fairly standard, supplanting short-read shotgun sequencing for small genomes. I’ve continued to do a little work on the H. pylori genome sporadically since then, and we plan to publish annotated sequences for two closely related strains soon, with some comparative genomics about the evolution of one from the other.
I spent a lot of my sabbatical thinking about teaching, reading a lot on pedagogy (mainly from teacher blogs, rather than from academic literature), and reflecting both on what I read and on my own teaching experience. I made quite a few blog posts with those reflections; here is a sampling of them:
I collected a number of resources for teaching bioinformatics to high-school students, https://gasstationwithoutpumps.wordpress.com/2011/07/23/resources-for-bioinformatics-in-ap-bio/, which led to a project with bioinformatics grad students for the next two years to teach bioinformatics lessons at Pacific Collegiate High School.
I spent some time working on potential revisions to the grad curriculum for the department, and updating the frequently-asked-questions page for the grad program. These changes mostly did not get made, because I did not get sufficient consensus from the other faculty in the department. Eventually I turned the grad directorship over to a younger faculty member, who had very different ideas about the direction the grad program should take.
As my sabbatical ended in June 2012, my teaching load for 2012–13 changed. Because I no longer was scheduled to do an introductory programming course for biologists as an overload, I decided to take on the creation of a new course (also as an overload). The one I saw the most need for in the bioengineering curriculum was an applied electronics course, so I spent most of the summer and fall of 2012 teaching myself analog electronics and designing a course for bioengineers: https://gasstationwithoutpumps.wordpress.com/circuits-course-table-of-contents/
That course has since become one of my main pedagogic and scholarly pursuits. The lab handouts that I developed for the course have morphed into a textbook, which is available in draft form at https://leanpub.com/applied_electronics_for_bioengineers, with updates being released to purchasers of the draft every few weeks. Further developing the PteroDAQ software that my son created to help me teach the course has also started to take up a fair amount of my time.
My sabbatical helped me to refocus my career on teaching and curriculum development, as a more productive use of my time than writing futile grant proposals. I continue to do some research as a collaborator on other people’s projects, but I’m focussing more now on the University’s teaching mission than its research mission.