Gas station without pumps

2010 June 10

What is Bioinformatics?

Filed under: Uncategorized — gasstationwithoutpumps @ 11:58

My field of research and teaching is Bioinformatics, and the first question most people ask when I tell them what I do is

“What is bioinformatics?”

I usually give an answer something like the following:

Bioinformatics is the use of computers and statistics to make sense of the huge mounds of data that are accumulating from high-throughput biological and chemical experiments, such as whole-genome sequencing, DNA microarray chips, RNA-seq, ChIP-seq, two-hybrid experiments, and tandem mass spectrometry.
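To make that definition a little more concrete, here is a minimal sketch of the kind of small computation that sits at the bottom of many sequence analyses: summarizing a DNA string by its GC content and its k-mer counts. The function names `gc_content` and `kmer_counts` and the toy sequence are made up for illustration; real work would use established tools and far larger data.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Return the fraction of unambiguous bases (A, C, G, T) that are G or C.

    Ambiguity codes such as N are ignored. Hypothetical helper, for
    illustration only.
    """
    counts = Counter(seq.upper())
    acgt = sum(counts[base] for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return (counts["G"] + counts["C"]) / acgt


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping substring of length k in the sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


if __name__ == "__main__":
    toy = "ACGACGTTNN"  # made-up sequence, not real data
    print(gc_content(toy))
    print(kmer_counts(toy, 3).most_common(2))
```

Simple summaries like these are only the raw counting step; the statistics come in when deciding whether an observed count is surprising given the error modes of the experiment that produced the data.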

There are three different approaches to bioinformatics:

  • Tool Building—Creating new programs and methods for analyzing and organizing data. This is where our graduate program is focused, as we are interested in cutting-edge research, where the tools simply don’t exist yet.
  • Tool Using—Using existing programs and data to answer biologically interesting questions. I believe that this type of bioinformatics does not need a separate degree, but should be part of every new biologist’s training. Our program is set up for bioinformatics minors to get this sort of training. Our BS and MS students are also prepared for this sort of work, but the focus of our program is more on creating tool builders. Personally, I think that there is much more need for tool users than for tool builders. The situation is much like that in statistics: almost everyone needs to know some statistics, a fairly large number of statisticians with a somewhat deeper understanding are needed to do fairly routine work, and only a small fraction of statisticians are developing new tools and advancing the field.
  • Tool Maintenance—Setting up databases, creating web sites, translating biologists’ questions into ones that programs can answer, and keeping the tools working and the databases up to date. Our undergraduate program prepares students for this role in industry, as well as for going on to graduate school. Our expectation is that the more research-focused students will go on for higher degrees. A lot of the “maintenance” work can be done by students with almost no bioinformatics training, if they have sufficient database and web experience. The main advantage of hiring a bioinformatics major is that they will have a better understanding of the biology and an easier time communicating with biologists. They are also more likely than someone with just computer training or just biology training to catch errors that produce ridiculous results.

What is the difference between Bioinformatics and Computational Biology?

I consider “bioinformatics” and “computational biology” to be essentially synonymous, but some people distinguish two flavors: tool and method development (bioinformatics) and applying existing tools to new biological questions (computational biology). There is a good defense of this distinction by Russ Altman. I feel that the best work comes from people who do both: developing new methods and applying them to new biological questions. One interesting thing about bioinformatics is that the fundamental work that opens up new fields is usually “engineering”, while the application of the tools is “science”. This pattern of engineering preceding science is actually quite common, but it clashes with the popular notion that science precedes engineering.


  1. … and what about computer software that evolves without involvement of its master (programmer). Would you prefer to study this by bioinformatics too?

    Comment by Paul N. — 2010 June 11 @ 00:41

    • No, artificial evolution of programs has little or nothing to do with bioinformatics. A big chunk of bioinformatics has to do with the specific representations and mechanisms used by living organisms (details of the RNA, DNA, proteins, and so forth). Another big chunk has to do with learning the error modes of the lab techniques we have for studying these molecules. The algorithmic and data representation stuff that doesn’t involve detailed information about the data is shared with several other fields (databases, information theory, Bayesian statistics, machine learning, …). Those subjects may well be relevant to artificial evolution of programs, but “bio” is an essential part of “bioinformatics”.

      Comment by gasstationwithoutpumps — 2010 June 11 @ 07:41

