I’ve been wondering what an Ion Torrent sequencer is useful for. I mainly deal with de novo assembly of genomes, which needs a lot more data than an Ion Torrent sequencer provides, even for assembling bacterial genomes. The high error rate and relatively short read length of the Ion Torrent reads are also problems. For de novo sequencing, almost everyone is going with the Illumina platform, which provides barely long enough reads (a little over 100 bases long at each end of a pair) at the lowest cost. I like to have some longer 454 reads to throw into the mix, but they are more expensive and confuse some of the de Bruijn graph assemblers.
This past week, though, I’ve been working on a problem that might be ideal for the Ion Torrent: assembling the mitochondrial sequence of the banana slug, Ariolimax dolichophallus. I’ve been trying to assemble it from 10x whole-genome shotgun sequence using Illumina reads (with paired ends that were too close together, so many of the reads overlap in the middle). The library prep looks like it was very good at excluding mitochondria: the mitochondrial genomes seem to have little more coverage than the nuclear genome. [Correction: I must have dropped a decimal point somewhere—the coverage is indeed much higher for the mitochondrion: more like 200x than 10x.]
Since mitochondrial genomes are the primary way of identifying eukaryotic species (often using only a tiny snippet, the “barcode of life”), there is a lot of value in being able to determine the genome quickly and cheaply. A mitochondrial genome is much shorter than a bacterial genome (only about 15 kbases), which makes the low coverage, short reads, and high error rates not much of a problem. If you have over 100x coverage on a short genome, you can still align and assemble it despite the noise, especially since repeats are not a problem in mitochondria.
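To put rough numbers on the coverage argument, here is a back-of-the-envelope sketch using the usual mean-coverage relation (coverage = number of reads × read length / genome length). The 100-base read length and the 5-Mbase bacterial genome are my assumptions for illustration, not measured values, and the function names are made up for this example.

```python
import math

def mean_coverage(num_reads, read_length, genome_length):
    """Mean depth of coverage: total sequenced bases divided by genome length."""
    return num_reads * read_length / genome_length

def reads_needed(genome_length, read_length, target_coverage):
    """Smallest number of reads that reaches the target mean coverage."""
    return math.ceil(target_coverage * genome_length / read_length)

READ_LENGTH = 100          # assumed read length in bases (illustrative)
MITO_LENGTH = 15_000       # about 15 kbases, as stated in the text
BACT_LENGTH = 5_000_000    # assumed typical bacterial genome size (illustrative)
TARGET = 100               # 100x coverage

mito_reads = reads_needed(MITO_LENGTH, READ_LENGTH, TARGET)
bact_reads = reads_needed(BACT_LENGTH, READ_LENGTH, TARGET)

print("reads for 100x mitochondrial coverage:", mito_reads)
print("reads for 100x bacterial coverage:   ", bact_reads)
print("ratio:", bact_reads / mito_reads)
```

Under these assumptions the mitochondrial genome needs only a few hundred times fewer reads than a bacterial one for the same depth, which is why a low-throughput machine looks plausible for this job. This only counts reads that actually come from the mitochondrion, so it assumes the mitochondrial DNA has been isolated first.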
Isolating mitochondrial DNA is also supposed to be relatively easy, so it might be good for Ion Torrent to put out a “mitochondrial genome kit” that makes isolating the mitochondrial DNA, sequencing it, and assembling the resulting genome very cheap. This would take the rather thin taxonomic sampling of 2654 mitochondrial genomes at NCBI to hundreds of thousands in just a few years. The key thing is to make the library prep very cheap and simple, since otherwise one could do barcoding to multiplex samples and piggyback on sequencing runs on the larger batch machines.