
Periodic Table of Bioinformatics

I came across this interesting “Elements of Bioinformatics” chart, which categorizes and arranges bioinformatics tools in the format of a periodic table. It is certainly not exhaustive, but it is very useful (and fun!) as an overview of the available tools. In addition, the year each tool was published appears in the upper right corner of its cell, so the table also offers a historical perspective on bioinformatics!

Check out

Ion Torrent or Illumina?

If you are choosing between Illumina and Ion Torrent for sequencing of your genome of interest, a study published recently by the Broad Institute (Ross et al 2013 Genome Biology 14:R51) may be of interest to you.

The study compares the Illumina MiSeq, Ion Torrent PGM, and PacBio platforms for sequencing bias in regions with extreme GC content (<10% and >75%) or long AT dinucleotide repeats in three different bacterial genomes. Relative coverage is lower for every technology in all of these difficult regions, but coverage bias was most pronounced in the Ion Torrent PGM data. PacBio showed the least coverage bias, likely because of its amplification-free protocol, but its error rate was much higher than that of the other two platforms. The results are consistent with an earlier study that compared the same sequencing platforms (Quail et al 2012 BMC Genomics 13:341).

Therefore, depending on the characteristics of your genome of interest, your choice of sequencing platform will influence your downstream analyses.
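
If you want a rough idea of how much of your own genome falls into the extreme-GC categories mentioned above, a short script can help. The following is only a minimal sketch in Python, not the method used in the study: the function names, the 100 bp non-overlapping window size, and the default thresholds (taken from the <10% and >75% figures quoted above) are all assumptions chosen for illustration.

```python
def gc_fraction(seq):
    """Return the fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    gc_count = sum(1 for base in seq if base in "GC")
    return gc_count / len(seq)


def extreme_gc_windows(seq, window=100, low=0.10, high=0.75):
    """List (start, end, gc) for non-overlapping windows with GC < low or GC > high.

    The window size and thresholds are illustrative assumptions.
    A trailing partial window is ignored.
    """
    hits = []
    for start in range(0, len(seq) - window + 1, window):
        gc = gc_fraction(seq[start:start + window])
        if gc < low or gc > high:
            hits.append((start, start + window, gc))
    return hits


if __name__ == "__main__":
    # Toy sequence: an AT-only stretch, a GC-only stretch, and a balanced stretch.
    toy_genome = "AT" * 50 + "GC" * 50 + "ACGT" * 25
    for start, end, gc in extreme_gc_windows(toy_genome):
        print(f"{start}-{end}: GC = {gc:.2f}")
```

If a large share of windows is flagged, the platform comparison above is probably worth a closer look before you commit to a technology.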

De Novo Transcriptome of a Model Organism to Study Tissue Regeneration

Newts have an extraordinary ability to regenerate tissues. For example, they can re-grow fully functional limbs after amputation. In addition, regeneration of parts of the central nervous system, the heart, and the lens has been characterized, making them an excellent model organism for studying regenerative processes. However, because of their enormous genome size (10 times that of human), the molecular mechanisms behind this amazing regenerative process are largely unknown.

A research group at the Max Planck Institute recently published a de novo assembly of the transcriptome of the urodele amphibian Notophthalmus viridescens (Looso M. et al.). The researchers combined 454, Illumina, and Sanger sequencing data from both normalized and non-normalized cDNA libraries. The resulting transcriptome comprises over 120,000 non-redundant transcripts. A homology search using BLAST led to the annotation of 38,000 transcripts. Importantly, they found 800 transcripts, whose protein-coding potential was validated by mass spectrometry, that either show no similarity to any known transcripts or show similarity only to urodele-specific EST sequences. Some of these transcripts belong to novel protein families.

It is an interesting hypothesis that some of those newt-specific proteins may provide mechanistic insights into regeneration processes unique to these animals. Their work will definitely be an important resource for subsequent studies in tissue regeneration and may benefit future research in regenerative medicine.

A Method to Increase Accuracy in Next Generation Sequencing

B) Duplex Sequencing workflow. Sheared, T-tailed double-stranded DNA is ligated to A-tailed adapters. Because every adapter contains a Duplex Tag on each end, every DNA fragment becomes labeled with two distinct tag sequences (arbitrarily designated α and β in the single fragment shown). PCR amplification with primers containing Illumina flow-cell–compatible tails is carried out to generate families of PCR duplicates. Two types of PCR products are produced from each DNA fragment. Those derived from one strand will have the α tag sequence adjacent to flow cell sequence 1 and the β tag sequence adjacent to flow cell sequence 2. PCR products originating from the complementary strand are labeled reciprocally.

Next-generation sequencing allows detection of minor variants in a heterogeneous sample. However, errors in PCR and sequencing pose limits on its sensitivity.
A group at the University of Washington developed a method, called Duplex Sequencing, that dramatically improves accuracy by sequencing both strands of each DNA duplex. Mutations that are detected in the consensus sequence of one strand but not the other are discounted as technical errors.

The authors adapted the method for Illumina sequencing. It uses modified adaptors that carry a tag of random sequence. After ligation of these modified adaptors, each duplex DNA fragment is flanked by two different tags and subjected to paired-end sequencing. Sequences that derive from the two complementary strands of the same duplex can therefore be uniquely identified because they carry the same pair of tags, in reciprocal order. Comparing the sequences of the two strands allows identification of true mutations. The authors estimated that Duplex Sequencing has a theoretical background error rate of less than one per 10^9 nucleotides sequenced.
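
To make the tag-based strand pairing described above more concrete, here is a toy sketch in Python. It is not the authors' pipeline. It assumes each read has already been reduced to a hypothetical (tag1, tag2, sequence) tuple, where tag1 is the tag next to flow cell sequence 1 and tag2 is the tag next to flow cell sequence 2, and that all sequences in a duplex are aligned, equal in length, and written in the same orientation. The minimum family size of 3 and the 90% agreement cutoff are illustrative parameters only.

```python
from collections import Counter, defaultdict


def strand_consensus(seqs, min_fraction=0.9):
    """Collapse one family of PCR duplicates into a single-strand consensus.

    A position is called only if at least min_fraction of the reads agree;
    otherwise it is masked with 'N'. The cutoff is an illustrative assumption.
    """
    consensus = []
    for column in zip(*seqs):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(column) >= min_fraction else "N")
    return "".join(consensus)


def duplex_consensus(reads, min_family_size=3):
    """Pair (tag1, tag2) families with their reciprocal (tag2, tag1) families.

    reads: iterable of (tag1, tag2, sequence) tuples (a hypothetical input format).
    Returns {(tag1, tag2): duplex consensus}, keeping a base only where both
    strand consensuses agree and masking everything else with 'N'.
    """
    families = defaultdict(list)
    for tag1, tag2, seq in reads:
        families[(tag1, tag2)].append(seq)

    # One consensus per strand, for families with enough PCR duplicates.
    single_strand = {
        tags: strand_consensus(seqs)
        for tags, seqs in families.items()
        if len(seqs) >= min_family_size
    }

    duplexes = {}
    for (tag1, tag2), strand_a in single_strand.items():
        strand_b = single_strand.get((tag2, tag1))
        # Skip families without a partner strand; handle each pair once.
        # Families with identical tags are ambiguous and are skipped too.
        if strand_b is None or tag1 >= tag2:
            continue
        duplexes[(tag1, tag2)] = "".join(
            a if a == b and a != "N" else "N" for a, b in zip(strand_a, strand_b)
        )
    return duplexes


if __name__ == "__main__":
    reads = (
        [("AAA", "CCC", "ACTT")] * 3    # one strand, carrying a PCR error at position 3
        + [("CCC", "AAA", "ACGT")] * 3  # complementary strand, no error
        + [("GGG", "TTT", "ACGT")] * 3  # only one strand observed
    )
    print(duplex_consensus(reads))
```

In the toy data, the change seen in only one strand family is masked rather than reported as a variant, which is the idea behind discounting single-strand mutations as technical errors.
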
Full text article can be accessed here:


The First Human Genome Exhibit at the Smithsonian Institution

Last week the Smithsonian Institution’s National Museum of Natural History in Washington, DC announced a new exhibit to celebrate the 10th anniversary of the completion of the human genome sequence. The project is a collaboration between the museum and the National Human Genome Research Institute (NHGRI), with major funding coming from the Life Technologies Foundation. It will open in 2013 to the museum’s 7 million annual visitors.


“The goal of the exhibition is not just to celebrate but to look ahead and acknowledge that we are in the early stages of a very exciting genomic era, that we have learned a remarkable amount about how the genome works and how it contributes to health and disease, and that the pace of research is only accelerating and becoming increasingly relevant to people,” said NHGRI Director Eric Green.
Also announced last week was a new grant program by the NHGRI to study newborn genome sequencing. It will provide $25 million to study how whole genome or whole exome sequencing will benefit newborn care as well as its social implications.
“Genome”, “genomics”, and similar terms used to be understood by few people outside of biology and bioinformatics. This is changing rapidly. Exciting times lie ahead as we witness the genomics revolution.

A Documentary on the Use of Sequencing Technologies in Medicine

A documentary titled “Cracking Your Genetic Code” was recently released, and it offers a glimpse of how genomics is transforming medicine. Prominent scientists are featured, including Francis Collins from the National Institutes of Health and Eric Lander from the Massachusetts Institute of Technology. The documentary introduces technologies for sequencing the entire genome, and Illumina is mentioned as one of the companies with such technology. We hear real-life stories in which genome sequencing and genotyping led to diagnoses and successful treatments that would not have happened otherwise.

The documentary not only presents the promise of personalized medicine but also raises important questions about how ready society is to adopt this new form of medicine. There are always pros and cons to introducing new technologies into our daily lives. It is a matter of engaging and educating the public so that society as a whole can make informed decisions during this healthcare revolution. “Cracking Your Genetic Code” is a great introduction to the new role of sequencing in medicine, and I hope you will share it with your colleagues and friends!