Unexpected Heroes

Image courtesy of FreeDigitalPhotos.net

Several mutations are known to be linked to childhood diseases. This knowledge is already being used, for example, to screen the genomes of sick newborns for known diseases, or for prenatal diagnostics. However, a person carrying such a mutation will not necessarily fall ill.

Some individuals carry a mutation that should have caused a severe disease in their childhood, yet as yet unknown factors have protected them from falling ill. Even though such people may be very rare, studying them may help us understand more about the diseases, or even find new treatments.

Researchers of the “Resilience Project” are now looking for such individuals, whom they call “unexpected heroes”: adults who are “resilient to a certain rare disease despite carrying genetic mutations that would indicate onset of the disease in childhood.” To find these rare individuals, they are asking volunteers to donate DNA samples for the project. Since they expect only 1 in 20,000 individuals to be such an “unexpected hero”, they need to analyze the genomes of more than 100,000 individuals. Participants can register online and will receive a test kit by mail. In return, each volunteer gets a report indicating whether any of the analyzed mutations have been found in his or her genome.
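
To get a feeling for why such a large cohort is needed, here is a minimal back-of-the-envelope sketch in Python. It uses only the two figures quoted above (1 in 20,000 and 100,000 genomes) and assumes, purely for illustration, that volunteers are independent random draws from the population. The function names are my own and not part of the project.

```python
# Rough cohort arithmetic for the "unexpected heroes" search.
# Assumption (not from the project): each volunteer is an independent
# draw with the same chance of being a resilient carrier.

HERO_FREQUENCY = 1 / 20_000   # expected frequency quoted in the post
COHORT_SIZE = 100_000         # minimum number of genomes quoted in the post


def expected_heroes(cohort_size: int, frequency: float) -> float:
    """Expected number of resilient carriers in the cohort."""
    return cohort_size * frequency


def prob_at_least_one(cohort_size: int, frequency: float) -> float:
    """Probability of finding at least one resilient carrier."""
    return 1.0 - (1.0 - frequency) ** cohort_size


if __name__ == "__main__":
    n_expected = expected_heroes(COHORT_SIZE, HERO_FREQUENCY)
    p_any = prob_at_least_one(COHORT_SIZE, HERO_FREQUENCY)
    print(f"Expected resilient carriers: {n_expected:.1f}")
    print(f"Chance of finding at least one: {p_any:.3f}")
```

Under these assumptions, 100,000 genomes would yield only about five resilient individuals on average, which shows why the project cannot work with a small cohort.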

The researchers hope to identify genes that can “buffer” the effects of the mutations, as well as environmental factors which help people carrying the mutations to stay healthy. The goal is to find new treatments, or even prevent people from getting ill at all.


Note: NGS in Diagnostic Testing

Yes, this amazing technology is no longer just a tool for basic research, but has made its way into routine clinical testing. Currently it is all about exome sequencing and targeted gene panel analysis, but whole genome sequencing is expected to enter clinical routine soon. Have a read through this comprehensive article, which nicely describes which applications are suitable for diagnostic testing today and which may follow in the future.

Read the article about NGS in diagnostic testing

Why is Illumina so successful? Watch an interview with Illumina’s CEO

In the 2nd quarter of 2014, Illumina reported adjusted earnings of 57% per share, most probably the biggest increase in the company’s history. Watch this interview with the CEO of Illumina, Jay Flatley, to learn more about the reasons for Illumina’s success.


Epigenetic study confirms: Tobacco addiction during pregnancy

Image courtesy of FreeDigitalPhotos.net

In the morning paper I found a very interesting article by Kathrin Zinkant about smoking during pregnancy (Sueddeutsche Zeitung, Wissen, July 31, 2014). It has long been known that smoking during pregnancy is taboo. However, an estimated 5% to 10% of pregnant women in Germany still smoke, many of them because they are not aware of the pregnancy in the first trimester. Tobacco toxins can cause significant harm. Known consequences are reduced birth weight, impaired lung function and unusual behavior.

In the world’s largest study of the consequences of smoking during the first trimester of pregnancy, the DNA methylation status of almost 900 newborn babies was studied and compared with the DNA methylation of babies whose mothers did not smoke. The study clearly showed that the methylation status differed between the two groups. Methylation can alter the activity of genes, up to and including complete silencing. There is evidence that such methylation patterns can be passed on to later generations.

The affected genes include known developmental genes as well as genes that are involved in tobacco addiction. This confirms the suspicion that tobacco addiction may already be induced during pregnancy. Apart from the fact that women should quit smoking before they become pregnant (or better, not smoke at all), it must also be kept in mind that second-hand smoke is a permanent danger to the health of unborn babies, children and adults.

Update on NGS and Clinical Validation

There is an increasing demand for the development of regulated diagnostic tests based on next-generation sequencing. The review that I would like to draw your attention to thoroughly discusses all the challenges and issues that arise when developing NGS-based diagnostic tests or even companion diagnostics (CDx). The experts from the Merck Research Laboratories take everything into account, from the choice of the platform and the bioinformatics through to the regulatory approval process.

Have a read, it’s really worth it!


Data analysis – still a bottleneck!

With so many NGS machines in the field, tremendous amounts of sequencing data are produced every day. However, at the end of the day, all the data have to be analyzed and interpreted. In many cases, this step is still a bottleneck.

Please watch the video below, an interview on this topic with Lex Nederbragt, bioinformatician at the Norwegian High-Throughput Sequencing Centre in Oslo. He discusses the fact that the available analysis tools do not fully meet the needs of researchers. In this context, he also discusses the use of open source and commercial software tools.

Lex Nederbragt discussing software bottlenecks and lack of flexible reference genomes from NGS Perspectives on Vimeo.

100,000, 40,000, 25,000, 19,000 – the shrinking human genome…

Many of you will surely remember old textbooks in which the total number of genes in the human genome was estimated at around 40,000 to 100,000. After the human genome was sequenced, this number shrank to 26,000 to 40,000 genes. The 19th GENCODE release further reduced it to 20,318 protein-coding genes. But that is not all: a recent study suggests that the actual number of protein-coding genes in humans lies around 19,000.

This astonishing result was obtained by analyzing the data from seven large mass spectrometry (MS)-based proteomics studies covering more than 50 human tissues.

But the shrinking number of genes is not the only remarkable result. Below are the most important findings of this study, as described in a recent ScienceDaily post:

  • Close to 12,000 human genes could be unambiguously identified
  • Despite high coverage from seven analyses, 40% of the peptides from the human gene set could not be detected. Possible reasons:
    • Thousands of genes annotated in the human genome did not appear in the proteomics analysis
    • Apparently 1,700 genes that were previously thought to produce proteins most certainly don’t
  • Another hypothesis is that more than 90% of human genes produce proteins originating in metazoans or multicellular organisms living hundreds of millions of years ago
  • The difference between humans and primates at the gene and protein level is very small
  • “The number of new genes that separate humans from mice may even be fewer than 10”
  • Physiological and developmental differences between primates are more likely caused by gene regulation than by differences in the basic functions of proteins in question

Alfonso Valencia, the main researcher behind this project, states that “the human genome is best annotated, but we still believe that 1,700 genes may have to be re-annotated”.

According to Alfonso Valencia these results may redefine the entire mapping of the human genome.

The Common Marmoset as a Model Organism for the Study of Drug Metabolism


Several non-human primates, including Macaca mulatta and Macaca fascicularis, are well known as experimental animals in neuroscience, stem cell research, drug toxicology, and other fields. The common marmoset (Callithrix jacchus) is also a non-human primate and is well suited as an experimental animal because of its small size and high fecundity.

To develop a drug metabolism model, our collaborators and Eurofins Genomics (2014) performed transcriptome analysis of the common marmoset using long-read technology (Roche GS FLX+) and short-read sequencing (Illumina HiSeq 2000) in parallel. This parallel NGS approach allowed both the identification and the quantitative analysis of transcripts, giving insight into gene expression during drug metabolism. In the end, we obtained rich information about genes involved in drug metabolism, including 18 cytochrome P450 genes and 4 flavin-containing monooxygenase (FMO)-like genes, and their tissue-specific expression patterns.

The results of this study are the foundation for future studies not limited to drug metabolism & pharmacokinetics.

First Oxford Nanopore MinION data available: Is this the end of PacBio?

Last week, researchers from the University of Birmingham in the UK publicly released data they generated with Oxford Nanopore Technologies’ MinION nanopore sequencer. They are the first group to do so since the company started its early access program this spring (see In Sequence report).

The sequence is derived from a Pseudomonas aeruginosa genome and is a single 8.5 kilobase read. It was posted by Nick Loman from the Institute of Microbiology and Infection at the University of Birmingham. It was possible to identify the serotype as O6. The sequence can be found here. The read is of low quality, with 71% identity across the spanned region.

Konrad Paszkiewicz, director of the Wellcome Trust Biomedical Informatics Hub and head of the sequencing service at Exeter, has been writing about the group’s experience on the Exeter Sequencing Service’s blog. “Even at this stage, this platform has the potential to steal large chunks out of the market from the likes of PacBio,” Paszkiewicz said.

We will have to wait for more data to see how useful the technique will be and how well it can compete against other nanopore sequencers, e.g. the device from Genia, which was recently acquired by Roche.

Improvement of PacBio ZMW loading procedure by DNA Origami?

Since the launch of the PacBio system in 2011, there has been a constant development and improvement of the methods involved (e.g. former posts here).

However, efficient loading of the Zero-Mode Waveguides (ZMWs) with polymerase molecules still remains a challenge. The ZMWs are tiny wells in which the actual sequencing reactions take place. Each SMRT cell consists of 150,000 ZMWs. However, with current methods, only about one third of the ZMWs are actually usable after loading. The polymerase molecules are loaded into the ZMWs by simple diffusion, resulting in ZMWs which carry one, more than one, or no polymerase molecule. As a consequence, each SMRT cell typically generates only approx. 50,000 reads per run.
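
Why roughly one third? A common way to reason about random loading is a Poisson model, and the short Python sketch below illustrates it. Note that the Poisson assumption is mine and is not stated by PacBio or in the InSequence report; the only figure taken from the text is the 150,000 ZMWs per SMRT cell. The parameter `mean_load` (average number of polymerases per ZMW) is a hypothetical tuning knob.

```python
import math

ZMWS_PER_CELL = 150_000  # figure quoted in the post


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events for a Poisson distribution."""
    return mean ** k * math.exp(-mean) / math.factorial(k)


def occupancy(mean_load: float) -> tuple[float, float, float]:
    """Fractions of ZMWs holding zero, exactly one, or several polymerases.

    Assumes (for illustration only) that diffusion loading is Poisson
    distributed with an average of `mean_load` molecules per ZMW.
    """
    empty = poisson_pmf(0, mean_load)
    single = poisson_pmf(1, mean_load)
    multiple = 1.0 - empty - single
    return empty, single, multiple


if __name__ == "__main__":
    for mean_load in (0.5, 1.0, 2.0):
        empty, single, multiple = occupancy(mean_load)
        usable = round(single * ZMWS_PER_CELL)
        print(
            f"mean load {mean_load:.1f}: empty {empty:.1%}, "
            f"single {single:.1%}, multiple {multiple:.1%}, "
            f"usable ZMWs ~{usable}"
        )
```

In this model the fraction of singly loaded ZMWs peaks at 1/e (about 37%) when the average load is one molecule per well, which fits the "about one third" observed in practice. Loading more or less polymerase only shifts wells into the "multiple" or "empty" class, so an approach that physically limits each well to one molecule is needed to go beyond this ceiling.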

A group of researchers from the Technical University of Braunschweig, Germany, has now used “DNA Origami” in order to efficiently place molecules into ZMWs.

DNA origami is a fascinating technique which uses the unique properties of DNA to create nanostructures by “folding” DNA into the required shapes. A ground-breaking article on DNA origami was published by Paul Rothemund in 2006.

The researchers from Braunschweig have now created “nanoadapters” which exactly fit the size of the ZMWs. As a consequence, there cannot be more than one molecule in a ZMW. The nanoadapters carry a fluorescent dye on top and biotin molecules on the bottom side. These biotin molecules serve to fix the nanoadapters to the bottom of the ZMW via neutravidin. In principle, the fluorescent dye could be replaced by a polymerase molecule. This approach greatly increased the loading efficiency, to approx. 60 percent.

However, according to InSequence, the research group did not cooperate with PacBio on this project. In parallel, PacBio is working on other methods to increase the loading efficiency of its SMRT cells. But I am sure that there will be (and has to be) an improvement soon, whatever the method.

Image credit: “OrigamiStar-BlackPen” by Aldaron. From JillsArt, posted with permission. Licensed under Attribution via Wikimedia Commons.