Whose Genome Has Been Sequenced? Belgica antarctica

Extreme conditions require extreme measures, and that is exactly what the midge Belgica antarctica has taken. The midge lives exclusively in the Antarctic and, in order to survive, has drastically shrunk its genome. As of today, it is the smallest insect genome that has been sequenced.

Kelley et al. have now sequenced the genome of Belgica antarctica with the aim of learning more about how insects in general can adapt to the most extreme conditions.

What was sequenced?

Two fourth-instar larvae of Belgica antarctica, collected near Palmer Station, Antarctica.

Sequencing strategy: Whole genome sequencing & RNA-sequencing

  1. Libraries & Sequencing: 1 channel 2x 100 bp Illumina HiSeq 2000 (SG library (400 bp insert)) and one SMRT-cell of a 10 kb fragment library on PacBio RSII (P4 DNA Polymerase)
  2. Data output: 92 M paired-end reads from the Illumina shotgun sequencing, which assembled into 5,422 contigs. Using the paired-end RNA-Seq data, the number of contigs was reduced to 5,064. Genome coverage with Illumina sequencing: ~100x.
  3. Results: The total genome is ~ 99 Mbp.
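As a rough plausibility check of the coverage figure above, mean depth can be estimated as total sequenced bases divided by genome size. The sketch below uses the numbers from the post and assumes that "92 M paired-end reads" means 92 million individual 100 bp reads; the function name is purely illustrative.

```python
def estimated_coverage(num_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Mean sequencing depth: total sequenced bases divided by genome size."""
    return num_reads * read_length_bp / genome_size_bp


# Figures taken from the post. Counting 92 M as individual 100 bp reads
# (rather than 92 M read pairs) is an assumption.
coverage = estimated_coverage(
    num_reads=92_000_000,
    read_length_bp=100,
    genome_size_bp=99_000_000,
)
print(f"Estimated coverage: {coverage:.0f}x")
```

Under that assumption the estimate lands close to the reported ~100x; counting 92 M read pairs instead would roughly double it.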

A second larva was used for the PacBio sequencing. However, due to the low input of genomic DNA, the PacBio data yielded only a modest improvement in the assembly. This underlines the need for a long-read sequencing technology that works with low amounts of input DNA.

The de novo sequencing of the midge Belgica antarctica revealed that the small genome size is achieved by a reduction in repeats, transposable elements (TEs) and intron size.

Read the complete publication here.


A major upgrade of SAMTools: CRAM format to reduce NGS data load

SAMTools, one of the most popular NGS sequence analysis tools, has recently been upgraded by computer scientists at the Wellcome Trust Sanger Institute. SAMTools is a set of utilities for manipulating alignments in the SAM/BAM format. SAM stands for Sequence Alignment/Map format, whereas BAM is simply the binary form of SAM. SAM can be seen as the worldwide standard for storing large nucleotide sequence alignments.

SAMTools 1.0, the revised version of the free program suite, now gives researchers improved handling of their sequencing data. In addition to the existing SAM and BAM file formats, SAMTools now supports the new CRAM format. Basically, CRAM files are alignment files just like BAM files, except that their size is reduced by 10-30%. Even greater compression (up to 100-fold) can be achieved in the “lossy” mode, which still preserves the most important information. The storage savings that CRAM offers were achieved by incorporating data compression techniques developed jointly by the Sanger Institute and the EMBL-European Bioinformatics Institute.
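For readers who want to try the format, here is a minimal sketch of how a BAM-to-CRAM conversion could be invoked from Python. It only builds the command line: `samtools view` with `-C` (write CRAM), `-T` (reference FASTA) and `-o` (output file). The helper name and the file names are placeholders, not part of SAMTools, and the lossy mode mentioned above is not covered.

```python
import shlex


def bam_to_cram_command(bam_path: str, reference_fasta: str, cram_path: str) -> list[str]:
    """Build a samtools command that rewrites a BAM file as CRAM.

    CRAM compression is reference-based, so the FASTA the reads were
    aligned against has to be supplied via -T.
    """
    return ["samtools", "view", "-C", "-T", reference_fasta, "-o", cram_path, bam_path]


# Placeholder file names, for illustration only.
cmd = bam_to_cram_command("sample.bam", "reference.fa", "sample.cram")
print(shlex.join(cmd))
```

With samtools installed and the files in place, the list could be passed to `subprocess.run(cmd, check=True)`.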

“This major rebuild of SAMTools reflects our commitment to supporting the global use of sequencing data,” says Dr Richard Durbin, Head of Computational Genomics at the Sanger Institute. “Genome science worldwide relies on fast and efficient data analysis and storage, and SAMTools 1.0 fulfills this need by supporting new sequencing and analysis technologies”. Dr. John Marshall from the Sanger Institute is highly optimistic that the widespread uptake of the new format will lead to lower data storage costs on a global scale (complete article).

I am curious to see how the new format is going to be adopted by the genomics community. By the way, did you know that SAMTools has been downloaded more than 225,000 times?

Quality Before Quantity?

The “$1,000 genome” is, to my knowledge, the buzzword everyone thinks of when it comes to next-generation sequencing.

And quite often I ask myself: What will be the future of NGS? Whole genome sequencing of everyone and everything?

I am confident that this is part of Illumina’s strategy for their new HiSeq X Ten instruments, at least for humans.

On the other hand, there is still all the data that needs to be analysed, and an interview with Lex Nederbragt highlights that data analysis remains a bottleneck. In addition, the latest Markets&Markets report on whole exome sequencing predicts strong growth in targeted sequencing. They estimate that the whole exome sequencing market will grow from $326.6 million in 2013 to $884.1 million by 2018.

So will quality, such as sequencing distinct regions, outcompete the $1,000 genome? What are your thoughts about that?


Large genome sequencing studies in the USA

The launch of the Illumina HiSeq X Ten has enabled two highly influential people in the United States to put their plans and great visions into practice. This year, Dr. J. Craig Venter, known for being one of the first to sequence the human genome, and Patrick Soon-Shiong, considered the world’s richest doctor, both revealed some details about their large-scale sequencing projects:

  • J. Craig Venter founded the company Human Longevity, which aims to develop treatments for cancer and age-related conditions. To unveil the mechanisms that allow people to live long and healthy lives, the company will become one of the largest DNA sequencing facilities in the world. The plan is to set up a sequencing center capable of sequencing 40,000 human genomes a year.
  • Patrick Soon-Shiong recently announced that his company NantHealth is purchasing sequencers capable of sequencing 22,000 genomes annually. The samples will come from the 22,000 patients diagnosed with cancer each year at the 34 hospitals owned by Providence Health & Services. Consequently, Soon-Shiong’s company will probably become the first to use its sequencing capacity for clinical sequencing on a large scale.

Such huge collections of sequencing data make it possible to take a general approach to uncovering the molecular causes of a process as complex as aging or a disease as diverse and complex as cancer. Large and very valuable databases will be created that may contribute to the development of new pharmaceuticals or personalized therapies.

The future of miRNA analysis

We asked you which technology you see as the future of miRNA analysis.
Here is how the 102 participants voted:

[Image: miRNA poll results]

Unexpected Heroes

Several mutations are known to be linked to childhood diseases. This knowledge is already being used, e.g. to analyze the genomes of sick newborns for known diseases, or for prenatal diagnostics. However, a person carrying such a mutation will not necessarily become ill.

Some individuals carry a mutation that should have caused a severe disease in their childhood. However, some yet unknown factors have protected them from getting ill. Even though they may be very rare, studying such persons may help to understand more about the diseases, or even find new treatments.

Researchers of the “Resilience Project” are now looking for such individuals, whom they call “unexpected heroes”: adults who are “resilient to a certain rare disease despite carrying genetic mutations that would indicate onset of the disease in childhood.” In order to find those rare individuals, they are asking volunteers to donate DNA samples for the project. Since they expect only 1 in 20,000 individuals to be such an “unexpected hero”, they need to analyze the genomes of more than 100,000 individuals. Participants can register online and will receive a test kit by mail. In return, each volunteer gets a report indicating whether any of the analyzed mutations have been found in his or her genome.
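To see why the project needs such a large cohort, here is a small back-of-the-envelope sketch based on the two figures quoted above. It assumes that each volunteer independently has the same 1-in-20,000 chance of being an “unexpected hero”, which is a simplification and not a statement about the project’s actual statistics; the function names are illustrative.

```python
def expected_hits(sample_size: int, frequency: float) -> float:
    """Expected number of resilient individuals in the cohort."""
    return sample_size * frequency


def prob_at_least_one(sample_size: int, frequency: float) -> float:
    """Probability of finding at least one resilient individual,
    assuming every volunteer is an independent draw."""
    return 1.0 - (1.0 - frequency) ** sample_size


frequency = 1 / 20_000   # "1 in 20,000", from the post
cohort = 100_000         # "more than 100,000 individuals", from the post

expected = expected_hits(cohort, frequency)
p_any = prob_at_least_one(cohort, frequency)
print(f"Expected heroes: {expected:.1f}")
print(f"Chance of at least one: {p_any:.1%}")
```

Under these assumptions, a cohort of 100,000 would be expected to contain only about five such individuals, which illustrates why a much smaller study would be likely to find none.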

The researchers hope to identify genes that can “buffer” the effects of the mutations, as well as environmental factors which help people carrying the mutations to stay healthy. The goal is to find new treatments, or even prevent people from getting ill at all.

Note: NGS in Diagnostic Testing

Yes, this amazing technology is no longer just a tool for basic research, but has made its way into routine clinical testing. Currently it is all about exome sequencing and targeted gene panel analysis, but whole genome sequencing is expected to enter clinical routine soon. Have a read through this comprehensive article, which describes very nicely which applications are suitable for diagnostic testing and which may come in the future.

Read the article about NGS in diagnostic testing

Why is Illumina so successful? Watch an interview with Illumina’s CEO

In the 2nd quarter of 2014 Illumina reported adjusted earnings of 57% per share, most probably the biggest increase in the company’s history. Watch this interview with the CEO of Illumina, Jay Flatley, to learn more about the reasons for Illumina’s success.

 

Epigenetic study confirms: Tobacco addiction during pregnancy

In the morning paper I found a very interesting article by Kathrin Zinkant about smoking during pregnancy (Sueddeutsche Zeitung, Wissen, July 31 2014). It has long been known that smoking during pregnancy is taboo. However, an estimated 5% to 10% of pregnant women in Germany still smoke, many of them because they are not aware of the pregnancy in the first trimester. Tobacco toxins can cause significant harm. Known consequences are reduced birth weight, impaired lung function and unusual behavior.

In the world’s largest study of the consequences of smoking during the first trimester of pregnancy, the DNA methylation status of almost 900 newborn babies was studied and compared with the DNA methylation of babies whose mothers did not smoke. The study clearly showed that the methylation status differed between the two groups. Methylation can alter the activity of genes, up to complete silencing. There is evidence that such methylation patterns can be passed on to later generations.

The affected genes include known developmental genes as well as genes that are involved in tobacco addiction. This confirms the suspicion that tobacco addiction may already be induced during pregnancy. Apart from the fact that women should quit smoking before they become pregnant (or, better, not smoke at all), it also has to be considered that second-hand smoke is a permanent danger to the health of unborn babies, children and adults.

Update on NGS and Clinical Validation

There is an increasing demand for the development of regulated next-generation sequencing based diagnostic tests. The review that I would like to draw your attention to thoroughly discusses all the challenges and issues that arise when developing NGS-based diagnostic tests or even companion diagnostics (CDx). The experts from the Merck Research Laboratories take everything into account, from the choice of platform and bioinformatics through to the regulatory approval process.

Have a read, it’s really worth it!

http://journal.frontiersin.org/Journal/10.3389/fonc.2014.00078/full