Prepare NGS for clinical use

Molecular diagnostics (MDx) is, in my opinion, the most sensitive application for all kinds of molecular biology techniques such as PCR, Sanger sequencing or Next Generation Sequencing (NGS). Today, NGS is still a niche application and needs further improvement before it becomes a common tool for MDx. One thing that is lacking is the standardisation of NGS for clinical use.

The NGS Working Group, established by the Friends of Cancer Research, worked out a master plan (The ASCO Post) listing critical points that need to be addressed before NGS can be used more widely:

1. Define a regulatory pathway for cancer panels (a selection of multimarker gene assays) intended to identify actionable oncogenic alterations (those with supporting data to create risk-benefit assessment of treatment choice) that allow flexibility in the appropriate FDA medical device pathway—for instance, one based on risk classification of different panel components depending on the specific marker.

2. Approaches to validation studies should be based on the types of alterations measured by the assay rather than on every alteration individually.

3. Determine the contents of a cancer panel by classifying potential markers based on current utility in clinical care and clinical trials and peer-reviewed publications, as well as recognized clinical guidelines. Draw upon various sources to determine the recommended marker set for an actionable cancer panel.

4. Promote standardization of cancer panels through development and use of a common set of samples to ensure reproducibility on each platform.

5. Establish a framework for determining an appropriate reference method rather than relying on any single method for all studies.

Get more information on each proposal here.

Whole Genome Sequences Of World’s Oldest Living People Published

Researchers looked at the genomes of some of the oldest living people. While they did not find a significant association with extreme longevity, the researchers published their genome findings. At least the data will be available as a resource for future researchers looking at the “genetic basis” of longevity.

There are 74 supercentenarians (110 years or older) alive worldwide, with 22 living in the United States. The authors of this study performed whole genome sequencing on 17 of them to explore the genetic basis underlying extreme human longevity.

“We were looking for a really simple explanation in a single gene,” said Stuart K. Kim, a Stanford geneticist and molecular biologist. “And we know now that it’s a lot more complicated, and it will take a lot more experiments and a lot more data from the genes of more supercentenarians to find out just what might account for their ages.”

Given the limited sample size, the researchers were not able to find protein-altering variants associated with extreme longevity, according to a study in PLOS ONE by Hinco Gierman from Stanford University and colleagues, published November 12, 2014. But they did find that one supercentenarian had a genetic variant related to a heart condition, which had very little effect on his health considering that he reached such an advanced age. The researchers noted that the American College of Medical Genetics and Genomics recommends reporting such a variant as an incidental finding.

The whole genome sequences of all 17 supercentenarians are now available as a public resource so that they can be used to assist the discovery of the genetic basis of extreme longevity in future studies.

 

Compare to: Large genome sequencing studies in the USA (posted August 26, 2014)

Whose genome has been sequenced? Brassica napus

Brassica napus, also known as oilseed rape, was formed more than 7000 years ago by allopolyploidy (chromosome doubling from two Brassica species). Of course the genome mutated further, and it is known today that during this evolution some genes were preserved and further “improved” (e.g. oil biosynthesis genes), whereas others were lost over the course of time (e.g. glucosinolate genes).

Chalhoub et al. have now sequenced the genome, because it can help to “provide insights into allopolyploid evolution and its relationship with crop domestication and improvement” (Chalhoub et al.).

What was sequenced?

Young fresh leaves from the Brassica napus French homozygous winter line “Darmor-bzh”.

Sequencing strategy: Whole genome sequencing

  1. Libraries & Sequencing:
    Roche GS FLX: ~70 million reads, average read length: ~368 bp, genome coverage: 21.2%
    Sanger BAC Seq: 141k reads, read length: 650 bp, genome coverage: 0.1%
    Illumina HiSeq: ~375 million reads, read length: 36, 76, 108 and 150 bp, genome coverage: 53.9%
  2. Data output: 44,146 contigs and 20,702 scaffolds
  3. Results: A final assembly of 849.7 Mb (using SOAP and Newbler) with 89% nongapped sequences.

After assembly, the genome was compared with those of other species (e.g. B. rapa and B. oleracea). This helped to find several interesting genes and gene variations that give a better understanding of the evolution of B. napus.

Read the complete publication here.


Think Big: The UK 100,000 Genome Project

In late 2012 the 100,000 genome project was launched. UK Prime Minister David Cameron announced a new initiative led by the National Health Service to sequence the genomes of up to 100,000 people and to use their genomic information in treatment and studies of cancer and other diseases. The government set aside 100 million GBP for this project.

Genomics England, which is heading the project, has now named 10 firms that have been selected for assessment in the next phase of the project. The companies are Congenica; Diploid; NantOmics; Genomics Ltd.; Illumina; Qiagen; Lockheed Martin; NextCode Health; Omicia; and Personalis.

As part of the recently completed stage, Genomics England in February sent out a questionnaire to 28 participants in relation to 10 cancer/normal samples and 15 rare disease trio samples.

Illumina is partnering as well and will contribute its ultra-high-throughput sequencing platform, the HiSeq X™ Ten.

What will be the next step? Sequencing everyone?

More Updates: Illumina & IonTorrent

Quarter 4 of 2014 seems to be another exciting one for Next Generation Sequencing. Besides the chemistry update for the PacBio RS II, Illumina and IonTorrent / ThermoFisher also announced two major improvements / achievements:

  • Chemistry update for the Illumina HiSeq X Ten and the HiSeq 2500 Rapid Run
    The new v2 reagent kit for the HiSeq X Ten supports a PCR-free sample preparation kit, which eliminates amplification during library preparation. So far only sample preparation kits with PCR were supported, which sometimes resulted in lower data quality in challenging genomic regions.
    The new v2 reagent kit for the HiSeq 2500 enables users to sequence 2x 250 bp and the new chemistry therefore delivers up to 300 Gbp of data in only 60 hours. (Press Release)
    In my opinion Illumina proves once more that NGS is highly dynamic and that continuous updates for existing systems are the key to their success (the latest financial report confirms that Q3 of 2014, with a growth of 10%, is the strongest quarter for Illumina since 2011 (Fierce Medical Devices)).
  • IonTorrent goes diagnostic
    The Ion PGM Dx System is now also CE-Marked for in vitro diagnostic (IVD) use in Europe. Thermo Fisher Scientific believes that the CE-mark “will enable European clinical laboratories to more easily […] implement new […] diagnostic assays” (Press Release).
    In September they had already announced that the PGM is now listed with the U.S. FDA as a Class II Medical Device.
    In my opinion the clearance for diagnostic use in Europe as well as in the U.S. will further strengthen the position of the Ion PGM in clinical laboratories.

PacBio launches new chemistry and software

In a press release Pacific Biosciences announced the latest enhancement for the PacBio RS II single molecule DNA sequencer. The new polymerase 6 and chemistry 4 (P6 – C4) release, in combination with improved software, enhances the performance and output of the platform by 45%. The average read length is now 10,000 – 15,000 bases, with the longest reads reaching up to 40,000 bases. Depending on the nature of the DNA, a single SMRT cell will deliver 500 million to 1 billion bases.
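To get a feeling for what the per-cell yield quoted above means in practice, here is a minimal sketch that estimates how many SMRT cells a project would need. The genome size and target coverage are hypothetical values chosen only for illustration, and the function name is my own; only the 500 million to 1 billion bases per cell comes from the press release.

```python
def smrt_cells_needed(genome_size_bp: int, target_coverage: int, yield_per_cell_bp: int) -> int:
    """Number of SMRT cells needed to reach the target coverage (rounded up)."""
    total_bases_needed = genome_size_bp * target_coverage
    # Integer ceiling division, so a partially used cell counts as a full one.
    return -(-total_bases_needed // yield_per_cell_bp)


# Hypothetical project: a 100 Mbp genome at 50x coverage (both values are assumptions).
genome_size = 100_000_000
coverage = 50

# Yield range per SMRT cell as quoted in the press release.
worst_case = smrt_cells_needed(genome_size, coverage, 500_000_000)
best_case = smrt_cells_needed(genome_size, coverage, 1_000_000_000)

print(f"SMRT cells needed: {best_case} to {worst_case}")
```

For this hypothetical project the estimate is 5 to 10 SMRT cells. Real yields depend on library quality and loading, so this is only a rough planning figure.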

The new chemistry will replace the current P5 – C3 chemistry and is recommended for all SMRT sequencing applications.

This new release also includes improvements to the SMRT Analysis software suite for long amplicon analysis and the Iso-Seq™ method. Together with chemistry enhancements, these advances boost accuracy, speed up analysis, and support sequencing of multiplexed amplicons of different sizes.

Do you want to share your biggest secret?

Should we all get our genome sequenced? And share the information? Just today I read two articles in GenomeWeb regarding human genome sequencing which, in my opinion, take opposite views on sharing information from human genomes.

The first article is about the 23andMe project. Two different groups of people reported that ticking the “check for close relatives” box led to a real crisis in their family. In one case the parents divorced after the close relatives feature showed that the husband already had a child with another woman (from before this marriage). In the other case a girl found out that she has a brother, whom her mother had given up for adoption.

So for me this is a clear indicator that simply sharing the genome information might really cause more problems than it can solve.

George Church asks for exactly the opposite. From his point of view, public access to as many genomes (human and non-human) as possible is a prerequisite for eradicating diseases, creating unlimited energy sources and so on.

I think I could partially agree with that if we talk about bacterial or plant genomes. But I think we are not ready for a wide sharing of human genome information.

What also became clear to me is that we have not come much further than we were 2 years ago (Genomics – A Curse Or A Blessing?).

Genome sequencing identified Jack the Ripper

The murders by Jack the Ripper are very likely the best-known crime series in the world. The London police had six key suspects for the murders, and one of them could now be identified as the killer (MailOnline).

The piece of evidence used to identify the murderer was a shawl found by one of the victims, which contained DNA from the victim as well as from the suspect. Using a whole genome sequencing approach, Dr. Louhelainen and his group extracted the 126-year-old DNA and compared it with that of descendants of the suspect. Read the complete article at DailyMail Online.

Are you ready to have your genome sequenced?

Last month we asked if you would be interested in sequencing your genome. The majority said “YES”, provided the costs were lower.

More than 20% answered that their genome has already been sequenced. Personally, I would be very interested to know what they did with the data output.

 

If you are one of those who voted “I already have”, please leave a comment explaining why you decided to have your genome sequenced.

Whose Genome Has Been Sequenced? Belgica antarctica

Extreme conditions require extreme actions. And this is what the midge Belgica antarctica has done. The midge lives exclusively in the Antarctic and, in order to survive, shrank its genome to the smallest possible size. As of today, this is the smallest insect genome that has been sequenced.

Kelley et al. have now sequenced the genome of Belgica antarctica with the aim of learning more about how insects in general can adapt to the most extreme conditions.

What was sequenced?

Two fourth-instar larvae (Belgica antarctica) collected near Palmer Station, Antarctica.

Sequencing strategy: Whole genome sequencing & RNA-sequencing

  1. Libraries & Sequencing: one channel of 2x 100 bp on the Illumina HiSeq 2000 (shotgun library, 400 bp insert) and one SMRT cell of a 10 kb fragment library on the PacBio RS II (P4 DNA polymerase)
  2. Data output: 92 M paired-end reads from the shotgun sequencing with Illumina. These resulted in 5,422 contigs. Using the paired-end RNA-Seq data, the number of contigs was reduced to 5,064. Genome coverage with Illumina sequencing: ~100x.
  3. Results: The total genome is ~99 Mbp.
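The coverage figure in the list above can be checked with a quick back-of-the-envelope calculation: coverage is roughly the number of reads times the read length, divided by the genome size. The sketch below assumes that “92 M paired-end reads” counts individual 100 bp reads (if it counted read pairs, the result would double). This is my own reading of the numbers, not something stated in the publication.

```python
def estimated_coverage(num_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average sequencing depth: total sequenced bases divided by genome size."""
    return num_reads * read_length_bp / genome_size_bp


# Values taken from the list above; treating 92 M as individual reads is an assumption.
illumina_reads = 92_000_000
read_length = 100          # bp, from the 2x 100 bp run
genome_size = 99_000_000   # bp, the ~99 Mbp assembly

coverage = estimated_coverage(illumina_reads, read_length, genome_size)
print(f"Estimated coverage: {coverage:.0f}x")
```

Under this assumption the estimate comes out at about 93x, which fits the reported ~100x.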

For the PacBio sequencing a second larva was used. But due to the low input of genomic DNA, the PacBio data yielded only a modest improvement in the assembly. This underlines the need for a long-read sequencing technology that works with low amounts of input DNA.

The de novo sequencing of the midge Belgica antarctica revealed that the small genome size is achieved by a reduction in repeats, transposable elements (TEs) and intron size.

Read the complete publication here.
