
Summary from 4th Next Generation Sequencing Congress 2012 – Part 2

Dear all,

Here is my second summary from the 4th NGS Congress, held at London Heathrow at the end of 2012. It covers some (hopefully) interesting new facts about sequencing with the PacBio RS, the second long-read technology currently on the market and the only system delivering reads longer than 10,000 bp…

Kevin Corcoran, Senior Vice President at Pacific Biosciences, gave an interesting talk about the most recent developments for the PacBio RS system and showed some detailed road maps of future aims and plans. One important item to mention is the launch of the new “XL Chemistry”; the “C2 Chemistry” can still be used as well. The other very interesting story is “Stage Start”, a new feature that starts sequencing detection for all molecules in parallel, similar to the well-known “hot start” technology for PCR. With it, detection begins at a defined position for most of the library molecules rather than somewhere in the middle. Last but not least, I’m very keen to see how the future “Photo Protected DNA Polymerases” develop, an idea that is really very, very next-next-generation…

First of all, I can summarize that applying the “XL Chemistry” looks really interesting, also with regard to the Eurofins MWG Operon focus on de novo sequencing and assembly. This new feature of the PacBio RS may also open doors to other types of applications, while the general need for extremely high data coverage may be reduced at the same time.

Currently “C2 Chemistry” is on the machine, and a 90 min movie may deliver about 20,000-50,000 reads and a data output of 30-50 Mb; of course, higher yields may be possible for “ideal” DNA samples. The average read length is about 3,000 bp (!), while the 95th percentile is about 8,000 bp. With the new “XL Chemistry” we got an average yield of about 40,000 reads per SMRT cell with an average read length of about 4,000 bp (roughly +30%). Overall, we are very pleased with these first results, especially since we see good potential to further increase data yields using the new software pipeline launched in parallel (Hierarchical Genome Assembly Process and Quiver).

[Picture 1: Kevin Corcoran, Pacific Biosciences, slide 16]

It is also important to mention the two different ways of working with the XL Chemistry. 1) “XL chemistry” for polymerase binding, but “C2 chemistry” for sequencing. This allows longer reads at the same quality (currently the single-read error rate is still 10% to 20%, about 15% on average). 2) “XL chemistry” for polymerase binding AND “XL chemistry” for sequencing. This can yield even longer reads, but unfortunately the error rate also increases by a few percentage points. This method is therefore recommended especially for de novo assembly or for finishing genomes.
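To get a feeling for why a single-read error rate of around 15% can still be workable once enough reads cover the same position, here is a minimal sketch. It assumes that errors are independent between reads and that the consensus is a simple majority vote; both are simplifications of mine, not a description of how Quiver or any PacBio software works. The function name is hypothetical.

```python
from math import comb


def majority_correct_probability(coverage: int, error_rate: float) -> float:
    """Probability that more than half of `coverage` reads are correct at one position.

    Assumption: each read is wrong at this position independently with
    probability `error_rate` (a toy model, not a real consensus caller).
    """
    p_ok = 1.0 - error_rate
    needed = coverage // 2 + 1  # strict majority of correct reads
    return sum(
        comb(coverage, k) * p_ok**k * error_rate ** (coverage - k)
        for k in range(needed, coverage + 1)
    )


if __name__ == "__main__":
    # 15% is the average single-read error rate quoted above.
    for cov in (1, 5, 15, 31):
        print(f"coverage {cov:>2}x: {majority_correct_probability(cov, 0.15):.6f}")
```

Under these assumptions the per-position accuracy climbs quickly with coverage, which is the intuition behind using long but error-prone reads for de novo assembly.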

[Picture 2: Kevin Corcoran, Pacific Biosciences, slide 52]

Finally, one real “next-next-gen” highlight was the presentation of a development at Pacific Biosciences that aims to protect the polymerase enzyme from being killed by the energy of the laser. A picture shows how this should work in principle: a kind of sunblock shields the enzyme from the laser light. This story was really fascinating for me, and I hope to see more in future than the very promising first results…

[Picture 3: Kevin Corcoran, Pacific Biosciences, slide 56]

So overall, Pacific Biosciences keeps moving very fast in 2013, and it will be very nice to see how all these improvements and new features enhance the overall data results of this fascinating very-long-read technology, which today already offers real single reads longer than 10,000 bp.

Cheers for now and see you in our next blog,
Axel

What Is In Your Genes?

Watch the presentation by SITN Boston on whole genome sequencing and its impact on personalised medicine.

Further recorded lectures, given by graduate students at Harvard and focusing on hot topics in scientific research and news, can be found at https://sitn.hms.harvard.edu/seminar-archive-2012/. Enjoy!

New Bid From Roche For Illumina?

The analyst and sequencing community is currently divided on whether to believe the rumors of a new bid from Roche to buy Illumina. The source of the controversy is an article in the Swiss newspaper L’Agefi, which reported at the end of December that Roche and Illumina might have agreed on a deal for Roche to acquire Illumina. Since Illumina turned down Roche’s original bid in January, continued interest from Roche has been reported several times, but the L’Agefi report also mentions concrete figures. According to the newspaper, the acquisition might take place at $66 per share, valuing the deal at about $8.14 billion in total.

The offer is $15 per share higher than the previous offer of $51 in April last year. According to the analyst David Ferreiro of Oppenheimer, the new bid is definitely at a level that might lead to a final deal.

With Roche holding only about 9% of the NGS market, and next generation sequencing most likely becoming an important clinical diagnostic tool in the coming years, Roche’s strategic focus must be to gain better access to the NGS market and to take NGS into clinical practice. Acquiring the NGS market leader Illumina would be an optimal starting point.

We’ll see within the next two weeks whether the rumors are built on a solid foundation: L’Agefi reported that the announcement might come during the first half of January.

Hybrid De Novo Genome Assemblies

What do you expect from a bacterial or fungal de novo genome sequencing project?

Typical answers we get from our customers:

  • Data that are easy to work with
  • Data suitable for high quality annotation
  • Resolution of structural rearrangements
  • High consensus accuracy
  • High cost-efficiency

All these requirements can be met by combining Roche GS FLX++ and Illumina data. The long Roche FLX++ reads of up to 1100 bp give much longer contigs than Illumina reads alone. For scaffolding, and to be able to resolve structural rearrangements, we sequence shotgun (SG) and long jumping distance (LJD) libraries with Illumina technology. Adding Illumina reads keeps the overall costs at a reasonable level. Furthermore, the Illumina reads correct the Roche sequencing errors at homopolymer sites and therefore enable us to build a consensus sequence with high accuracy.

The superiority of such a hybrid assembly quickly becomes apparent when looking at the following results from one of our proof-of-concept studies. In this de novo project, we sequenced a fungal genome of about 30 Mbp and approx. 57% GC content. Using the hybrid strategy we obtained only 10 chromosome-sized scaffolds (see figure below) with lengths of up to 8.3 Mbp. Remarkably, these 10 scaffolds represent nearly all of the genetic information present: they make up 99.6% of all scaffold sequence information.
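A figure such as “the 10 largest scaffolds make up 99.6% of all scaffold sequence” is easy to reproduce from a list of scaffold lengths. The following is a minimal sketch of that calculation together with the commonly used N50 value. The scaffold lengths in the example are made up for illustration and are not our project data; the function names are hypothetical as well.

```python
def top_fraction(lengths, n):
    """Fraction of the total sequence contained in the n longest scaffolds."""
    ordered = sorted(lengths, reverse=True)
    return sum(ordered[:n]) / sum(ordered)


def n50(lengths):
    """Length L such that scaffolds of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


if __name__ == "__main__":
    # Hypothetical assembly: 10 large scaffolds plus a few small leftovers (bp).
    example_lengths = [
        8_300_000, 6_000_000, 4_300_000, 2_800_000, 2_400_000,
        2_100_000, 1_500_000, 1_200_000, 900_000, 500_000,
        40_000, 30_000, 25_000, 15_000, 10_000,
    ]
    print(f"top 10 fraction: {top_fraction(example_lengths, 10):.4f}")
    print(f"N50: {n50(example_lengths)} bp")
```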

Such results make data handling easy and are an excellent starting point for annotation and for studying gene content and rearrangements.

Sequencing strategy: SG library with FLX++ (approx. 10-fold coverage); SG and LJD libraries (3 kbp, 8 kbp and 20 kbp) on Illumina HiSeq 2000 with the 2x 100 bp module.
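As a rough planning aid, the amount of sequence behind “approx. 10-fold coverage” can be estimated as genome size times coverage, and the number of reads follows from the mean read length. This is only a sketch: the mean read length of 700 bp used in the example is an assumption for illustration, not a figure from our runs, and the function names are hypothetical.

```python
def required_bases(genome_size_bp: int, coverage: int) -> int:
    """Total number of sequenced bases needed for a given fold coverage."""
    return genome_size_bp * coverage


def reads_needed(genome_size_bp: int, coverage: int, mean_read_length_bp: int) -> int:
    """Number of reads needed, rounded up to the next whole read."""
    bases = required_bases(genome_size_bp, coverage)
    return -(-bases // mean_read_length_bp)  # integer ceiling division


if __name__ == "__main__":
    genome = 30_000_000  # fungal genome of about 30 Mbp, as in the study above
    print(required_bases(genome, 10), "bases for 10-fold coverage")
    # 700 bp mean read length is an assumed value for illustration only.
    print(reads_needed(genome, 10, 700), "reads at an assumed 700 bp mean length")
```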

 

Tip: Inside The Wellcome Trust Sanger Institute

Do you know the blog of the Wellcome Trust Sanger Institute?

The Wellcome Trust Sanger Institute is one of the leading genomic research centres in Europe and was a leading contributor to the Human Genome Project. In their blog they discuss the role of genetics in health and disease as studied with the latest genomic and genetic techniques.

Read more at http://sangerinstitute.wordpress.com/

Seasonal Greetings

 

Adventitious Virus Testing Via Next Generation Sequencing

Adventitious viruses are a major safety concern in biological products. For a substance to be considered “free” of an adventitious agent, assays must demonstrate that a defined quantity of the biological product is negative for an agent at a defined level of sensitivity. In vivo animal testing, in vitro cell culture testing, transmission electron microscopy and molecular assays like quantitative PCR (qPCR) are the current gold standards for viral safety testing. However, if the cell substrate contains, for example, potential contaminating agents from a tumor-derived cell line, the current standard methods need to be supplemented with novel technologies.

Deep sequencing approaches based on next generation sequencing (NGS) techniques may be the method of choice. They allow the detection not only of known viruses but also of unknown viruses or viral subspecies at the detection limit of qPCR-based methods. At the Pathogen Safety Summit (Munich, Germany, November 27-28, 2012), the application of NGS testing approaches was introduced and intensely discussed. The use of NGS in routine testing of production cell banks is presently being evaluated by several companies producing biologicals and vaccines.

Currently, NGS is used for the initial characterisation of cell banks, but it is expected that this new technology will soon become a standard method for adventitious agent testing. There are still challenges to overcome with regard to bioinformatic analyses as well as the speed of technological development. Furthermore, the biological relevance of the NGS data also needs to be confirmed. Here the expectation is that this problem can be overcome once active viral particles can be purified and subjected to NGS analysis.

Btw: Eurofins Medigenomix offers the detection of adventitious viruses in biologicals and biotechnological products by next generation sequencing on platforms from Illumina and Roche 454.

Summary from 4th Next Generation Sequencing Congress 2012

Having attended the 4th NGS Congress 2012 at London Heathrow, I can share some interesting new facts and information about the latest NGS stories.

First of all, let’s talk about “long read technology”. The Roche 454 talk was given by Todd Arnold, Vice President R&D, Roche 454. For the Roche GS Junior, a new software version 2.7 with “improved well resolution results in better quality, more robust sequencing runs” is now available. We can confirm these improved data outputs, as we have been running this update on our own Junior platform for a while. Depending on the nature of your samples, a good part of all reads will be longer than 400 bp, reaching up to 450-480 bp (still using the Titanium chemistry). But the FLX+ technology is NOT available and also NOT planned for the GS Junior. When asked why, no concrete details or upgrade plans for the GS Junior could be given at the London congress…

The real highlight from Roche 454 was the description of what we now call “FLX++” sequencing. With a software update (2.8) now available for all GS FLX systems, together with the “pimped” chemistry kits, Roche 454 is offering real “1000 bp” Sanger-like reads (as initially aimed for at launch). Some data and slides were shown that demonstrate these new, longer read lengths and also higher data outputs (figure 1). Altogether this adds up to almost 1 Gb of sequencing data per full PTP run.

Fig 1: Todd Arnold Roche 454 Data Heathrow 2012

As one of the early access users of the FLX++ upgrades and software version 2.8, we can confirm that the new data outputs are excellent (again depending on the quality of the DNA); in fact, one can reach even better results than those shown by Roche at the 4th NGS Congress at London Heathrow. Here is an example:

Fig 2: Eurofins MWG Operon data with Roche GS FLX++

Of course one may argue “that’s nothing compared to Illumina data outputs”, and in terms of pure data volume you are right! But the focus here is on long-read applications such as de novo sequencing and assembly. For this kind of NGS application, a modal read length of 800-950 bp or above improves the final results tremendously. Don’t believe it? We can share some nice new data that we delivered for a fungal de novo sequencing project (figure 2). We were able to deliver chromosome-size scaffolds of 8.3 Mb, 6.0 Mb, 4.3 Mb, 2.8 Mb, 2.4 Mb, 2.1 Mb, … when using long-read FLX++ backbone sequencing at only 8x-12x coverage and combining these data with short-read LJD sequencing on the HiSeq at 2x 100 bp. The complete data set missed only about 0.5% of all genetic information, and the remaining average gap length was about 240 bp. We are now very interested to learn how the 2x 250 bp read length on the MiSeq will further improve these excellent results: one-shot genome sequencing at its best.

Interested in this kind of project data? Learn more about our fascinating de novo sequencing & assembly results at our next NGS roadshow in 2013, or send me an email to discuss this topic further…

How to benefit from our superior LJDs on the MiSeq

With the update of our MiSeq system to 250 bp reads, genome sequencing on this system becomes even more important. But long reads and huge data output are not the only prerequisites for a great de novo assembly result.

What is missing?

Paired-end libraries that span gaps and repetitive structures can improve de novo genome assemblies tremendously. Our proprietary long jumping distance libraries (LJDs) are perfectly suited for scaffolding on Illumina sequencing devices. In contrast to other paired-end libraries (like the Illumina mate pair library), our LJD library preparation involves an adaptor-guided ligation of the genomic fragments. This different preparation protocol offers the following advantages:

  • No hybrid reads – a unique sequence identifies the crossover points
  • No shotgun pairs – less than 1% of all LJD reads are shotgun paired-end reads
  • Distinct insert sizes – we prepare LJDs with 3, 8, 20 or even 40 kbp insert size
  • Span large repeats – large and complex repeats up to 40 kbp can be resolved

Mapped reads: All reads from a 3 kbp LJD library (grey) are aligned to a reference sequence. Two LJD read pairs are highlighted (blue + black) and their measured insert size is 3107 bp and 3002 bp respectively.
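The measured insert size of a mapped read pair is simply the distance spanned by the two reads on the reference. Here is a minimal sketch of that calculation; the coordinates are invented so that they reproduce the two insert sizes quoted in the caption, and the `MappedRead` type and function name are hypothetical, not part of any mapping tool.

```python
from typing import NamedTuple


class MappedRead(NamedTuple):
    """Alignment of one read on the reference (1-based, inclusive coordinates)."""
    start: int
    end: int


def measured_insert_size(read1: MappedRead, read2: MappedRead) -> int:
    """Outer distance from the leftmost to the rightmost aligned base of a pair."""
    leftmost = min(read1.start, read2.start)
    rightmost = max(read1.end, read2.end)
    return rightmost - leftmost + 1


if __name__ == "__main__":
    # Invented coordinates for two 100 bp read pairs from a 3 kbp LJD library.
    pair_a = (MappedRead(10_000, 10_099), MappedRead(13_007, 13_106))
    pair_b = (MappedRead(20_000, 20_099), MappedRead(22_902, 23_001))
    print(measured_insert_size(*pair_a))
    print(measured_insert_size(*pair_b))
```

Collecting this value over all pairs of a library gives the insert size distribution, which is what a scaffolder needs to estimate gap sizes between contigs.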

 

Why should I combine MiSeq long reads and LJDs?

The new features of the MiSeq (250 bp reads; data output up to 8 Gbp) make it possible to sequence shotgun and LJD libraries together, cost-efficiently, in one run. The MiSeq output is sufficient to sequence several bacterial genomes or single fungal genomes (up to 60 Mbp) with appropriate coverage.

  • Longer reads – more sequence information to correctly map the reads onto your contigs
  • Short delivery time – due to the shorter run time compared to the HiSeq 2000
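How many genomes fit into one run depends on the run output, the genome size and the coverage you are aiming for. The sketch below shows the arithmetic. The 5 Mbp bacterial genome and the 100-fold target coverage are assumptions chosen for illustration, the whole run output is treated as usable sequence, and the function names are hypothetical.

```python
def achievable_coverage(run_output_bp: int, genome_size_bp: int) -> float:
    """Fold coverage if the whole run output goes into a single genome."""
    return run_output_bp / genome_size_bp


def genomes_per_run(run_output_bp: int, genome_size_bp: int, target_coverage: int) -> int:
    """Number of genomes of this size that fit into one run at the target coverage."""
    return run_output_bp // (genome_size_bp * target_coverage)


if __name__ == "__main__":
    miseq_output = 8_000_000_000  # up to 8 Gbp per run, as stated above
    print(f"{achievable_coverage(miseq_output, 60_000_000):.1f}x for a 60 Mbp fungal genome")
    # Assumed example: 5 Mbp bacterial genomes at 100-fold coverage each.
    print(genomes_per_run(miseq_output, 5_000_000, 100), "bacterial genomes per run")
```

In practice the output is shared between shotgun and LJD libraries and some reads are lost to filtering, so real numbers will be somewhat lower.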

Read more about our long jumping distance libraries on our website

150 bp, 250 bp and next year 300 bp:
Illumina keeps the competition on the go

Illumina is currently in the midst of rolling out its MiSeq sequencer updates. The software update, the new flowcells and the new sequencing chemistry enable runs with outputs of around 8 Gbp and a read length of 250 bp. The first updates reached Europe just recently, and our own MiSeq received the update only a few days ago.

That’s not the end of the story for Illumina. Just a week ago, they announced the next update. In the second half of 2013, Illumina is planning to offer another MiSeq update that will increase the output to 15 Gbp. They plan to achieve this tremendous output for a benchtop device by increasing the read length to 300 bp and resolving about 25 million clusters on the flowcell.
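The 15 Gbp figure follows directly from the numbers given: each cluster is read from both ends in a paired-end run, so the output is clusters times read length times two. A minimal sketch of this arithmetic follows. Treating the current 8 Gbp output as a 2x 250 bp paired-end run is my assumption, and the function names are hypothetical.

```python
def run_output_bp(clusters: int, read_length_bp: int, reads_per_cluster: int = 2) -> int:
    """Raw run output in bases; reads_per_cluster=2 means paired-end sequencing."""
    return clusters * read_length_bp * reads_per_cluster


def implied_clusters(output_bp: int, read_length_bp: int, reads_per_cluster: int = 2) -> int:
    """Number of clusters implied by a given output and read length."""
    return output_bp // (read_length_bp * reads_per_cluster)


if __name__ == "__main__":
    # Announced update: about 25 million clusters with 2x 300 bp reads.
    print(run_output_bp(25_000_000, 300) / 1e9, "Gbp")
    # Current update: 8 Gbp, assumed here to be a 2x 250 bp paired-end run.
    print(implied_clusters(8_000_000_000, 250), "clusters")
```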

Considering the intense competition with Life Tech’s Ion Proton and Ion Torrent sequencers, Illumina needs to steadily improve the specs of its sequencing devices. In March, Life Tech plans to increase the output of the Proton sequencer to around 36 Gbp. That’s still more than the new MiSeq upgrade can deliver, but one also has to consider the difference in read length. While the MiSeq will be able to produce 300 bp reads soon thereafter, the Ion Proton generates reads of 100 to 150 bp. The difference is even more remarkable when sequencing on the MiSeq is performed with the paired-end module, an approach that is not possible with Life Tech’s devices. With library insert sizes of around 450-500 bp, the two overlapping reads can be merged into a single consensus read of about that size.
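Whether two paired-end reads overlap depends only on the read length and the insert size, and overlapping reads can be merged into one longer read. The following is a minimal sketch of both steps. It uses exact string matching for the overlap, which is a deliberate simplification: real merging tools tolerate mismatches and use base qualities. All function names are hypothetical.

```python
def expected_overlap(read_length_bp: int, insert_size_bp: int) -> int:
    """Number of bases covered by both reads of a pair (0 if they do not overlap)."""
    return max(0, 2 * read_length_bp - insert_size_bp)


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def merge_pair(read1: str, read2: str, min_overlap: int = 10):
    """Merge a read pair via the longest exact overlap, or return None.

    read2 is given as sequenced (from the opposite strand), so it is
    reverse-complemented before looking for the overlap.
    """
    mate = reverse_complement(read2)
    for size in range(min(len(read1), len(mate)), min_overlap - 1, -1):
        if read1[-size:] == mate[:size]:
            return read1 + mate[size:]
    return None


if __name__ == "__main__":
    for read_length in (250, 300):
        for insert in (450, 500):
            overlap = expected_overlap(read_length, insert)
            print(f"2x {read_length} bp, {insert} bp insert: {overlap} bp overlap")

    # Toy example: a 30 bp fragment sequenced with 20 bp reads from both ends.
    fragment = "ACGTTGCAAGGCTTACGGATCCATGCAAGT"
    forward = fragment[:20]
    reverse = reverse_complement(fragment[-20:])
    print(merge_pair(forward, reverse) == fragment)
```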

In my opinion, the Illumina MiSeq is at the forefront of the race, and if Illumina’s plan works out, it will stay there in 2013, too. But we all know how fast-moving the NGS market is. So let’s see what’s coming!