Tag Archives: PacBio RS

Summary from 4th Next Generation Sequencing Congress 2012 – Part 2

Dear all,

Here is my second summary from the 4th NGS Congress, held at London Heathrow at the end of 2012. It brings you some (hopefully) interesting new facts about sequencing with the PacBio RS, the second long-read technology currently on the market and the only system delivering reads longer than 10,000 bp…

Kevin Corcoran, Senior Vice President at Pacific Biosciences, gave an interesting talk about the most recent developments for the PacBio RS system. He also showed detailed road maps of future aims and plans. One important point is the launch of the new “XL Chemistry”, while the “C2 Chemistry” can still be used as well. The other very interesting story is “Stage Start”, a new feature that starts sequencing detection in parallel for all molecules, similar to the well-known “hot start” technology for PCR. As a result, detection starts from a defined position for most library molecules rather than somewhere in the middle. Last but not least, I’m very keen to learn how the future “Photo Protected DNA Polymerases” will develop, an idea that is really very, very next-next-generation…

First of all, applying the “XL Chemistry” looks really interesting, also with regard to the de novo sequencing and assembly focus of Eurofins MWG Operon. This new feature of the PacBio RS may also open doors to other types of applications, while at the same time reducing the general need for extremely high coverage.

Currently the “C2 Chemistry” is on the machine, and a 90 min movie delivers about 20,000 to 50,000 reads and a data output of 30-50 Mb; higher yields are of course possible for “ideal” DNA samples. The average read length is about 3,000 bp (!), while the 95th percentile is about 8,000 bp. With the new “XL Chemistry” we got an average yield of about 40,000 reads per SMRT cell with an average read length of about 4,000 bp (+30%). Overall, we are very pleased with these first results, especially since we see good potential to further increase data yields with the new software pipeline introduced in parallel (Hierarchical Genome Assembly Process and Quiver).

See picture 1: page 16 of Kevin Corcoran's Pacific Biosciences slides.

It is also important to mention two different ways of using the XL Chemistry. 1) “XL chemistry for polymerase binding”, but “C2 chemistry for sequencing”. This allows for longer reads at the same quality (currently the single-read error rate is still 10% to 20%, on average maybe 15%). 2) “XL chemistry for polymerase binding” AND “XL chemistry for sequencing”. This can yield even longer reads, but unfortunately the error rate also increases by a few percent. This method is therefore recommended especially for de novo assembly or genome finishing.

See picture 2: page 52 of Kevin Corcoran's Pacific Biosciences slides.

Finally, one real “next-next-gen” highlight was the presentation of a development at Pacific Biosciences that aims to protect the polymerase enzyme from being killed by the energy of the laser. A picture shows how this should work in principle: by putting a kind of sun-blocker in place that shields the enzyme from the laser light. This story was really fascinating for me, and I hope to see more than the very promising first data in the future …

See picture 3: page 56 of Kevin Corcoran's Pacific Biosciences slides.

So overall, Pacific Biosciences keeps moving very fast in 2013 as well, and it will be very nice to see how all these improvements and new features improve the overall data quality of this fascinating very-long-read technology, which today already offers real single reads longer than 10,000 bp.

Cheers now and see you on our next BLOG,
Axel

Further Improvements of PacBio Technology

Recently, we reported on the study of the Broad Institute showing that the PacBio RS system was able to outdo MiSeq sequencing in the validation of SNP calls. Now Pacific Biosciences has taken another important step to improve its product.

Pacific Biosciences has now launched a new sample loading device for the PacBio RS, called the MagBead Station. As Michael Hunkapiller, Ph.D., President and Chief Executive Officer of Pacific Biosciences, said in the company’s press release, they expect that with the new device customers will “be able to generate 10 kilobase-sized libraries using as little as one microgram of sample, a five to 10-fold improvement from where we were just a few months ago”. Also, because the new process is more robust, they expect sequencing results to be more consistent overall, making it possible to run experiments on challenging samples as well.

The first experiences of early-access customers seem to support these expectations:

As Patrick Hurban of Expression Analysis told InSequence, the new loading device allowed them to recover sequences even for “difficult” samples: “we’re much more confident on a sample-by-sample basis that we will be able to get good sequence”, he said. They could also confirm that the amount of library that needs to be loaded is now significantly lower. The new loading process also seems to favor longer DNA fragments over shorter ones, excluding short contaminating DNA fragments, which results in a greater percentage of long reads in a run. In addition, the loading process now seems to work as efficiently for large-insert libraries as it does for smaller-insert libraries.

With the new loading device, about 50-60% of the ZMWs (zero-mode waveguides) are now active after loading. This is a great improvement over the 30-45% of active ZMWs before the upgrade.

When PacBio entered the market, I was impressed by the sophisticated new technology. However, the results of the first projects were rather disappointing. The new loading device now seems to greatly improve the sample loading step. The high error rates, about 15% for the time being, remain a challenge, and PacBio will need to solve this issue if it wants to be successful on the market in the long run. Still, it seems that PacBio is gradually overcoming its “teething troubles”.

Comparison of NGS technologies – just a waste of time?

As already mentioned in our latest blog post, Michael Quail and his team from the Sanger Institute published a comparison of the Ion Torrent PGM, the PacBio RS system and the Illumina MiSeq (BMC Genomics). Neither this study nor the others performed recently could determine one clear winner, as each system has its own advantages.

What is really interesting now are the statements of the companies’ spokespersons in a recent article by Julia Karow in GenomeWeb. They all agree on one thing: the data collected in the publication were accurate in 2011 but are outdated by now, since a lot of effort is put into innovation. Every instrument performs a lot better now. So what is our conclusion? That comparisons of NGS technologies are just a waste of time? For the Sanger Institute, it means that they invested in three new MiSeqs since the Illumina pipeline is already available. For me, these comparisons are also valuable for all other institutes. Although maybe outdated, they highlight the strengths and weaknesses of each technology and help to decide where to invest thousands of dollars. What do you think?

PacBio RS Data to Validate SNPs Called from Illumina Sequencing?

Would you have thought that PacBio RS sequences, with a single-read error rate of about 15%, can outdo MiSeq reads in validating variants called by WGS or exome sequencing? Personally, I wouldn’t have thought so. But the study of the Broad Institute published a few days ago clearly shows that they can.

Variants called in projects that aim at variant analysis definitely need validation, both to determine the rate at which the mutations have been correctly called and to confirm the specific reported changes. Currently used techniques like Sequenom genotyping and Sanger sequencing have substantial drawbacks, such as the need for manual interpretation or low data throughput. For that reason, Carneiro and his colleagues studied the power of PacBio RS and MiSeq data as validation tools and compared the results.

They generated amplicons covering 98 variants called in the 1000 Genomes Project and sequenced the PCR products with both instruments, PacBio RS and MiSeq. Using the PacBio RS data, 96 of the 98 variants could be correctly genotyped, whereas the MiSeq correctly genotyped only 93 sites. The authors’ explanation is quite simple: because errors are distributed completely randomly across the reads, sufficient coverage can overcome the low accuracy of individual reads.
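The coverage argument can be made concrete with a small calculation. The sketch below is only an illustration and not the method used in the study: it assumes that every read is wrong at a given site independently with the same probability, and that a site is called correctly whenever a strict majority of reads shows the right base. The function name `majority_vote_accuracy` is made up for this example.

```python
from math import comb


def majority_vote_accuracy(coverage: int, error_rate: float) -> float:
    """Probability that a strict majority of `coverage` reads is correct.

    Simplifying assumption (not taken from the study): each read is wrong
    at the site independently with probability `error_rate`.
    """
    p_correct = 1.0 - error_rate
    needed = coverage // 2 + 1  # smallest number of reads forming a strict majority
    return sum(
        comb(coverage, k) * p_correct**k * error_rate ** (coverage - k)
        for k in range(needed, coverage + 1)
    )


# Per-site accuracy at a 15% single-read error rate for increasing coverage.
for cov in (1, 3, 15, 31):
    print(cov, majority_vote_accuracy(cov, 0.15))
```

Under these assumptions the per-site accuracy rises quickly with coverage, which is the intuition behind the authors' explanation. Errors that are not random (systematic errors) would not average out in this way.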

Manual checking of the sites that were miscalled in the PacBio dataset revealed that one of the two miscalls was due to a reference bias (the true variation is hidden). Such a bias is introduced by alignment parameters in which the gap open penalty is higher than the base mismatch penalty. The high error rate of PacBio RS reads makes these parameters necessary.

However, Carneiro told GenomeWeb that the researchers are now using a different aligner that was developed at the Broad Institute. This aligner re-aligns the reads using different parameters and thereby reduces the problem to a great extent.
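To illustrate how the reference bias described above can arise, here is a toy sketch. It is not the Broad Institute's aligner and uses made-up sequences and penalty values: a read carries a true one-base deletion close to its end, and we compare the cost of an alignment that opens a gap with the cost of an ungapped alignment that explains the difference as a mismatch.

```python
def ungapped_cost(read: str, ref: str, mismatch: int) -> int:
    """Cost of aligning the read to the start of ref without any gap."""
    return mismatch * sum(a != b for a, b in zip(read, ref))


def single_deletion_cost(read: str, ref: str, mismatch: int, gap_open: int) -> int:
    """Lowest cost over all alignments that skip exactly one reference base."""
    costs = []
    for i in range(len(read) + 1):
        ref_without_base = ref[:i] + ref[i + 1:]  # reference with base i deleted
        costs.append(gap_open + ungapped_cost(read, ref_without_base, mismatch))
    return min(costs)


def preferred_alignment(read: str, ref: str, mismatch: int, gap_open: int) -> str:
    """Return which of the two toy alignments has the lower cost."""
    gapped = single_deletion_cost(read, ref, mismatch, gap_open)
    ungapped = ungapped_cost(read, ref, mismatch)
    return "gapped" if gapped < ungapped else "ungapped"


# Hypothetical example: the sample lacks the 'C' at reference index 10.
REF = "ACGTACGTTGCA"
READ = "ACGTACGTTGA"

print(preferred_alignment(READ, REF, mismatch=4, gap_open=3))  # deletion is found
print(preferred_alignment(READ, REF, mismatch=4, gap_open=6))  # deletion is hidden
```

With a gap open penalty above the mismatch penalty, the cheaper explanation is a single mismatch against the reference, so the true deletion is not reported. Re-aligning with different parameters, as mentioned above, changes which alignment wins.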

For me, the study shows that there is potential for PacBio RS sequencing. Nevertheless, like the variants, this study result also needs to be validated. Furthermore, I think the value of the study also needs to be seen in relation to the sequencing cost of both instruments. While the consumable prices for both techniques are in a similar range, the several-fold higher cost of the PacBio RS instrument makes a remarkable difference.

Sequencing Performance versus Marketing Performance

Recently, a number of groups have attempted to compare the two platforms PGM and MiSeq, including the Sanger Institute, a group from the University of Birmingham, and BGI. None of these studies has conclusively named a winner, and each group comes to slightly different conclusions.

GenomeWeb’s blog “The Daily Scan” discusses the different findings of the three comparison studies at length. On the one hand, different chemistries or older versions are compared with newer ones; on the other hand, different applications require different technologies.

In a report, Jon Groberg of Macquarie Equities Research cites several factors behind the better sales of Life Tech’s PGM compared with Illumina’s MiSeq (1,300 vs. 700 systems sold): price (the PGM sells for $75,000, while the MiSeq goes for $125,000); Life’s more extensive commercial reach; a greater trajectory of improvement for the PGM than for the MiSeq; and the PGM’s strength in certain key applications.

Of note are the differences in sequencing cost, based on list prices (see Sanger Institute study). The MiSeq came out cheapest, at $502 per gigabase, followed by the PGM, at $1,000 per gigabase using the Ion 318 chip, and the PacBio, at $2,000 per gigabase. All three platforms produce data at a greater cost than the Illumina GAIIx, at $148 per gigabase, and the HiSeq 2000, at $41 per gigabase.
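To put these list prices into perspective, the short sketch below expresses each platform's cost per gigabase as a multiple of the cheapest one. The numbers are the ones quoted above; the dictionary and function names are only chosen for this example.

```python
# Cost per gigabase in US dollars, as quoted above (list prices).
COST_PER_GB_USD = {
    "MiSeq": 502,
    "PGM (Ion 318 chip)": 1000,
    "PacBio RS": 2000,
    "GAIIx": 148,
    "HiSeq 2000": 41,
}


def fold_over_cheapest(costs: dict) -> dict:
    """Express every cost as a multiple of the lowest cost in the table."""
    cheapest = min(costs.values())
    return {name: cost / cheapest for name, cost in costs.items()}


# Print the platforms from cheapest to most expensive per gigabase.
for name, fold in sorted(fold_over_cheapest(COST_PER_GB_USD).items(), key=lambda kv: kv[1]):
    print(f"{name}: {fold:.1f}x")
```

On these figures, all three benchtop or long-read instruments are more than ten times as expensive per gigabase as the HiSeq 2000, so the choice between them is driven by turnaround time, read length and application rather than by cost per base alone.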

What is your experience with the two systems?

Base Modification Detection with Pacific BioSciences

After the launch of the new C2 chemistry for PacBio RS sequencing with longer read lengths, things had been quiet around Pacific Biosciences for a while. However, a few days ago the company attracted attention again by launching new analysis software that indicates base modifications in the sequencing data. And from what I hear and read about these techniques, the epigenetics market could really become a great success story for Pacific Biosciences.

Because PacBio’s SMRT sequencing observes DNA polymerization in real time, it makes it possible not only to decode the sequence but also to study the kinetics of the process. The kinetics of base incorporation change in a characteristic way when a modified base is present in the template strand and can therefore be used to distinguish between different base modifications. Different modifications result in different signatures (or fingerprints) that vary in signal magnitude and in the length of the region over which the kinetics are altered.

I think that studying base modifications with the PacBio RS has several advantages over experiments like methyl-Seq or bisulfite sequencing. On the one hand, PacBio RS sequencing is a direct detection method, where no enzymatic restriction or bisulfite conversion has to be applied upfront. On the other hand, and this is the most important advantage for me, the PacBio RS system makes it possible to distinguish a wide spectrum of base modifications, which has not been possible so far.

Unfortunately, the recently launched software is not yet able to distinguish the different types of modifications; it only flags positions where modifications are present. However, the company has shown proof-of-principle data and has already stated that the information needed to discriminate between modifications will be incorporated into future releases of the software. Moreover, the company provides a Technical Note on its motif identification tool for bacterial methylomes.
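To make the idea of flagging positions from kinetic data more tangible, here is a toy sketch. It is not PacBio's software or algorithm: it simply assumes that we have one kinetic value per template position (for example a mean time between incorporations) for the native sample and for an unmodified control, and it flags positions where the sample is slower by at least a chosen factor. The function name, the example values and the threshold of 2.0 are all assumptions made for illustration.

```python
def flag_modified_positions(sample_kinetics, control_kinetics, min_ratio=2.0):
    """Return positions where the sample/control kinetic ratio reaches min_ratio.

    Both inputs are hypothetical per-position kinetic values; the threshold
    is an arbitrary illustrative choice, not a recommended setting.
    """
    flagged = []
    for position, (sample, control) in enumerate(zip(sample_kinetics, control_kinetics)):
        if control > 0 and sample / control >= min_ratio:
            flagged.append(position)
    return flagged


# Made-up values: only position 2 is clearly slower in the native sample.
native = [0.5, 0.6, 2.4, 0.5]
control = [0.5, 0.5, 0.6, 0.5]
print(flag_modified_positions(native, control))
```

A sketch like this only says that something is different at a position. Telling the modification types apart would additionally require looking at the magnitude of the change and at the length of the affected region, as described above.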

Which sequencing strategy do you use for scaffolding of contigs?

In our latest poll, which started in mid-November 2011, we asked about your sequencing strategies for scaffolding projects. 29 ngs-expert.com readers submitted their votes.

39% of all votes agree with my own opinion that LPE and LJD libraries are the preferred method for scaffolding of contigs. Long distances of up to 40 kbp can be bridged easily and efficiently.

Despite that, it is also obvious that all other techniques are still used for scaffolding projects. I am still interested to see whether this will change with the new C2 chemistry for PacBio RS that has been announced for Q1.

Will PacBio RS Sequencing Enter Your Research the Next Year?

The poll ran for 1.5 months.
32 ngs-expert.com readers submitted their votes.

41% will stick to “classical” technologies for the near future.
22% have no idea yet.
19% are planning to buy the PacBio RS sequencer.

Just click the image to see all results.


Future of Strobe Sequencing with PacBio RS

Strobe sequencing is one of the three sequencing protocols of the PacBio RS, and one that researchers have been awaiting for some time. By switching data acquisition on and off, the machine sequences stretches of DNA in bursts. “On” periods generate the so-called strobe subreads, while “off” periods determine the distance between individual subreads. The most important application of strobe sequencing is the improvement of de novo assemblies by scaffolding contigs.
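The on/off principle can be sketched in a few lines. The example below is only an illustration with made-up numbers: it assumes the on and off periods have already been converted into base pairs travelled by the polymerase (in reality the distance covered during an off period can only be estimated), and the function name `strobe_layout` is hypothetical.

```python
def strobe_layout(schedule):
    """Turn an on/off schedule into subread intervals and the gaps between them.

    `schedule` is a list of (state, length_bp) tuples with state "on" or "off".
    Returns (subreads, gaps): subreads as (start, end) template coordinates,
    gaps as the unsequenced distance between consecutive subreads.
    """
    subreads, gaps = [], []
    position = 0      # current position of the polymerase on the template
    pending_gap = 0   # bases passed with acquisition switched off
    for state, length_bp in schedule:
        if state == "on":
            if subreads:
                gaps.append(pending_gap)
            subreads.append((position, position + length_bp))
            pending_gap = 0
        elif state == "off":
            pending_gap += length_bp
        else:
            raise ValueError(f"unknown state: {state!r}")
        position += length_bp
    return subreads, gaps


# Made-up schedule: three subreads separated by two dark stretches.
subreads, gaps = strobe_layout(
    [("on", 300), ("off", 1000), ("on", 300), ("off", 1500), ("on", 250)]
)
print(subreads)
print(gaps)
```

For scaffolding, the useful information is exactly this pairing: subreads that can be placed on contigs, plus an approximate distance between them.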

However, PacBio’s CEO Hugh Martin recently announced that the company will no longer focus on further development of this technology. As the PacBio RS chemistry upgrade planned for Q4 will deliver an average read length of 2,700 bp, with 5% of reads longer than 5,100 bp, strobe sequencing is becoming more and more obsolete.

In practice, the performance of the protocol is still a challenge, as early-access customers reported in GenomeWeb in May. This is in line with our experience. First, sequencing generates only a few thousand reads per SMRT cell, and second, many reads have only one or two subreads.