Tag Archives: Pacific Biosciences

PacBio Forecast 2015

As already predicted, it is not only Illumina that communicates innovations for its NGS portfolio. Here you can read about the improvements Pacific Biosciences plans for this year. I think the good news for many users of PacBio machines is that they are not talking about new instruments, but about improvements that affect already installed machines (GenomeWeb):

  • PacBio plans to improve the sequencing chemistry, including the active loading of single polymerase enzymes onto the chip
  • PacBio plans to improve the workflows for an easier and faster handling of samples
  • PacBio plans to improve bioinformatics for faster de novo genome assemblies & better full-length HLA analysis

With these changes, PacBio wants to extend the data output to more than 4 gigabases per SMRT cell and increase the average read length to 15-20 kbp.

Read more about it here.

I still wonder whether there will be news from PacBio this year about a new system. Maybe a benchtop machine, like everyone else has?

I will keep you updated!

PacBio launches new chemistry and software

In a press release, Pacific Biosciences announced the latest enhancement for the PacBio RS II single-molecule DNA sequencer. The new polymerase 6 and chemistry 4 (P6-C4) combination, together with improved software, enhances the performance and output of the platform by 45%. The average read length is now 10,000-15,000 bases, with the longest reads reaching up to 40,000 bases. Depending on the nature of the DNA, a single SMRT cell will deliver 500 million to 1 billion bases.
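These throughput figures also imply a rough number of reads per SMRT cell. A quick back-of-the-envelope sketch (the input numbers are from the press release above; the calculation itself is my own):

```python
# Implied reads per SMRT cell from the P6-C4 figures above.
avg_read_len = 12_500                 # midpoint of the 10,000-15,000 base average
yield_low, yield_high = 500e6, 1e9    # 500 Mb to 1 Gb per SMRT cell

reads_low = yield_low / avg_read_len
reads_high = yield_high / avg_read_len
print(f"~{reads_low:,.0f} to {reads_high:,.0f} reads per SMRT cell")
# → roughly 40,000 to 80,000 reads per cell
```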

The new chemistry will replace the current P5 – C3 chemistry and is recommended for all SMRT sequencing applications.

This new release also includes improvements to the SMRT Analysis software suite for long amplicon analysis and the Iso-Seq™ method. Together with chemistry enhancements, these advances boost accuracy, speed up analysis, and support sequencing of multiplexed amplicons of different sizes.

First Oxford Nanopore MinION data available: Is this the end of PacBio?

Researchers from the University of Birmingham in the UK last week publicly released data they generated with Oxford Nanopore Technologies’ MinION nanopore sequencer, the first group to do so since the company started its early access program this spring (see In Sequence report).

The sequence, a single 8.5-kilobase read derived from a Pseudomonas aeruginosa genome, was posted by Nick Loman from the Institute of Microbiology and Infection at the University of Birmingham. The read was sufficient to identify the serotype as O6. The sequence can be found here. It is of low quality, with 71% identity over the spanned region.

Konrad Paszkiewicz, director of the Wellcome Trust Biomedical Informatics Hub and head of the sequencing service at Exeter, has been writing about the group’s experience on the Exeter Sequencing Service’s blog. “Even at this stage, this platform has the potential to steal large chunks out of the market from the likes of PacBio,” Paszkiewicz said.

We will have to wait for more data before we can see how useful the technique will be and how it will compete against other nanopore sequencers, e.g. the device from Genia, which was recently acquired by Roche.

Improvement of PacBio ZMW loading procedure by DNA Origami?

Since the launch of the PacBio system in 2011, there has been a constant development and improvement of the methods involved (e.g. former posts here).

However, efficient loading of the Zero-Mode Waveguides (ZMWs) with polymerase molecules still remains a challenge. The ZMWs are tiny wells in which the actual sequencing reactions take place. Each SMRT cell contains 150,000 ZMWs, but with current methods only about one third of them are actually usable after loading. The polymerase molecules are loaded onto the ZMWs by simple diffusion, resulting in ZMWs that carry one, more than one, or no polymerase molecule. As a consequence, each SMRT cell typically generates only approx. 50,000 reads per run.
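The “about one third” figure is no accident: loading by diffusion is essentially a Poisson process, and the fraction of ZMWs with exactly one molecule at mean occupancy λ is λ·e^(−λ), which can never exceed 1/e ≈ 37%. A small sketch of this statistic (my own illustration of the general principle, not PacBio’s model):

```python
import math

def single_occupancy_fraction(lam):
    """Poisson probability of exactly one molecule per ZMW at mean occupancy lam."""
    return lam * math.exp(-lam)

# Scan mean occupancies from 0.01 to 3.00: the best diffusion can do is
# lam = 1, giving 1/e ~ 36.8% usable ZMWs - matching the ~1/3 quoted above.
best = max(single_occupancy_fraction(l / 100) for l in range(1, 301))
print(f"maximum single-occupancy fraction: {best:.3f}")  # → 0.368
```

This is why an active placement method such as the DNA origami nanoadapters can beat diffusion: it sidesteps the Poisson limit entirely.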

A group of researchers from the Technical University of Braunschweig, Germany, has now used “DNA Origami” in order to efficiently place molecules into ZMWs.

DNA origami is a fascinating technique that uses the unique properties of DNA to create nanostructures by “folding” DNA into the required shapes. A ground-breaking article on DNA origami was written by Paul Rothemund in 2006.

The researchers from Braunschweig have now created “nanoadapters” that exactly fit the size of the ZMWs; as a consequence, no ZMW can hold more than one molecule. The nanoadapters carry a fluorescent dye on top and biotin molecules on the bottom, which anchor the nanoadapters to the floor of the ZMW via neutravidin. In principle, the fluorescent dye could be replaced by a polymerase molecule. This approach greatly increased the loading efficiency, to approx. 60 percent.

However, according to In Sequence, the research group did not cooperate with PacBio on this project. In parallel, PacBio is working on other methods to increase the loading efficiency of their SMRT cells. But I am sure that there will be (and has to be) an improvement soon, no matter by which method.
“OrigamiStar-BlackPen” by Aldaron, a.k.a. Aldaron. From JillsArt, posted with permission. Licensed under Attribution via Wikimedia Commons.

Running an NGS Company Is Profitable

A recent article in GEN (Genetic Engineering & Biotechnology News) ranked the 2012 salaries of the top 20 CEOs of life sciences tools and technologies providers.

Amongst them: the CEOs of Pacific Biosciences, Illumina, and Life Technologies:

Michael W. Hunkapiller, Ph.D. (PacBio) $2,177,002

Jay T. Flatley (Illumina)  $8,171,080

Gregory T. Lucier (LifeTech) $10,268,445

But although this is really impressive, it’s still not outrageous. Imagine being a 28-year-old soccer player: you can earn four times as much as G.T. Lucier … (Forbes)

The complete article can be found here.

High-Throughput Sequencing Machines By Platform

The High-Throughput Sequencing map by James Hadfield (Cancer Research UK, Cambridge) gives a very interesting overview of sequencing activities around the world. We ran a survey to find out whether your favourite machines correspond to the platforms listed by James in his overview.

Here are the results: Your personal favourites are nearly a perfect match with platforms in the genome centers worldwide. Great match!



Summary from 4th Next Generation Sequencing Congress 2012 – Part 2

Dear all,

Here is my second summary from the 4th NGS Congress at London Heathrow at the end of 2012. It will bring you some (hopefully) interesting new facts about sequencing with the PacBio RS, the second long-read technology currently on the market and the only system delivering reads even longer than 10,000 bp…

Kevin Corcoran, Senior Vice President at Pacific Biosciences, gave an interesting and very nice talk about the most recent developments for the PacBio RS system, including some detailed road maps of future aims and plans. One important current development is the launch of the new “XL Chemistry”, while the “C2 Chemistry” may still be used as well. Another very interesting story is “Stage Start”, a new feature enabling a parallel start of sequence detection, similar to the well-known “hot start” technology for PCR. With it, detection starts from a defined position for most of the libraries rather than from somewhere in the middle. Last but not least, I am very keen to learn how the future “Photo Protected DNA Polymerases” may develop further, an idea that is really very, very next-next-generation…

First of all, I can say that the “XL Chemistry” looks really interesting, also with respect to the de novo sequencing and assembly focus of Eurofins MWG Operon. This new feature of the PacBio RS may also open doors to other types of applications, while in general the need for extremely high coverage may be reduced at the same time.

Currently, “C2 Chemistry” is on the machine, and a 90-minute movie typically delivers about 20,000-50,000 reads and a data output of 30-50 Mb; higher yields are of course possible for “ideal” DNA samples. The average read length is about 3,000 bp (!), while the 95th percentile is about 8,000 bp. With the new “XL Chemistry” we got an average yield of about 40,000 reads per SMRT cell with an average read length of about 4,000 bp (+30%). Overall, we are very pleased with these first results, especially since we see good potential to further increase data yields using the new software pipeline started in parallel (Hierarchical Genome Assembly Process and Quiver).

— See picture 1: —



It is also important to mention two different ways of using the XL chemistry. 1) XL chemistry for polymerase binding, but C2 chemistry for sequencing: this allows for longer reads at the same quality (currently the single-read error rate is still 10-20%, on average maybe 15%). 2) XL chemistry for polymerase binding AND for sequencing: this can yield even longer reads, but unfortunately the error rate also increases by a few percent. The second method is therefore recommended especially for de novo assembly or for finishing genomes.

— See picture 2: —



Finally, one real “next-next-gen” highlight was the presentation of a development at Pacific Biosciences exploring the idea of protecting the polymerase enzyme from being killed by the energy of the laser. A picture showed how this should work in principle: by putting in place a laser-light-protecting “sun blocker”. This story was really fascinating for me, and I hope to see more than the very promising first data results in the future…

— See picture 3: —



So, all in all, Pacific Biosciences keeps moving very fast in 2013, and it will be very nice to see and learn how all these improvements and new features may further improve the overall data results of this fascinating very-long-read technology, which today offers real single reads longer than 10,000 bp.

Cheers now and see you on our next BLOG,

Further Improvements of PacBio Technology

Recently, we reported on studies from the Broad Institute showing that the PacBio RS system was able to outperform MiSeq sequencing in the validation of SNP calls. Now Pacific Biosciences has taken another important step to further improve its product.

Pacific Biosciences has now launched a new sample loading device for the PacBio RS, called the MagBead Station. As Michael Hunkapiller, Ph.D., President and Chief Executive Officer of Pacific Biosciences, said in the press release, they expect that with the new device, customers will “be able to generate 10 kilobase-sized libraries using as little as one microgram of sample, a five to 10-fold improvement from where we were just a few months ago”. Also, because the new process is more robust, they expect that sequencing results will have higher overall consistency, allowing experiments to be run on challenging samples as well.

First experiences of early-access customers seem to underline these expectations:

As Patrick Hurban of Expression Analysis told In Sequence, the new loading device allowed them to recover sequence even from “difficult” samples: “we’re much more confident on a sample-by-sample basis that we will be able to get good sequence,” he said. They could also confirm that the amount of library that needs to be loaded is now significantly lower. The new loading process also seems to favor longer DNA fragments over shorter ones, excluding short contaminating fragments, which results in a greater percentage of long reads per run. Finally, loading now seems to work as efficiently for large-insert libraries as it does for smaller-insert libraries.

With the new loading device, about 50-60% of the ZMWs are now active after loading, a great improvement over the 30-45% of active ZMWs before the upgrade.

When PacBio entered the market, I was impressed by the sophisticated new technology, but the results of the first projects were rather disappointing. The new loading device now seems to greatly improve the sample loading step. However, the high error rate, currently about 15%, still remains a challenge. PacBio will need to solve these issues to be successful on the market in the long run. But it seems that, bit by bit, PacBio is overcoming its “childhood diseases”.

How Many More Next Generation Sequencers Are Needed?

Recently, the investment bank William Blair lowered its top-line and bottom-line estimates for Illumina and Pacific Biosciences, citing government funding worries that could impact sales of both firms’ instruments (GenomeWeb).

They now forecast shipments of 260 Illumina instruments in 2012 and 248 instruments in 2013. They also report a recent decrease in HiSeq consumable use and lowered the forecast for 2012 consumable sales by 3%. For 2013, they predict only a slight increase of 5% in consumable sales over 2012.

If shipping 248 additional instruments increases consumable sales by only 5%, then I have to wonder how many of these instruments are really in use. If 248 instruments account for just 5% of consumable demand, the installed base would have to be 4,960 instruments, which is far from reality. The only conclusion can be that, on average, the instruments are used at less than 20% of capacity.
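The back-of-the-envelope behind this estimate, spelled out (my own arithmetic on the figures above):

```python
# If 248 newly shipped instruments raise consumable sales by only 5%,
# and each instrument consumed reagents at the same rate as the existing
# fleet, the implied installed base would be:
new_instruments = 248
sales_increase = 0.05
implied_installed_base = new_instruments / sales_increase
print(implied_installed_base)  # → 4960.0
```

Since the real installed base is far smaller than 4,960, the new machines must be generating much less consumable demand per instrument, i.e. running well below capacity.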

A huge amount of research money is spent on buying instruments instead of sourcing the service. As a consequence, it takes a long time to fill a flow cell, and the operators often have limited experience with sample preparation, data handling and analysis. This often produces poor data quality and is not helpful for high-end research.

I am very curious about your opinion.

Comparison of De Novo Assemblies of the Escherichia coli Outbreak

The E. coli EAHEC outbreak in Germany has been an opportunity to compare currently available sequencing technologies with respect to data quality.

Regarding the N50 contig size and the number of contigs/scaffolds, the best assembly quality so far was achieved with the long-read technologies on the market: Roche’s GS Junior sequencing and Pacific Biosciences’ PacBio RS sequencing. For further comparison, I am therefore going to focus on the data of these two long-read technologies. But first, please have a look at the sequencing layouts:

Sequencing layout PacBio RS (Source: Pacific Biosciences):

First library: Standard sequencing library (200-fold coverage)
Second library: Circular consensus sequencing library (35-fold coverage)
Sequencing: 56 SMRT cells

Sequencing layout Roche GS Junior
(Source: UK Health Protection Agency HPA):

First library: Shotgun library
Second library: Long paired end library (LPE, 8 kbp insert length)
Sequencing: Three Roche GS Junior runs (25-fold coverage)

Comparison of the results

The de novo assembly comprises 33 contigs with PacBio RS sequencing and 13 scaffolds with the Roche GS Junior. The N50 contig size of the PacBio approach is 402 kbp, and the N50 scaffold size of the Roche 454 approach is 968 kbp. The de novo assemblies with Illumina MiSeq and Ion Torrent PGM data have so far revealed a higher number of contigs and considerably shorter N50 contig sizes (95 kbp and 50 kbp, respectively).
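As a reminder, N50 is the contig length at which contigs of that size or larger cover at least half of the total assembly. A minimal sketch of how it is computed (the contig lengths are illustrative, not the outbreak data):

```python
def n50(contig_lengths):
    """Return the length L such that contigs >= L cover >= 50% of the assembly."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= total / 2:
            return length

# Toy example: total 1500 bp, half is 750; 500 + 400 = 900 >= 750, so N50 = 400.
print(n50([500, 400, 300, 200, 100]))  # → 400
```

This is why a few very long contigs push the N50 up so strongly, even when the total number of contigs differs only modestly between assemblies.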

These data show once again that de novo sequencing strictly requires long reads. The advantageous effect of the very long PacBio reads for scaffolding (2,900 bp on average, with 5% of reads longer than 5,100 bp) is balanced in the other approach by sequencing of the LPE library.

It is important to mention that the PacBio assembly was generated with reads from a not-yet-released chemistry (planned for quarter 4). In contrast, the Roche 454 assembly did not contain the long FLX+ chemistry reads that will become available for the GS FLX by the end of the month. In our experience, a read length of 650-750 bp will have some additional positive effect on the number of scaffolds and the N50 scaffold size.

Most striking for me is that as many as 56 SMRT cells were needed to generate the PacBio assembly with the high consensus accuracy of 99.998%. The standard library was sequenced at that high coverage in order to increase the number of very long reads, and the circular consensus sequencing library was employed for further correction of the errors resulting from the still-low single-read accuracy.
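The principle behind this error correction can be illustrated with a simple majority-vote model: if each pass over a position is correct with probability p, the chance that the majority of n independent passes is wrong drops rapidly with n. This is a toy model of my own; real consensus callers such as Quiver are far more sophisticated than a per-position majority vote.

```python
from math import comb

def consensus_error(single_read_accuracy, passes):
    """Probability that a strict majority of independent passes is wrong
    at one position, under a simple binomial (two-outcome) model."""
    q = 1 - single_read_accuracy
    # Majority wrong: more than half of the passes carry an error.
    return sum(comb(passes, k) * q**k * (1 - q)**(passes - k)
               for k in range(passes // 2 + 1, passes + 1))

# With ~85% single-read accuracy, the per-position consensus error
# falls by orders of magnitude as coverage grows:
for n in (1, 5, 15, 35):
    print(n, f"{consensus_error(0.85, n):.2e}")
```

This illustrates why such deep coverage was needed: with 15% single-read error, dozens of passes per position are required before the consensus accuracy reaches the 99.998% range.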