Comparison of Exome Enrichment Technologies in Nature Biotechnology

Researchers from Stanford University recently carried out a systematic investigation of the performance of the most widely used exome enrichment platforms:

  1. Roche/NimbleGen’s SeqCap EZ Exome Library v2.0 (44 Mbp)
  2. Agilent’s SureSelect Human All Exon (50 Mbp)
  3. Illumina’s TruSeq Exome (61 Mbp)

One finding of the study: when coverage efficiency is compared at constant read depth (80 million reads each), NimbleGen sequence capture performs far better than the other two platforms. With NimbleGen sequence capture, 98.6 % of all targeted bases were covered at least 10x, whereas Agilent’s SureSelect and Illumina’s TruSeq covered only 89.6 % and 90.0 % of their targeted bases at least 10x. In my opinion, the different target sizes of the exomes should have been taken into account. In this case the read depth should have been normalized according to the exome sizes. Even without this normalization, however, the paper clearly shows that the NimbleGen technology enriched a much higher percentage of its targeted bases than the other two products.
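
To make the normalization argument concrete, here is a minimal Python sketch of the arithmetic. It uses only the target sizes and the 80 million reads quoted above; the function names are my own, and the calculation deliberately ignores on-target rate, read length and duplicates, so it is an illustration of the idea rather than the procedure used in the paper.

```python
# Target sizes (Mbp) as listed in this post; 80 million reads per kit as in the study.
TARGET_MBP = {"NimbleGen": 44, "Agilent": 50, "Illumina": 61}
READS = 80_000_000


def reads_per_mbp(reads, target_mbp):
    """Raw read count divided by target size (a crude density measure)."""
    return reads / target_mbp


def reads_for_equal_density(target_mbp, reference_mbp, reference_reads):
    """Reads needed so that a kit gets the same reads-per-Mbp as the reference kit."""
    return reference_reads * target_mbp / reference_mbp


if __name__ == "__main__":
    ref = TARGET_MBP["NimbleGen"]
    for kit, size in TARGET_MBP.items():
        density = reads_per_mbp(READS, size)
        needed = reads_for_equal_density(size, ref, READS)
        print(f"{kit}: {density:,.0f} reads/Mbp at 80M reads; "
              f"{needed:,.0f} reads to match the NimbleGen density")
```

Under these simplifying assumptions, the 61 Mbp design would need roughly 111 million reads to reach the read density that the 44 Mbp design gets from 80 million reads.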

Other criteria that were compared are the off-target enrichment rate (NimbleGen performed best) as well as the enrichment bias owing to GC content (Agilent performed best).

The decision as to which platform is best for a specific scientific question should also be influenced by the target regions covered by the different exome kits. Agilent’s and NimbleGen’s exomes share 38 Mbp of their target regions. Beyond that, Agilent’s exome covers Ensembl genes better, while NimbleGen’s exome covers a greater portion of miRNAs. Illumina’s exome, although displaying lower coverage efficiency, is additionally designed to capture UTRs, which so far are barely covered by the other designs; it is therefore the kit of choice if those regions are of interest.
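
The overlap between two target designs, and the coverage efficiency restricted to that overlap, can be sketched in a few lines of Python. This is a hedged illustration, not the method of the paper: it assumes each design is given as a sorted list of non-overlapping, half-open (start, end) intervals on a single chromosome, that per-base depth is available as a simple list, and all interval and depth values below are made-up toy data.

```python
def intersect(a, b):
    """Intersect two sorted lists of non-overlapping half-open intervals."""
    i = j = 0
    shared = []
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            shared.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


def total_length(intervals):
    """Number of bases spanned by a list of half-open intervals."""
    return sum(end - start for start, end in intervals)


def fraction_covered(intervals, depth, min_depth=10):
    """Fraction of bases in the intervals with depth >= min_depth."""
    total = total_length(intervals)
    if total == 0:
        return 0.0
    covered = sum(
        1 for start, end in intervals for pos in range(start, end)
        if depth[pos] >= min_depth
    )
    return covered / total


# Hypothetical toy data: two designs and a per-base depth track of 30 positions.
kit_a = [(0, 10), (20, 30)]
kit_b = [(5, 25)]
depth = [12] * 8 + [3] * 14 + [15] * 8

shared = intersect(kit_a, kit_b)
print(shared, total_length(shared), fraction_covered(shared, depth))
```

A three-way comparison can reuse the same function by intersecting the result with a third design, for example `intersect(intersect(kit_a, kit_b), kit_c)`. Real target files span many chromosomes, so the intervals would have to be grouped per chromosome first.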

The differences in performance stem from the different oligonucleotide designs. I therefore expect similar key performance parameters when the customised versions of these capture technologies are used.

Regina Dick

3 Responses to “Comparison of Exome Enrichment Technologies in Nature Biotechnology”

  1. Hi Regina. I’m the chief author of that paper. Thank you for a nice article about the paper!

    I’d like to address a couple of your comments:

    “In this case the read depth should have been normalized according to the exome sizes.”

    I understand why conceptually that seems like a good idea: Normalize by read depth across the target interval so you can compare how well the individual platforms’ baits performed. But that’s not really a fair comparison either because the genomic content of the target intervals is quite different as well. Doing that would not allow us to differentiate between the effect of genomic content and the effect of platform.

    To deal with that, we could have measured efficiency only across the regions shared by all three platforms, with normalized read depth.

    I generally think read count is a far more useful stat in this kind of paper than normalized read depth precisely because the target regions are different and because ultimately the question is: “How much sequencing do I need to do?”

    There is this yearning from the community to stamp one of these platforms with a gold star and say it’s the best. A cursory glance at some of our figures might lead you to think NimbleGen is “the best”. But it certainly isn’t “the best” if you care about UTRs. Or if you’re willing to sequence 60M reads. I hope that came across in the paper. Different platforms might be best for different studies.

    And yes, I do think some of the findings translate to other target enrichment designs.

  2. Regina Dick

    Michael,

    Thank you for your comment about normalization of reads. I understand your point. For a fair comparison between the enrichment methods, the coverage efficiency should be evaluated only across the regions common to the platforms. Such a comparison would be of great interest, as it would help to evaluate which technology (Agilent SureSelect or NimbleGen enrichment) is best for capturing a predetermined customized region. Comparing the exomes, with their different designs, is most informative at a fixed read count.

    Best, Regina

  3. Nice exposition, and aisumng cabal thing notes. But I was wondering if you could include references, or methodology, for the stated error rates. This is an important issue for me, as a core facility director talking to clients about the technology. thanks.