Tag Archives: WGS

PacBio RS Data to Validate SNPs Called from Illumina Sequencing?

Would you have thought that PacBio RS sequences, with a single-read error rate of about 15%, could outdo MiSeq reads in validating variants called by WGS or exome sequencing? Personally, I wouldn’t have thought so. But a study from the Broad Institute published a few days ago clearly shows that they can.

Variants called in projects that aim to analyze variation need validation, both to determine the rate at which mutations have been called correctly and to confirm the specific reported changes. Currently used techniques like Sequenom genotyping and Sanger sequencing have significant drawbacks, such as the need for manual interpretation or low data throughput. For that reason, Carneiro and his colleagues studied the power of PacBio RS and MiSeq data as validation tools and compared the results with each other.

They generated amplicons covering 98 variants called in the 1000 Genomes Project and sequenced the PCR products with both instruments, PacBio RS and MiSeq. Using PacBio RS data, 96 of the 98 variants could be genotyped correctly, whereas MiSeq genotyped only 93 sites correctly. The authors’ explanation is quite simple: because the errors are distributed completely at random across the reads, sufficient coverage can overcome the low accuracy of individual reads.
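To get a feeling for why random errors wash out with coverage, here is a minimal sketch of my own (it is not the genotyping method used in the study). It assumes that every read is wrong independently with the same probability, and that a site is called correctly whenever strictly more than half of the reads show the true base. The function name and the coverage values are chosen only for illustration.

```python
from math import comb

def majority_correct_probability(coverage: int, error_rate: float) -> float:
    """Probability that strictly more than half of `coverage` reads show the true base.

    Simplifying assumption: reads err independently with the same probability,
    and every erroneous read counts against the true base.
    """
    p_correct = 1.0 - error_rate
    min_correct = coverage // 2 + 1  # smallest strict majority
    return sum(
        comb(coverage, k) * p_correct**k * error_rate**(coverage - k)
        for k in range(min_correct, coverage + 1)
    )

# 15% single-read error rate, as quoted for PacBio RS reads above
for coverage in (1, 5, 15, 30, 60):
    print(coverage, majority_correct_probability(coverage, 0.15))
```

Under these assumptions the chance of a wrong majority shrinks quickly as coverage grows, which is the intuition behind the authors' explanation. Systematic errors, which would not be independent between reads, are exactly what this sketch leaves out.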

Manual checking of the sites that were miscalled using the PacBio dataset revealed that one of the two miscalls was due to reference bias (the true variation is hidden). Such bias is introduced by alignment parameters in which the gap open penalty is higher than the base mismatch penalty. The high error rate of PacBio RS reads makes these parameters necessary.

However, Carneiro told GenomeWeb that the researchers are now using a different aligner that was developed at the Broad Institute. This aligner re-aligns the reads using different parameters and thereby reduces the problem to a great extent.

For me, the study shows that there is potential for PacBio RS sequencing. Nevertheless, like the variants, this result itself needs to be validated. Furthermore, I think the value of the study also needs to be seen in relation to the sequencing cost of both instruments. While the consumable prices for both techniques are in a similar range, the several-fold higher cost of the PacBio RS instrument makes a remarkable difference.

Euro Crisis – How Much of NGS Sequencing Could We Do For It?

Dear honourable reader of our NGS blog,

 

Currently, the IMF and European leaders are discussing raising the European rescue fund to about 1.5 trillion Euro (in German: 1,5 Billionen Euro). This is such a high number that I cannot even imagine it…

This number also gave rise to the question: “How much NGS sequencing, or Whole Human Genome Sequencing (WGS), could be done with it?”

 

Step 1:  How many people are needed to pay 1.5 Trillion Euro of income tax?

With an average of 10,000 Euro of income tax per year, 4,500,000 people would need to pay for 35 years (a working lifetime) to account for 1.575 trillion Euro (1,575,000,000,000 Euro). Alternatively, about 150 million people (150,000,000) would need to work and pay tax for one year to reach 1.5 trillion Euro, of course without any interest.

The 27 European Union countries (EU27) currently have a population of about 501 million people (year 2011), including babies, pensioners and all other non-taxpayers. So about 30% of all people in the EU27 would need to work for one year to earn this sum, of course without spending money on anything else…
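For readers who want to check the Step 1 numbers, here is a small sketch that redoes the arithmetic with the figures from the text. The variable names are my own, and the 10,000 Euro average tax is the same rough assumption as above.

```python
# Figures taken from the text; all amounts in Euro.
FUND_EUR = 1_500_000_000_000          # 1.5 trillion
TAX_PER_PERSON_PER_YEAR_EUR = 10_000  # assumed average income tax
WORKING_YEARS = 35                    # assumed working lifetime
EU27_POPULATION = 501_000_000         # year 2011

# 4.5 million people paying for a whole working lifetime
lifetime_total_eur = 4_500_000 * TAX_PER_PERSON_PER_YEAR_EUR * WORKING_YEARS

# People needed to raise the fund within a single year
payers_for_one_year = FUND_EUR // TAX_PER_PERSON_PER_YEAR_EUR

# Share of the EU27 population that this represents
share_of_eu27 = payers_for_one_year / EU27_POPULATION

print(f"{lifetime_total_eur:,} Euro over {WORKING_YEARS} years")
print(f"{payers_for_one_year:,} taxpayers for one year")
print(f"{share_of_eu27:.1%} of the EU27 population")
```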

 

Step 2: How many Human genomes would I get for this sum?

At a reasonable cost of 15,000 Euro per genome this equals 100 million (100,000,000) sequenced genomes, which is about 1/5 of all EU27 people. At a discounted offer of 5,000 Euro per genome this equals 300 million (300,000,000) sequenced genomes, which is about 3/5 of all EU27 people. At a best price offer of about $4,000 (about 3,000 Euro) the money would pay for about 500 million (500,000,000) genomes, which is practically the complete population of the EU27.
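The same calculation can be written down in a few lines. This is only a sketch using the prices quoted above; the function name and the price labels are mine.

```python
FUND_EUR = 1_500_000_000_000   # 1.5 trillion Euro
EU27_POPULATION = 501_000_000  # year 2011

# Price per genome in Euro, as quoted in the text
PRICES_EUR = {"reasonable": 15_000, "discounted": 5_000, "best price": 3_000}

def genomes_affordable(budget_eur: int, price_per_genome_eur: int) -> int:
    """Number of whole genomes that fit into the budget."""
    return budget_eur // price_per_genome_eur

for label, price in PRICES_EUR.items():
    genomes = genomes_affordable(FUND_EUR, price)
    fraction = genomes / EU27_POPULATION
    print(f"{label}: {genomes:,} genomes, {fraction:.0%} of the EU27 population")
```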

Wow, this gives me the feeling that we are in fact talking about a lot of money.

Best regards

Axel

 

Whole Genome Sequencing or Exome Sequencing?

Many large-scale exome sequencing projects are funded and underway to analyze rare Mendelian diseases. This technology is often chosen because it is more affordable than whole genome sequencing (WGS) and therefore allows more patients to be analyzed. In addition, the resulting data volumes are much smaller and therefore easier to handle.

But, when looking only at the regions targeted by the exome technology, are the results of an exome sequencing experiment really comparable to those of a WGS experiment?
 
The study by Clark et al. (2011) focused on this question and found that neither technology managed to detect all sequence variants. With 50 million reads for exome sequencing and 35-fold coverage for WGS, the study came to the following results:

- WGS detected between 660 and 4600 SNPs that were not called from the exome sequencing data and
- Exome Sequencing detected between 2600 and 3200 SNPs that were not called from the WGS data.
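The two numbers above are simply the two set differences between the call sets. A minimal sketch of such a comparison could look like this; the variant tuples are invented toy data and the function name is hypothetical, not taken from the study.

```python
# Toy variant calls as (chromosome, position, ref, alt); invented for illustration only.
wgs_calls = {
    ("chr1", 1000, "A", "G"),
    ("chr1", 2500, "C", "T"),
    ("chr2", 300, "G", "A"),
}
exome_calls = {
    ("chr1", 1000, "A", "G"),
    ("chr2", 300, "G", "A"),
    ("chr3", 42, "T", "C"),
}

def compare_call_sets(wgs: set, exome: set) -> dict:
    """Split two call sets into calls unique to each technology and shared calls."""
    return {
        "wgs_only": wgs - exome,
        "exome_only": exome - wgs,
        "shared": wgs & exome,
    }

result = compare_call_sets(wgs_calls, exome_calls)
for name, calls in result.items():
    print(name, len(calls), sorted(calls))
```

In a real comparison both call sets would first have to be restricted to the regions targeted by the exome design, as described above.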

What can we conclude from this? First, WGS cannot and will not replace exome sequencing because, due to genome characteristics, there will always be regions that are not covered sufficiently for SNP calling. Because the oligonucleotide designs of available exome kits are balanced to compensate for regions with low coverage, exome sequencing shows higher sensitivity in these regions. Second, WGS has its value in detecting variants in regions that are not covered by exome enrichment technologies. These are regions where enrichment fails as well as regions that are not present in the current exome designs.

So, to capture truly all variants, it might be worth considering running both experiments in parallel. The two technologies complement each other.