
Exome Sequencing – Which Coverage Will Be Sufficient?

Over the last few years, exome sequencing has become a standard application. Every day, huge amounts of data are generated that need to be interpreted. But are we sure that our analysis always shows us the complete picture?

In our experience, coverage can vary significantly across the exome. For this reason, consider not only the average on-target coverage but also the local coverage at each site of interest. Otherwise, important information may be lost.
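To make the difference between average and local coverage concrete, here is a minimal Python sketch. The per-site depths are made-up illustrative values, and `coverage_summary` is a hypothetical helper, not part of any published tool.

```python
from statistics import mean


def coverage_summary(depths, min_depth):
    """Return (mean depth, fraction of sites with depth below min_depth)."""
    if not depths:
        raise ValueError("depths must not be empty")
    below = sum(1 for d in depths if d < min_depth)
    return mean(depths), below / len(depths)


# Hypothetical per-site read depths for ten target positions (assumption).
depths = [40, 35, 30, 28, 25, 22, 10, 5, 3, 2]

avg, frac_low = coverage_summary(depths, min_depth=13)
print(f"mean depth: {avg:.1f}, sites below 13x: {frac_low:.0%}")
```

In this toy data set the mean depth is 20-fold, yet four of the ten sites fall below 13-fold. An average alone would hide those sites.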

Researchers at the University of Edinburgh and the Wellcome Trust Sanger Institute have carried out a study that was recently published in BMC Bioinformatics. They analysed how sequencing depth relates to the sensitivity of SNV detection. They used a set of 30 captured exomes that had been sequenced to a high depth. As the basis for the analyses, they selected a set of verified “gold standard” SNVs for each sample. They then generated different randomly selected subsets of each data set. In the next step, they called SNVs on both the full data sets and the downsampled sets.
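The downsampling step can be pictured with a short Python sketch. This is only an illustration of drawing random subsets of reads. The study's actual procedure and tools are not described here, and the read names, fractions and the `downsample` helper are all assumptions.

```python
import random


def downsample(reads, fraction, seed=0):
    """Return a random subset containing about `fraction` of the reads."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # seeded so the subset is reproducible
    k = round(len(reads) * fraction)
    return rng.sample(reads, k)  # sampling without replacement


# Hypothetical read identifiers standing in for a deeply sequenced sample.
reads = [f"read_{i}" for i in range(1000)]

# One subset per assumed sampling fraction.
subsets = {f: downsample(reads, f, seed=42) for f in (0.1, 0.25, 0.5)}
for fraction, subset in subsets.items():
    print(fraction, len(subset))
```

Calling SNVs on each subset and comparing the calls against the verified set would then show how sensitivity changes with depth.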

From these analyses, they estimated that detecting at least 95% of the heterozygous SNVs requires a local coverage of at least 13-fold at the site of interest, while 3-fold coverage would be sufficient to detect a homozygous SNV. By contrast, at an average on-target coverage of 20-fold, 5-15% of the heterozygous and 1-4% of the homozygous SNVs would be missed.
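Why heterozygous sites need more depth than homozygous ones can be seen with a simplified binomial model. This is not the method used in the study and will not reproduce its thresholds. It assumes that each read shows the alternative allele with probability 0.5 at a heterozygous site, and that a caller needs at least two such reads. Both the threshold and the `het_detection_probability` helper are assumptions for illustration.

```python
from math import comb


def het_detection_probability(depth, min_alt_reads=2, alt_fraction=0.5):
    """P(at least min_alt_reads reads carry the alt allele), binomial model.

    min_alt_reads=2 is an arbitrary assumption; real variant callers use
    quality-aware likelihood models rather than a fixed read count.
    """
    # Probability of seeing fewer than min_alt_reads alt-allele reads.
    p_miss = sum(
        comb(depth, k) * alt_fraction**k * (1 - alt_fraction) ** (depth - k)
        for k in range(min(min_alt_reads, depth + 1))
    )
    return 1.0 - p_miss


for depth in (3, 8, 13, 20):
    print(depth, round(het_detection_probability(depth), 4))
```

Under the same assumed threshold, every read at a homozygous site carries the variant, so two reads already suffice. At a heterozygous site, only about half of the reads do, which is why more depth is needed.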

They concluded that exome sequencing does not necessarily require excessively high coverage, but that one should consider how likely it is that a polymorphism remains undetected.
The same considerations apply to whole genome data.

The group has developed software to help researchers check their data. It can be applied to determine the local and overall SNV detection sensitivity of a given data set. The software is available for free download.

What is your experience? Share your expert knowledge with us!

What Strategy and How Much Coverage Are Needed for Bacterial De Novo Sequencing?

In the NGS (Next Generation Sequencing) group on the networking platform LinkedIn, I came across a lively discussion about the best strategy for de novo sequencing of a bacterial genome. The discussion covered technology (Roche versus Illumina), coverage (from 10-fold on the Roche GS FLX to 500-fold on the Illumina HiSeq 2000) and library types. The comments and advice given match our experience that this question cannot be answered without further information on the genome to be sequenced: GC content, the number and size of repeat structures, and the genome size of the bacterium all have to be considered. We have by now sequenced more than 100 microbial genomes de novo, and in our experience GS FLX technology with a combination of shotgun and long paired-end libraries delivers a high-quality genome sequence that is suitable for gap closing projects.
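As a rough guide to what such coverage figures mean in terms of reads, the standard estimate is coverage = number of reads × read length / genome size. The Python sketch below applies it. The 5 Mb genome size and the read lengths of 400 and 100 bases are illustrative assumptions, not platform specifications, and both function names are hypothetical.

```python
import math


def expected_coverage(n_reads, read_length, genome_size):
    """Average fold coverage from read count, read length and genome size."""
    return n_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Smallest number of reads that reaches the target average coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


genome_size = 5_000_000  # hypothetical 5 Mb bacterial genome (assumption)

# 10-fold coverage with assumed 400-base reads
print(reads_needed(10, 400, genome_size))
# 500-fold coverage with assumed 100-base reads
print(reads_needed(500, 100, genome_size))
```

This gives only the average. Repeats and extreme GC content can leave individual regions far below it, which is why the library strategy matters as much as the read count.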

The multiple library approach is described in detail in our Application Note and in a Press Release. Depending on the complexity and size of the genome, we select the appropriate library.