Tag Archives: re-sequencing

How to handle variants in a reference genome

When it comes to genome sequencing, the Human Genome Project is one of the best-known projects. “Building” a reference genome that helps to identify disease-causing mutations is only one of the many goals for the human reference genome.

But I am sure that all of you have already asked the question: how can a reference genome even exist? There are more than 7 billion people on earth, with many different characteristics among them. So how can one human reference serve for all mankind?

The Global Alliance, led by David Haussler, recently won a $1 million grant to create a graphical model of the human genome (BioTechniques). The graph model should help to visualise variants as alternate pathways. In this way, a more comprehensive picture of “naturally occurring variants” and disease-causing variants might be gained. To support this approach, they got access to 300 complete human genome sequences from the Broad Institute in Cambridge.
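To make the idea of "variants as alternate pathways" a bit more concrete, here is a minimal Python sketch of a toy sequence graph. It is not the Global Alliance's model: the node sequences, the edge layout and the function name spell_paths are all made up for illustration. A single-base variant (A or G) appears as two parallel branches between shared flanking sequence.

```python
# Toy sequence graph (illustrative only, not the Global Alliance's data model).
# Each node carries a piece of sequence; edges list the nodes that may follow.
nodes = {1: "ACGT", 2: "A", 3: "G", 4: "TTCA"}   # made-up sequences
edges = {1: [2, 3], 2: [4], 3: [4], 4: []}       # nodes 2 and 3 are alternate paths


def spell_paths(nodes, edges, start, end):
    """Yield the sequence spelled by every path from start to end."""
    stack = [(start, nodes[start])]
    while stack:
        node, seq = stack.pop()
        if node == end:
            yield seq
            continue
        for nxt in edges[node]:
            stack.append((nxt, seq + nodes[nxt]))


# Every path through the graph is one possible version of this region.
haplotypes = sorted(spell_paths(nodes, edges, 1, 4))
for h in haplotypes:
    print(h)
```

Walking the graph yields one sequence per allele, so a single structure can hold both the "reference" and the variant instead of privileging one of them.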

From my point of view this is a great idea, and I hope it helps to pave the way for handling and interpreting the massive amounts of sequencing data in the near future!

Read the complete article at BioTechniques.com

 

Whose Genome Has Been Sequenced? Theobroma Cacao L.

I suppose there is no human being on the planet who does not know chocolate. “The tropical Theobroma cacao tree has been cultivated for at least three thousand years. Its earliest documented use is around 1100 BC (wiki.org).”

The latest de novo genome sequencing publication about a cacao plant focusses on the Theobroma cacao L. Matina 1-6 clone, which is the most commonly cultivated type of cacao worldwide (Motamayor et al.). Although a first draft for this clone was already published in 2010, the authors aim for an improved version of the genome to identify candidate genes regulating traits.

What was sequenced?

Leaves from Theobroma cacao L. Matina 1-6 clone; haploid genome size ~0.5 Gbp

Sequencing strategy: Whole genome sequencing plus BAC & fosmid end sequencing

  1. Libraries: shotgun and 8 long paired-end (LPE) libraries (insert sizes: 3 kbp, 6 kbp, 8 kbp) on the Roche GS FLX; three fosmid libraries and three independent BAC libraries with Sanger sequencing
  2. Read output: > 32 million reads
  3. Data output: 711 scaffolds with a total scaffold length of 346 Mbp, a contig N50 length of 84.4 kbp and a scaffold N50 length of 34.4 Mbp
  4. Bioinformatics: Arachne, Megablast and blastx, among other tools, were used for genome assembly
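For readers unfamiliar with the N50 values quoted in the data output: N50 is the length of the contig (or scaffold) at which, going from longest to shortest, the running total first reaches half of the total assembly length. The short Python sketch below shows the calculation on made-up lengths. The function name n50 and the toy numbers are my own illustration, not data or code from the publication.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)   # longest first
    half = sum(ordered) / 2                   # half of the total assembly length
    running = 0
    for length in ordered:
        running += length
        if running >= half:                   # first length that crosses the halfway mark
            return length


# Made-up contig lengths, not values from the cacao assembly.
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))
```

A higher N50 means that more of the assembly sits in long pieces, which is why the contig and scaffold N50 values are reported separately.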

Gene annotation and orthology analysis

  1. Libraries: long normalised libraries sequenced on the Roche GS FLX and short paired-read libraries sequenced on the Illumina platform
  2. Read output: ~ 7 million reads from the Roche and ~ 1 billion reads from the Illumina sequencing
  3. Bioinformatics: Transcriptome assembly using the NCBI TSA within BioProject 51633 & final refining using PASA. Further tools were used for marker identification and comparison to other plant species

Re-sequencing and qPCR expression analysis were performed as further analyses, allowing the authors to finally report a “high-quality sequence and annotation of T. cacao L. and demonstrate its utility in identifying candidate genes regulating traits.” (Motamayor et al.)

From my point of view this is a highly complex study using a comprehensive range of sequencing technologies. It shows once more that more than one sequencing strategy is needed to fully characterise a genome and start interpreting its secrets.

Read the complete publication here.


MiSeq – soon in its full bloom?

Rather than resting on the success of the MiSeq launch, Illumina is continuously improving the performance of its small benchtop next generation sequencing system. Geoff Smith talks about improvements in the read length of the MiSeq in the attached video.

This instrument is not just another next-gen sequencing device; it has remarkable advantages over other instruments when sequencing, for example, bacterial genomes. This is why I am really delighted that we can now offer services using the MiSeq instrument. (More info on our MiSeq services can be found here.)