Whose Genome Has Been Sequenced? Theobroma Cacao L.

I suppose there is hardly anyone on the planet who does not know chocolate. “The tropical Theobroma cacao tree has been cultivated for at least three thousand years. Its earliest documented use is around 1100 BC (wiki.org).”

The latest de novo genome sequencing publication about a cacao plant focusses on the Theobroma cacao L. Matina 1-6 clone, which is the most commonly cultivated type of cacao worldwide (Motamayor et al.). Although a first draft of this clone was already published in 2010, the authors aimed for an improved version of the genome to identify candidate genes regulating traits.

What was sequenced?

Leaves from Theobroma cacao L. Matina 1-6 clone; haploid genome size ~0.5 Gbp

Sequencing strategy: Whole genome sequencing plus BAC & fosmid end sequencing

  1. Libraries: shotgun and 8 long paired-end (LPE) libraries (insert sizes: 3 kbp, 6 kbp, 8 kbp) sequenced on the Roche GS FLX; three fosmid libraries and three independent BAC libraries sequenced with Sanger sequencing
  2. Read output: > 32 million reads
  3. Data output: 711 scaffolds with a total scaffold length of 346 Mbp with a contig N50 length of 84.4 kbp and a scaffold N50 length of 34.4 Mbp
  4. Bioinformatics: Among other tools, Arachne, Megablast and blastx were used for genome assembly
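
For readers who have not come across the N50 values quoted in the data output: N50 is the length of the sequence at which, going from the longest contig (or scaffold) downwards, at least half of the total assembly length has been covered. The snippet below is only a minimal sketch of that definition in Python. The function name `n50` and the toy lengths are my own illustration and are not taken from the publication or from the assembly tools named above.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    # Walk through the sequences from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which half of the assembly is covered is the N50.
        if running >= half_total:
            return length


# Hypothetical toy assembly (lengths in kbp), not data from the study.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70
```

In the toy example the total length is 300 kbp, and the two longest contigs together reach 150 kbp, so the N50 is 70 kbp. Read the same way, a scaffold N50 of 34.4 Mbp means that half of the 346 Mbp assembly sits in scaffolds of at least that size.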

Gene annotation and orthology analysis

  1. Libraries: long normalised libraries sequenced on the Roche GS FLX and short paired-read libraries sequenced on the Illumina platform
  2. Read output: ~7 million reads from the Roche and ~1 billion reads from the Illumina sequencing
  3. Bioinformatics: Transcriptome assembly using the NCBI TSA within BioProject 51633 & final refining using PASA. Further tools were used for marker identification and comparison to other plant species

In addition, re-sequencing and qPCR expression analysis were performed to finally report a “high-quality sequence and annotation of T. cacao L. and demonstrate its utility in identifying candidate genes regulating traits.” (Motamayor et al.)

From my point of view, this is a highly complex study using a comprehensive range of sequencing technologies. It shows once more that more than one sequencing strategy is needed to fully characterise a genome and start interpreting its secrets.

Read the complete publication here.


Stephanie Engel

