Whose Genome Has Been Sequenced? Theobroma Cacao L.
I suppose there is hardly anyone on the planet who does not know chocolate. “The tropical Theobroma cacao tree has been cultivated for at least three thousand years. Its earliest documented use is around 1100 BC (wiki.org).”
The latest de novo genome sequencing publication on a cacao plant focusses on the Theobroma cacao L. Matina 1-6 clone, which is the most commonly cultivated type of cacao worldwide (Motamayor et al.). Although a first draft of this clone was already published in 2010, the authors aim for an improved version of the genome in order to identify candidate genes regulating traits.
What was sequenced?
Leaves from Theobroma cacao L. Matina 1-6 clone; haploid genome size ~0.5 Gbp
Sequencing strategy: Whole genome sequencing plus BAC & fosmid end sequencing
- Libraries: shotgun and 8 long paired-end (LPE) libraries (insert sizes: 3 kbp, 6 kbp and 8 kbp) on the Roche GS FLX; three fosmid libraries and three independent BAC libraries with Sanger sequencing
- Read output: > 32 million reads
- Data output: 711 scaffolds with a total scaffold length of 346 Mbp, a contig N50 length of 84.4 kbp and a scaffold N50 length of 34.4 Mbp
- Bioinformatics: Arachne, Megablast and blastx, among other tools, were used for genome assembly
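For readers less familiar with the N50 values quoted in the data output above: N50 is the length of the sequence at which, going from longest to shortest, the running sum first reaches half of the total assembly length. The following is only a minimal sketch of that definition; the function name `n50` and the toy scaffold lengths are made up for illustration and are not taken from the publication.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    # Walk from the longest sequence to the shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches
        # half of the total assembly length is the N50.
        if running >= half_total:
            return length


# Hypothetical toy assembly (arbitrary units), not data from the paper.
toy_scaffolds = [40, 30, 20, 5, 3, 2]
print(n50(toy_scaffolds))  # prints 30
```

In the toy example the total length is 100, so the running sum passes 50 at the second-longest scaffold. A scaffold N50 of 34.4 Mbp in a 346 Mbp assembly therefore means that half of the assembled sequence sits in scaffolds of at least that size.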
Gene annotation and orthology analysis
- Libraries: long normalised libraries sequenced on the Roche GS FLX and short paired-read libraries sequenced on the Illumina platform
- Read output: ~7 million reads from the Roche and ~1 billion reads from the Illumina sequencing
- Bioinformatics: Transcriptome assembly using the NCBI TSA within BioProject 51633 and final refining using PASA. Further tools were used for marker identification and comparison with other plant species
In addition, re-sequencing and qPCR expression analysis were performed, allowing the authors to finally report a “high-quality sequence and annotation of T. cacao L. and demonstrate its utility in identifying candidate genes regulating traits.” (Motamayor et al.)
From my point of view this is a highly complex study using a comprehensive range of sequencing technologies. It shows once more that a single sequencing strategy is not enough to fully characterise a genome and start interpreting its secrets.
Read the complete publication here.
June 17, 2013