Archive by Author

Whose Genome Has Been Sequenced? Theobroma Cacao L.

I suppose there is no human being on the planet who does not know chocolate. “The tropical Theobroma cacao tree has been cultivated for at least three thousand years. Its earliest documented use is around 1100 BC (wiki.org).”

The latest de novo genome sequencing publication about a cacao plant focusses on the Theobroma cacao L. Matina 1-6 clone, which is the most commonly cultivated type of cacao worldwide (Motamayor et al.). Although a first draft of this clone was already published in 2010, the authors aimed for an improved version of the genome in order to identify candidate genes regulating traits.

What was sequenced?

Leaves from Theobroma cacao L. Matina 1-6 clone; haploid genome size ~0.5 Gbp

Sequencing strategy: Whole genome sequencing plus BAC & fosmid end sequencing

  1. Libraries: shotgun and 8 long paired-end (LPE) libraries (insert sizes: 3 kbp, 6 kbp, 8 kbp) on the Roche GS FLX; three fosmid libraries and three independent BAC libraries with Sanger sequencing
  2. Read output: > 32 million reads
  3. Data output: 711 scaffolds with a total scaffold length of 346 Mbp, a contig N50 length of 84.4 kbp and a scaffold N50 length of 34.4 Mbp
  4. Bioinformatics: Among other tools, Arachne, Megablast and blastx were used for genome assembly
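The contig and scaffold N50 values quoted above are a standard way to summarise how contiguous an assembly is: the N50 is the length of the shortest sequence in the set of longest sequences that together cover at least half of the assembly. The following is a minimal sketch of that calculation. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the publication.

```python
def n50(lengths):
    """Return the N50 of a list of contig or scaffold lengths.

    The N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths in kbp, not data from the study)
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
```

A high scaffold N50 relative to the contig N50, as in the cacao assembly, indicates that the long-insert, fosmid and BAC end data linked many contigs into much larger scaffolds.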

Gene annotation and orthology analysis

  1. Libraries: long normalised libraries sequenced on the Roche GS FLX and short-paired reads libraries sequenced on the Illumina platform
  2. Read output: ~ 7 M reads from the Roche and ~ 1 billion reads from the Illumina sequencing
  3. Bioinformatics: Transcriptome assembly using the NCBI TSA within BioProject 51633 & final refining using PASA. Further tools were used for marker identification and comparison to other plant species

In addition, re-sequencing and qPCR expression analysis were performed to finally report a “high-quality sequence and annotation of T. cacao L. and demonstrate its utility in identifying candidate genes regulating traits.” (Motamayor et al.)

From my point of view this is a highly complex study using a comprehensive range of sequencing technologies. It shows once more that more than one sequencing strategy is needed to fully characterise a genome and start interpreting its secrets.

Read the complete publication here.

Whose Genome Has Been Sequenced? Hevea brasiliensis

All of us have done experiments in the lab at least once, and so all of us have come across latex gloves. More and more of us have also developed some kind of latex allergy.

According to Rahman et al. “these allergies are triggered by certain proteins present in Hevea-derived natural rubber (NR). [...] Hevea brasiliensis (Willd.) Muell.-Arg., also known as Pará rubber tree, is the primary commercial source for natural rubber (NR) production” (in total nearly 11 million tons in 2011 for all 2,500 rubber tree species).

Although rubber is used for > 50,000 products worldwide, this is the first de novo sequencing approach. So far only transcriptome analysis studies have been performed, which lack the non-coding regions of the genome.

What was sequenced?

Young leaves of Hevea brasiliensis RRIM 600. Genome size: ~ 2.15 Gb; 18 chromosomes

De novo sequencing strategy:

  1. Libraries: shotgun and mate-pair libraries (insert size: 500 bp) on HiSeq 2000; LPE libraries (insert sizes: 8 kb and 20 kb) on Roche GS FLX; Paired-end library (insert size: 2 kb) on SOLiD
  2. Coverage of all sequencing strategies together: ~ 43x (after filtering repeat-matching reads: ~ 13x = 27.86Gb)
  3. Data output: 143 scaffolds (total 1,119 Mb with N50 = 2,972 bp)
  4. Bioinformatics: CLC Workbench & Newbler assembler using different input data and different assembling strategies
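The coverage figures above follow from a simple ratio: fold coverage is the amount of sequence data divided by the genome size. As a rough check, assuming the ~2.15 Gb genome size and the 27.86 Gb of filtered data quoted in this post, the sketch below reproduces the ~13x value. The function name `fold_coverage` is a hypothetical helper chosen for illustration.

```python
def fold_coverage(total_bases, genome_size):
    """Fold coverage = sequenced bases / genome size (same units for both)."""
    if genome_size <= 0:
        raise ValueError("genome size must be positive")
    return total_bases / genome_size


# Values quoted in the post (in Gb)
genome_size_gb = 2.15
filtered_data_gb = 27.86

print(round(fold_coverage(filtered_data_gb, genome_size_gb), 1))
```

The same ratio works for any of the projects in this series, as long as data output and genome size are expressed in the same unit.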

Transcriptome sequencing strategy:

  1. Libraries: cDNA libraries
  2. Sequencing with Illumina HiSeq and Roche/454
  3. Bioinformatics: CLC Workbench assembler for the Illumina reads and Newbler for combining Roche and Illumina reads.

This de novo genome sequencing approach revealed that ~ 78% of the genome consists of repetitive regions. The study helps to improve breeding of H. brasiliensis by allowing marker-assisted selection to further increase disease resistance and minimise allergenicity.

Read the complete publication here.

Acquisitions And Rumours

It has been a while since I last wrote about acquisitions. But that does not mean that nothing has happened. On the contrary: the NGS business is so dynamic that I am not sure which news will already be outdated one day later.

But now it might be time to collect all the news in this blog and at least list the latest mergers, acquisitions and rumours in the field of NGS:

 

1. QIAGEN acquired Ingenuity for $105M

QIAGEN, one of the market leaders in Sample & Assay Technologies, is now building up a branch in Next Generation Sequencing. The first step was the acquisition of Intelligent Bio-Systems in 2012, and the launch of the upcoming sequencing device is scheduled for mid-2013. The acquisition of Ingenuity now seems to be the last piece of the jigsaw for a complete NGS workflow from sample preparation to complete data analysis (see PR QIAGEN). I am really confident regarding the sample preparation and the data analysis. But some doubts remain with respect to the NGS device; at least I have never heard about it before…

2. Life Tech – in great demand

Just two days ago an article in GenomeWeb revealed that Roche and Sigma-Aldrich were two other bidders for Life Tech. The rumours I had heard so far only said that Roche was interested in IonTorrent to push its own NGS business. According to the GenomeWeb article, the Thermo – Life deal is expected to be completed in early 2014.

3. Roche – expanding or downsizing?

But although some rumours say that Roche is still interested in IonTorrent, it might also be that they will shift their focus, especially since Roche has downsized its efforts in the Applied Science business. According to the announcement, Roche will integrate these products with other units, and it has also stopped the collaboration with DNAe to develop a semiconductor sequencing platform. Maybe because a new development might take too long. Maybe because the deal for IonTorrent is under way…

And while writing this summary I remembered again why these updates are so difficult to phrase: I cannot shake the feeling that something new, something more interesting is already close to publication.

Whose Genome Has Been Sequenced? Latimeria Chalumnae

The third de novo sequenced genome in our series Whose genome has been sequenced? is the “living fossil” Latimeria chalumnae.

The most difficult part of this de novo genome sequencing approach was to get enough starting material. The authors even reported that their first approach was to use the Sanger technology, but there simply was not enough DNA available. Therefore they had to wait until the next generation sequencing techniques were stable enough to risk the sequencing (BioTechniques). Here are the sequencing facts of this study (Amemiya et al.):

What was sequenced?

A blood sample from an adult African coelacanth

De novo sequencing strategy:

  1. Libraries: shotgun library (61-fold coverage); 3 kb jumping library (88-fold coverage); 40 kb fosmid library (1-fold coverage)
  2. Illumina HiSeq 2000 (paired-end module)
  3. De novo genome assembly using the software ALLPATHS-LG
  4. RNA sequencing

RNA-Seq sequencing strategy:

  1. 4 cDNA libraries (1x mRNA-Seq library, 3x strand specific dUTP libraries from brain, gonad/kidney, gut/liver tissue) were sequenced using a HiSeq
  2. Data output: mRNA-Seq library ~ 210M paired-end reads; dUTP libraries ~ 3-4 Gb of sequence/tissue
  3. Assembly was performed using Trinity

The genome sequence helps to explain both the potential of this prehistoric fish to thrive on dry land and its phenotype, which is so similar to that of 300-million-year-old fossils (BioTechniques).

Read the complete publication here.

Whose Genome Has Been Sequenced? Cicer Arietinum

With this new bi-weekly series we would like to highlight some, if not all, genomes that have been sequenced in the last 6 to 12 months. At this point in time I am still uncertain whether the “eye-opener” will be the diversity of organisms and species or the different technologies and strategies that have been used…

We started this series off in January, when we reported on the de novo sequencing of the domestic goat Capra hircus.

Today I would like to report on a plant genome, that of Cicer arietinum:

According to the GenomeWeb article this de novo genome sequencing approach is only the 3rd one for crop legume plants. For me that is rather astonishing, since breeding and optimisation of crops have been practised for years. Maybe this is due to the huge genomes of plants, which exceed animal genomes by far. With 740 million base pairs, our chickpea plant has a medium-sized plant genome. But let’s focus on the sequencing approach for now (Varshney et al.):

What was sequenced?

De novo sequencing of one reference chickpea plant and re-sequencing of 90 cultivated & wild chickpea lines from 10 different countries

Sequencing strategy:

  1. De novo genome sequencing on HiSeq 2000 (paired-end module) of 1 genome with 11 shotgun and mate-pair libraries (insert sizes: ~ 170; 500; 800; 2,000; 5,000; 10,000; 20,000 bp) and BAC end sequencing
    Data output: 153.01 Gb; after filtering & correction steps only 87.65 Gb data were used for de novo assembly
  2. Re-sequencing of genomes
    • Whole genome re-sequencing on 29 varieties using Illumina 100 bp paired-end sequencing on HiSeq 2000
    • RAD-sequencing of 61 genotypes on HiSeq 2000 (48x ApeKI; 24x HindIII)

According to D. Cook “the sequencing of the chickpea provides genetic information that will help plant breeders develop highly productive chickpea varieties that can better tolerate drought and resist disease — traits that are particularly important in light of the threat of global climate change” (Davis Enterprise).

Read the complete publication here.

800 bp Read Length For Amplicon Sequencing Is Not Science Fiction

About a year ago my colleague Regina reported on the new possibilities of using the MiSeq system for amplicon sequencing (16S Amplicon Experiments: Which Platform to Choose?). Now, one year later, everything she wrote about the advantages of amplicon sequencing using the MiSeq (e.g. lower cost/base) still holds true.

The main advantage of the Roche system is the long reads, which are highly valuable for some applications. By ligating appropriate sequencing adaptors we can currently deliver average read lengths of up to 700 bp when using the GS FLX+ pipeline. Further improvements regarding the read length can be expected with the launch of a new amplicon pipeline from Roche for the Roche GS FLX+ system (planned for summer 2013).

And besides the ultra-long reads on the GS FLX+ system, there are still some advantages of amplicon sequencing using the GS Junior system compared to other technologies:

+ short turnaround time (starting from 5-10 working days)

+ competitive pricing

+ moderate to long reads (350 – 450 bp)

+ sufficient data output for all projects with a medium number of samples (e.g. up to 24)

What is your preferred next generation sequencing technology for amplicon sequencing? Take part in our current poll.

Survival Of The Fittest – NGS Library Prep Methods

30 years of PCR in various applications have revolutionised molecular biology. But PCR also has its drawbacks. One of them is the amplification of AT- or GC-rich DNA fragments. Naturally, researchers are often interested in sequencing and studying genomes with high GC or high AT content, like S. aureus with an AT content of 67% or Streptomyces coelicolor with a GC content of 72%.
That is why more and more NGS kit providers try to circumvent PCR in the library prep. Ashley Yeager has summarised the current status of PCR-free library preps, including a comprehensive overview of the pros and cons of both methods (BioTechniques).
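The GC and AT percentages mentioned above are simple base-composition ratios, so it is easy to check whether a sequence is likely to be affected by PCR bias. Here is a minimal sketch of how such a value could be calculated. The function name `gc_content` and the toy sequence are assumptions made for illustration only; ambiguous bases such as N are ignored.

```python
def gc_content(sequence):
    """Return the GC fraction (0..1) of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted; anything else,
    such as N, is ignored. The AT fraction is 1 minus this value.
    """
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc_bases = sum(1 for base in counted if base in "GC")
    return gc_bases / len(counted)


# Toy sequence (hypothetical, not taken from any of the genomes mentioned)
toy_sequence = "ATGCGCNNAT"
gc = gc_content(toy_sequence)
print(f"GC: {gc:.0%}, AT: {1 - gc:.0%}")
```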

Summarising the findings from Ashley Yeager, there is no clear champion in sight:

Library prep by using PCR methods

+ well-known lab procedure & good sequencing efficiency
- difficulties in amplifying GC- / AT-rich regions -> sequencing is biased

PCR-free library prep

+ good sequence read distribution & a more even genome coverage
- huge amounts of starting material needed & sequencing reaction is less efficient

Read the complete article at BioTechniques.

AROS AB – now a member of the Eurofins group

With today’s press release I am happy to announce that AROS Applied Biotechnology A/S is now a member of the Eurofins group.

Here is a short introduction of our new colleagues from AROS:

  • AROS was founded in the year 2000
  • AROS started as a spin-off from the Aarhus University Hospital and was the first service provider for Affymetrix in Europe
  • AROS is based in Denmark and has long-term experience in sample preparation, microarray analysis and next generation sequencing (NGS)
  • Nowadays AROS has a leading position in NGS service for pharmaceutical research
  • AROS is an Illumina reference lab for next generation sequencing
  • The main focus in NGS is RNA-Seq and exome sequencing, which is accomplished with the exome designs of the leading providers in this area (Illumina TruSeq Exome Enrichment, NimbleGen EZ Capture & Agilent SureSelect)

“AROS is an excellent fit […] with our focus on high-quality next-generation sequencing […]” (Dr. Gilles Martin), and therefore I am confident that this new alliance will help us both to further expand our experience in NGS and to benefit from our complementary strengths.

I am sure you will hear more about the activities of AROS on our blog, and I hope you will join me in welcoming AROS as a member of Eurofins.

Cardiologists are the next target group

Opinions differ as to whether next generation sequencing is already mature enough to be a useful tool in diagnostic routine.

Below you can find an interview with cardiologists from the University of Heidelberg about their studies to turn next generation sequencing into a diagnostic tool. To this end they collaborate with Siemens to obtain the best possible results, which doctors can then use in the same way as current reports from other technologies.

Goat Genome Sequenced Using Whole Genome Mapping

Goats were domesticated thousands of years ago. Nowadays they are also used as models for biomedical research. However, one thing was still missing: a reference genome. Researchers from China have now closed this gap by successfully sequencing the genome of a domestic goat.

To reveal the secrets of the goat genome the researchers applied a hybrid approach of Illumina shotgun sequencing and whole genome mapping (WGM) using the Argus system from OpGen. As a result, the number of scaffolds could be reduced from 2,090 to 315. This demonstrates that whole-genome mapping for large genomes can be a replacement for traditional genetic maps for de novo assembly (Dong et al.).

This reference genome can now be used for mapping reads of other goats to identify SNPs and other variants that could play a role in breeding, cashmere fibre production or different goat behaviours (Dong et al.).

If you are interested in more information about optical mapping, read our dedicated blog posts: What is optical mapping? and Creating the perfect genome assembly.