Tag Archives: Genome sequencing

Whose genome has been sequenced? Latimeria chalumnae

The third de novo sequenced genome in our series Whose genome has been sequenced? is the “living fossil” Latimeria chalumnae, the African coelacanth.

The most difficult part of this de novo genome sequencing project was obtaining enough starting material. The authors even reported that they first planned to use Sanger technology, but there was simply not enough DNA available. They therefore had to wait until next generation sequencing techniques were stable enough to risk the sequencing (BioTechniques). Here are the sequencing facts of this study (Amemiya et al.):

What was sequenced?

A blood sample from an adult African coelacanth

De novo sequencing strategy:

  1. Libraries: shotgun library (61-fold coverage); 3 kb jumping library (88-fold coverage); 40 kb fosmid library (1-fold coverage)
  2. Illumina HiSeq 2000 (paired-end module)
  3. De novo genome assembly using the software ALLPATHS-LG
  4. RNA sequencing

RNA-Seq sequencing strategy:

  1. 4 cDNA libraries (1x mRNA-Seq library, 3x strand specific dUTP libraries from brain, gonad/kidney, gut/liver tissue) were sequenced using a HiSeq
  2. Data output: mRNA-Seq library ~ 210M paired-end reads; dUTP libraries ~ 3-4 Gb of sequence per tissue
  3. Assembly was performed using Trinity

The genome sequence helps researchers understand what it took for fish to make the transition to dry land, and why the coelacanth’s phenotype is still so similar to that of 300-million-year-old fossils (BioTechniques).

Read the complete publication here.

Earlier published genomes: Goat genome (Capra hircus); Chickpea plant (Cicer arietinum)

Whose genome has been sequenced? Cicer arietinum

With this new bi-weekly series we would like to highlight some, if not all, of the genomes that have been sequenced in the last 6 to 12 months. At this point in time I am still uncertain whether the “eye-opener” will be the diversity of organisms and species or the different technologies and strategies that have been used…

We started this series in January, when we reported on the de novo sequencing of the domestic goat Capra hircus.

Today I would like to report on a plant genome, that of the chickpea Cicer arietinum:

According to the GenomeWeb article, this de novo genome sequencing project is only the third one for a crop legume. I find that rather astonishing, since crops have been bred and optimised for many years. Maybe this is due to the huge size of plant genomes, which can exceed animal genomes by far. With 740 million base pairs, our chickpea plant has a medium-sized plant genome. But let’s focus on the sequencing approach for now (Varshney et al.):

What was sequenced?

De novo sequencing of one reference chickpea plant and re-sequencing of 90 cultivated & wild chickpea lines from 10 different countries

Sequencing strategy:

  1. De novo genome sequencing on HiSeq 2000 (paired-end module) of 1 genome with 11 shotgun and mate-pair libraries (insert sizes: ~ 170; 500; 800; 2,000; 5,000; 10,000; 20,000 bp) and BAC end sequencing
    Data output: 153.01 Gb; after filtering & correction steps only 87.65 Gb data were used for de novo assembly
  2. Re-sequencing of genomes
    • Whole genome re-sequencing on 29 varieties using Illumina 100 bp paired-end sequencing on HiSeq 2000
    • RAD-sequencing of 61 genotypes on HiSeq 2000 (48x ApeKI; 24x HindIII)
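
As a rough back-of-the-envelope check, the sequencing depth implied by these figures can be estimated by dividing the data output by the genome size. The sketch below assumes the 740 million base pair genome size quoted above and treats 1 Gb as 10^9 bases; the function name is illustrative only and does not come from the publication.

```python
def fold_coverage(sequenced_bases: float, genome_size_bases: float) -> float:
    """Return the average fold coverage: sequenced bases divided by genome size."""
    if genome_size_bases <= 0:
        raise ValueError("genome size must be positive")
    return sequenced_bases / genome_size_bases


# Figures quoted in the post (1 Gb is assumed to mean 1e9 bases).
GENOME_SIZE = 740e6        # chickpea genome size
RAW_OUTPUT = 153.01e9      # data output before filtering
FILTERED_OUTPUT = 87.65e9  # data used for the de novo assembly

raw_cov = fold_coverage(RAW_OUTPUT, GENOME_SIZE)
filtered_cov = fold_coverage(FILTERED_OUTPUT, GENOME_SIZE)

print(f"raw: {raw_cov:.0f}x, filtered: {filtered_cov:.0f}x")
# prints "raw: 207x, filtered: 118x"
```

Under these assumptions, roughly 118-fold coverage remained for the assembly after filtering and correction.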

According to D. Cook “the sequencing of the chickpea provides genetic information that will help plant breeders develop highly productive chickpea varieties that can better tolerate drought and resist disease — traits that are particularly important in light of the threat of global climate change”. (Davis Enterprise).

Read the complete publication here.

Genome Sequencing Analysis of Ash Tree – Supported by £2.4 Million

Researchers in the UK have received £2.4 million ($3.6 million / €2.8 million) to conduct genome sequencing and analysis of ash (Fraxinus excelsior). The major aim of this project is to improve the understanding of ash dieback, a fungal tree disease that is widespread in northern Europe and has already been found at more than 300 sites across the UK (see http://www.forestry.gov.uk/chalara). The fungus attacks ash trees, but some trees resist those attacks.

For this reason, many samples of the ash dieback fungus will be sequenced. In addition, funded by an urgency grant from the Natural Environment Research Council, the complete genome sequence of ash is expected to be available by August.

Sequencing of the approximately 900 Mb plant genome will be performed using the latest hybrid de novo sequencing strategy, recently proven to deliver excellent scaffolding and assembly results. This new gold standard in de novo sequencing combines Roche/454 FLX++ long read technology (software version 2.8 with read lengths up to 1,100 bp) with Illumina HiSeq 2000/2500 high throughput sequencing of several ultra-accurate long jumping distance libraries (LJDs of 3 kb, 8 kb, 20 kb and 40 kb), supplemented by sequencing of Illumina shotgun libraries with different fragment sizes.

With the sequenced ash tree genome, the researchers hope to find clues to how some of the trees resist attack (2% are able to fend off the disease), and to learn about the genetic differences between resistant and non-resistant trees. This knowledge could be used to develop trees that cannot be infected.

Project leader, Dr. Richard Buggs from Queen Mary’s School of Biological and Chemical Sciences: “Sequencing the ash genome is a foundational step towards discovering the genetic basis of resistance to ash dieback – the future of ash trees in Britain may depend on this”.

Read more about this exciting project at GenomeWeb (the project in general) and at Eurofins MWG Operon (the genome sequencing).

Cardiologists are the next target group

Opinions differ as to whether next generation sequencing is already mature enough to be a useful tool in diagnostic routine.

Below you can find an interview with cardiologists from the University of Heidelberg about their studies on integrating next generation sequencing into a diagnostic tool. To this end they collaborate with Siemens to obtain the best possible results, which doctors can then use in the same way as current reports from other technologies.

Goat Genome Sequenced Using Whole Genome Mapping

Goats were domesticated thousands of years ago. Nowadays they are also used as models for biomedical research. However, one thing was still missing: a reference genome. Researchers from China have now closed this gap by successfully sequencing the genome of a domestic goat.

To reveal the secrets of the goat genome, the researchers applied a hybrid approach combining Illumina shotgun sequencing with whole genome mapping (WGM) on the Argus system from OpGen. As a result, the number of scaffolds was reduced from 2,090 to 315. This demonstrates that whole genome mapping can replace traditional genetic maps in the de novo assembly of large genomes (Dong et al.).

This reference genome can now be used for mapping reads from other goats to identify SNPs and other variants that could play a role in breeding, cashmere fiber production or different goat behaviours (Dong et al.).

If you are interested in more information about optical mapping, read our dedicated blog posts: What is optical mapping? and Creating the perfect genome assembly.

Think Big: American Gut Project Based On NGS

Scientists estimate that the cells of our bodies are outnumbered 10 to 1 by bacterial cells that live in or on our body. A previous blog post has already pointed out the impact of this fact on sequencing the corresponding host genomes. On the other hand, microbiomes have the potential to play an important role as diagnostic markers, or to open up new ways of treating diseases, such as personalized medicine.

However, we are just beginning to understand the complex relationships of this “social network”, as Scientific American has called it. The most complex bacterial community within the human body resides inside the gut. To obtain a deeper understanding of the bacterial communities of the human gut, there have been several attempts to sequence the gut microbiomes of larger groups of individuals, such as the projects by Arumugam et al., Yatsunenko et al. or Schloissnig et al. So far, however, the number of individuals analyzed has been relatively small (up to several hundred).

A group of US scientists has now started the “American Gut Project”. As reported by GenomeWeb News, this project is planned as a crowd-sourced study of 10,000 or more individuals in the US. Since this study is part of the “Human Food Project”, it will mainly focus on gut microbiome patterns in relation to diet, age and lifestyle. People who would like to participate in this study need to sign up via a website and donate $99. This money will be used to cover a significant part of the cost of the study. In return, participants will receive a taxonomic profile of their gut microbiome.

The analysis itself will be based on 16S sequencing. For some of the samples, additional analyses such as sequencing the complete metagenomes and long-term surveys are planned. No doubt, this study will provide us with a huge data set. However, this data set will be highly complex, and it still needs to be put in context with data from other projects. In my opinion, interpretation of the data remains the hardest part. Or, as project organizer Jeff Leach has put it in an interview with GenomeWeb Daily News: “We don’t expect to be able to address some questions, but because of the size of the sample and because of the broad patterns we expect to see in diet and lifestyle, we think some stuff will fall out.”

How to benefit from our superior LJDs on the MiSeq

With the update of our MiSeq system to 250 bp reads, genome sequencing on this system becomes even more important. But long reads and huge data output are not the only prerequisites for a great de novo assembly result.

What is missing?

Paired-end libraries that span gaps and repetitive structures can improve de novo genome assemblies tremendously. Our proprietary long jumping distance libraries (LJDs) are perfectly suited for scaffolding on Illumina sequencing devices. In contrast to other paired-end libraries (such as the Illumina mate pair library), our LJD library preparation involves an adaptor-guided ligation of the genomic fragments. This different preparation protocol offers the following advantages:

  • No hybrid reads – a unique sequence identifies the crossover points
  • No shotgun pairs – less than 1% of all LJD reads are shotgun paired-end reads
  • Distinct insert sizes – we prepare LJDs with 3, 8, 20 or even 40 kbp insert size
  • Span large repeats – large and complex repeats up to 40 kbp can be resolved

Mapped reads: All reads from a 3 kbp LJD library (grey) are aligned to a reference sequence. Two LJD read pairs are highlighted (blue + black) and their measured insert size is 3107 bp and 3002 bp respectively.
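
The measured insert size of a mapped pair is simply the distance on the reference between the outer ends of its two reads. The following minimal sketch shows how such a measurement, and a simple check against the nominal library size, could look. The coordinates, the 20% tolerance and the function names are hypothetical and chosen only so that the first two pairs reproduce the 3107 bp and 3002 bp values from the caption; they are not taken from our actual analysis pipeline.

```python
def insert_size(left_start: int, right_end: int) -> int:
    """Span of a read pair on the reference (1-based, inclusive coordinates)."""
    return right_end - left_start + 1


def is_jumping_pair(size: int, nominal: int = 3000, tolerance: float = 0.2) -> bool:
    """True if the span lies within +/- tolerance of the nominal jumping distance.

    The 20% tolerance is an illustrative assumption, not a documented threshold.
    """
    return abs(size - nominal) <= tolerance * nominal


# Hypothetical (leftmost start, rightmost end) coordinates of three read pairs.
pairs = [(10_000, 13_106), (25_500, 28_501), (40_000, 40_349)]

sizes = [insert_size(start, end) for start, end in pairs]
labels = ["jumping" if is_jumping_pair(s) else "shotgun-like" for s in sizes]

for size, label in zip(sizes, labels):
    print(f"{size} bp -> {label}")
```

In this toy data set the third pair spans only 350 bp and would be counted as a shotgun-like pair rather than a true 3 kbp jump.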

 

Why should I combine MiSeq long reads and LJDs?

The new features of the MiSeq (250 bp reads; data output up to 8 Gbp) enable the combined and cost-efficient approach of shotgun and LJD libraries in one run. The MiSeq output is sufficient to sequence several bacterial genomes or single fungal genomes (up to 60 Mbp) with appropriate coverage.

  • Longer reads – more sequence information to correctly map the reads onto your contigs
  • Short delivery time – due to the shorter run time compared to the HiSeq 2000
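
To get a feeling for what “appropriate coverage” means with these numbers, the run output can be divided by the genome size. The sketch below uses the 8 Gbp output and the 60 Mbp fungal genome mentioned above; the 5 Mbp bacterial genome size and the 100-fold target coverage are assumptions made purely for illustration, and the calculation ignores how the run is split between shotgun and LJD libraries.

```python
import math


def fold_coverage(output_bp: float, genome_bp: float) -> float:
    """Average fold coverage if the whole run is spent on one genome."""
    return output_bp / genome_bp


def genomes_per_run(output_bp: float, genome_bp: float, target_coverage: float) -> int:
    """Number of genomes of the given size that fit into one run at the target coverage."""
    return math.floor(output_bp / (genome_bp * target_coverage))


MISEQ_OUTPUT_BP = 8e9      # "up to 8 Gbp" per run, as quoted above
FUNGAL_GENOME_BP = 60e6    # upper fungal genome size quoted above
BACTERIAL_GENOME_BP = 5e6  # assumption: a typical bacterial genome
TARGET_COVERAGE = 100      # assumption: desired fold coverage per genome

fungal_cov = fold_coverage(MISEQ_OUTPUT_BP, FUNGAL_GENOME_BP)
bacteria_per_run = genomes_per_run(MISEQ_OUTPUT_BP, BACTERIAL_GENOME_BP, TARGET_COVERAGE)

print(f"60 Mbp fungal genome: about {fungal_cov:.0f}x")
print(f"5 Mbp bacterial genomes at {TARGET_COVERAGE}x: {bacteria_per_run} per run")
```

Under these assumptions a single run yields about 133-fold coverage of a 60 Mbp genome, or enough data for 16 bacterial genomes at 100-fold coverage each.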

Read more about our long jumping distance libraries on our website

Whole Genome Sequencing of Fukushima’s People

At the end of August, Mr. Hosono, the Japanese minister for the environment, announced that the ministry aims to perform whole genome sequencing (WGS) of people who live around the disabled “Fukushima Daiichi Nuclear Power Station”. He said that the WGS project will not be able to relieve concerns immediately, but that it will make an important provision for the future. According to Mr. Hosono, the main target group for WGS will be children.

These genomic analyses face many problems, including the ethics of research on human subjects, maintaining confidentiality, and disclosing information only as needed. This story reminds me once more that NGS technologies are starting to have a social impact.

NGS goes to the Big Apple

Six floors for next generation sequencing in the middle of Manhattan: this is going to be exciting. Listen to the interview from Bio-IT World with Nancy Kelly, founding executive director of the New York Genome Center.

Genomics – A Curse Or A Blessing?

Is sequencing your personal genome a curse or a blessing? A recent radio broadcast from NPR News summarises two scientists’ opinions and their practical experiences with genome sequencing (listen to the radio broadcast below).

World-renowned scientist James Watson, of the famous Watson & Crick team that discovered the structure of DNA, recently had his own genome sequenced. The result didn’t earn him another Nobel Prize, but he found out that he belongs to the small group of people whose bodies are more sensitive to β-blockers. Watson finally understood why it had been so difficult for him to balance his blood pressure. Sequencing his genome definitely paid off for him, since he could significantly reduce his weekly β-blocker intake. But despite this “health-changing” experience, he forbade his colleagues to reveal any information about his likelihood of developing Alzheimer’s disease. He said, “since you cannot cure it why would you like to know about it?”

The second scientist to share his experience of having his genome sequenced is Stanford geneticist Michael Snyder. His genome sequence revealed that he was at high risk of developing Type 2 diabetes. A few months after this discovery, Snyder got the disease that his genome had anticipated. Was this a coincidence or fate? For Snyder, knowledge about his genome gave him a head start against the disease. By completely transforming his diet and taking up various sports, he overcame his Type 2 diabetes.

From my perspective, both examples show that knowledge about our genetic information can be useful in preventing and treating diseases. It boils down to how much experience exists to reliably interpret the data.