Tag Archives: Transcriptome sequencing

Transcriptome assemblers put to the test

Next Generation Sequencing produces millions and billions of reads – and the interpretation of these reads relies on bioinformatic tools.

Especially for de novo assemblies of genomes or transcriptomes, the results can vary considerably depending on the quality of the assembly.

In a recent publication, Shorash Amin and his co-workers sequenced the transcriptome of the non-model gastropod Nerita melanotragus with the Ion PGM. Afterwards, they used different software tools and compared the quality of the resulting transcriptome assemblies (Amin et al.).

Oases, Trinity, Velvet and Geneious Pro were the four de novo transcriptome assemblers used in this study. The assemblers were compared on parameters such as contig length, N50 statistics, and BLAST and annotation success.
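For readers unfamiliar with the N50 statistic mentioned above, here is a minimal Python sketch of how it is commonly computed: sort the contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the total assembly length. The function name `n50` and the contig lengths are hypothetical illustrations, not data or code from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (in bp)."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    # Walk through the contigs from longest to shortest.
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that brings the running sum to at least
        # half of the total assembly length defines the N50.
        if running >= half_total:
            return length


# Hypothetical contig lengths (bp) for two assemblies; NOT data from the study.
assembly_a = [1700, 1200, 900, 400, 300, 200]
assembly_b = [800, 700, 600, 500, 400, 300]

for name, lengths in [("A", assembly_a), ("B", assembly_b)]:
    print(name, "longest:", max(lengths), "N50:", n50(lengths))
```

A higher N50 means that more of the assembly sits in long contigs, but it says nothing about correctness, which is presumably why the comparison also looked at BLAST and annotation success.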

The longest contig (1700 bp) was produced by the Oases assembler, and overall Trinity and Oases delivered much better results than the de novo assembly of Ion PGM reads with Velvet or Geneious Pro.

Furthermore, mapping to a reference genome showed that Ion PGM transcriptome sequencing and subsequent de novo assembly with either Trinity or Oases generate reliable and accurate results.

Read the complete publication here.

100,000, 40,000, 25,000, 19,000 – the shrinking human genome…

Many of you surely remember old textbooks in which the total number of genes in the human genome was estimated at around 40,000 to 100,000. After the human genome was sequenced, this number shrank to 26,000 – 40,000 genes. The 19th GENCODE release further reduced it to 20,318 protein-coding genes. And that is not all: a recent study suggests that the actual number of protein-coding genes in humans is around 19,000.

This astonishing result was obtained by analyzing data from seven large mass spectrometry (MS)-based proteomics studies covering more than 50 human tissues.

But the shrinking number of genes is not the only remarkable result – find below the most important findings from this study, as described in a recent ScienceDaily blog post:

  • Close to 12,000 human genes could be unambiguously identified
  • Despite high coverage from seven analyses, 40% of the peptides from the human gene set could not be detected; possible reasons:
    • Thousands of genes annotated in the human genome did not appear in the proteomics analysis.
    • Apparently 1,700 genes that were previously thought to produce proteins most certainly don’t
  • Another hypothesis is that more than 90% of human genes produce proteins originating in metazoans or multicellular organisms living hundreds of millions of years ago
  • The difference between humans and primates at the gene and protein level is very small
  • “The number of new genes that separate humans from mice may even be fewer than 10”
  • Physiological and developmental differences between primates are more likely caused by gene regulation than by differences in the basic functions of proteins in question

Alfonso Valencia, the main researcher behind this project, states that “the human genome is best annotated, but we still believe that 1,700 genes may have to be re-annotated”.

According to Alfonso Valencia, these results may redefine the entire mapping of the human genome.

The Common Marmoset as a Model Organism for the Study of Drug Metabolism

Several non-human primates, including Macaca mulatta and Macaca fascicularis, are well known as experimental animals in the fields of neuroscience, stem cell research, drug toxicology, and other applications. The common marmoset (Callithrix jacchus) is also a non-human primate and is well suited as an experimental animal because of its small size and high fecundity.

To develop a drug metabolism model, our collaborators and Eurofins Genomics (2014) performed transcriptome analysis of the common marmoset, using long-read technology (Roche GS FLX+) and short-read sequencing (Illumina HiSeq 2000) in parallel. This parallel NGS approach enabled both the identification and the quantitative analysis of transcripts, giving insight into gene expression during drug metabolism. Finally, we obtained rich information about genes involved in drug metabolism, including 18 cytochrome P450- and 4 flavin-containing monooxygenase (FMO)-like genes, and their tissue-specific expression patterns.

The results of this study are the foundation for future studies, which are not limited to drug metabolism and pharmacokinetics.

Transcriptome Sequencing In Translational Oncology Research

By using novel microfluidic tools, a team of researchers at Indiana University School of Medicine uncovered an unexpected ability of cancer cells to navigate and exit microscopic mazes along the shortest path. To explain this behavior, they propose a novel mechanism that guides cancer cell migration.

Find out how they have harnessed RNA-seq on tumor tissues to reveal efficacious drug targets and implement rational drug combinations in triple-negative breast cancer. Further, ongoing work on how RNA-seq is being used for biomarker discovery in retrospective cancer clinical trials will also be presented.

1,000 Fish Transcriptome Project

The China National Genebank (CNGB) announces the official launch of the 1,000 Fish Transcriptome Project (Fish T1K). It marks the beginning of an amazing transcriptome study designed to unveil the mysteries of the origin, evolution, and diversification of the largest group of vertebrates.

The findings could enable scientists to pursue innovative approaches and strategies to address challenges in fish breeding, disease control and prevention, seafood safety and biodiversity conservation.

Read more at BGI news

De Novo Transcriptome of a Model Organism to Study Tissue Regeneration

Newts have an extraordinary ability to regenerate tissues. For example, they can re-grow fully functional limbs after amputation. In addition, regeneration of parts of the central nervous system, the heart, and the lens has been characterized, making them an excellent model organism for studying regenerative processes. However, because of their enormous genome size (10 times that of the human genome), the molecular mechanisms behind this amazing regenerative process are largely unknown.

A research group at the Max Planck Institute recently published a de novo assembly of the transcriptome of the urodelian amphibian Notophthalmus viridescens (Looso M. et al.). The researchers combined 454, Illumina, and Sanger sequencing data from both normalized and non-normalized cDNA libraries. The resulting transcriptome comprises over 120,000 non-redundant transcripts. A homology search using BLAST led to the annotation of 38,000 transcripts. Importantly, they found 800 transcripts, whose protein-coding potential was validated by mass spectrometry, that show no similarity to any known transcripts or show similarity only to urodele-specific EST sequences. Some of these transcripts belong to novel protein families.

It is an interesting hypothesis that some of those newt-specific proteins may provide mechanistic insights into regeneration processes unique to these animals. Their work will definitely be an important resource for subsequent studies in tissue regeneration and may benefit future research in regenerative medicine.

NGS Favourites – Launch

Dear Blog readers,
today I am really delighted to announce the launch of the NGS Favourites.

NGS Favourites are the straightforward solution for your Next Generation Sequencing project. They are based on the wealth of knowledge that we have accumulated from over 5 years of servicing the NGS community and represent optimised packages for common NGS applications.

The NGS Favourites stand out due to:

  • Project-oriented solutions
  • Economic costs
  • Easy ordering

The NGS Favourites are available for different fields of applications:

  • Genome Sequencing Favourites – using shotgun (SG) libraries only or a combination of SG and LPE libraries
  • Transcriptome Sequencing Favourites – receive comprehensive data you can really build on
  • Exome Sequencing Favourites – sequence 6 human Exomes with the Illumina TruSeq Kit
  • Library Service Favourites – have your libraries for GS FLX sequencing prepared by us

Find your suitable Favourite and explore the easy way of sequencing with us as your professional project partner.

Winter Special: Transcriptome sequencing on Illumina HiSeq 2000

As an NGS Winter Special, we offer you a selection from our well-proven portfolio for transcriptome sequencing. De novo sequencing of eukaryotic transcriptomes will give you deep insight into the transcriptome without the need for a reference sequence. With the expression profiling package, you will gain high-resolution information about the expression levels of your bacterial genes.


De novo sequencing of 4 or 8 eukaryotic transcriptomes using Illumina HiSeq 2000

  • 4x or 8x Illumina cDNA library (mRNA-Seq protocol)
  • Sequencing in 1 channel with 2x 100 bp paired end module
  • 4x or 8x de novo assembly of data
  • Special price: starting from 1,490 € / RNA sample

Expression profiling of 12 bacterial transcriptomes on Illumina HiSeq 2000

  • 12x Illumina cDNA library with rRNA depletion
  • Sequencing in 1 channel with 1x 100 bp single read module
  • 12x mapping of reads against reference genome, read counting & SNP calling
  • Special price: 990 € / RNA sample


All services include library generation, data analysis and data delivery. The Winter Special is valid until 31.12.2011.

Read more about our NGS Winter Special >