PacBio launches new chemistry and software

In a press release, Pacific Biosciences announced the latest enhancement for the PacBio RS II single-molecule DNA sequencer. The new polymerase 6 and chemistry 4 (P6-C4) release, in combination with improved software, increases the performance and output of the platform by 45%. The average read length is now 10,000 to 15,000 bases, with the longest reads reaching up to 40,000 bases. Depending on the nature of the DNA, a single SMRT cell will deliver 500 million to 1 billion bases.

The new chemistry will replace the current P5-C3 chemistry and is recommended for all SMRT sequencing applications.

This new release also includes improvements to the SMRT Analysis software suite for long amplicon analysis and the Iso-Seq™ method. Together with chemistry enhancements, these advances boost accuracy, speed up analysis, and support sequencing of multiplexed amplicons of different sizes.

Do you want to share your biggest secret?

Should we all get our genomes sequenced? And share the information? Just today I read two articles in GenomeWeb about human genome sequencing. In my opinion, they take opposite views on sharing information from human genomes.

The first article is about the 23andMe project. Two different groups of people reported that ticking the “check for close relatives” box led to a real crisis in their families. In one case the parents divorced because the close relatives feature showed that the husband already had a child with another woman (from before this marriage). In the other case a girl found out that she has a brother whom her mother had given up for adoption.

So for me this is a clear indicator that simply sharing the genome information might really cause more problems than it can solve.

George Church asks for exactly the opposite. From his point of view, public access to as many genomes (human and non-human) as possible is a prerequisite for eradicating diseases, creating unlimited energy sources and so on.

I think I could partially agree with that if we are talking about bacterial or plant genomes. But I do not think we are ready for wide sharing of human genome information.

What also became clear to me is that we have not come much further than we were 2 years ago (Genomics – A Curse Or A Blessing?).

Genome sequencing identified Jack the Ripper

The murders by Jack the Ripper are very likely the best-known crime series in the world. The London police had six key suspects for the murders, and one of them could now be identified as the killer (MailOnline).

The piece of evidence used to identify the murderer was a shawl found by one of the victims, which contained DNA from the victim as well as from the suspect. Using a whole genome sequencing approach, Dr. Louhelainen and his group extracted the 126-year-old DNA and compared it with that of descendants of the suspect. Read the complete article at DailyMail Online.

Are you ready to have your genome sequenced?

Last month we asked if you would be interested in having your genome sequenced. The majority said “YES”, provided the costs were lower.

More than 20% answered that their genome has already been sequenced. Personally, I would be very interested to know what they did with the data output.

If you are one of those who voted “I already have”, please leave a comment on why you decided to have your genome sequenced.

Whose Genome Has Been Sequenced? Belgica antarctica

Extreme conditions require extreme actions, and the midge Belgica antarctica has taken them. The midge lives exclusively in the Antarctic and, in order to survive, has shrunk its genome to the smallest possible size. As of today, this is the smallest insect genome that has been sequenced.

Kelley et al. have now sequenced the genome of Belgica antarctica with the aim of learning more about how insects in general can adapt to the most extreme conditions.

What was sequenced?

Two fourth-instar larvae (Belgica antarctica) collected near Palmer Station, Antarctica.

Sequencing strategy: Whole genome sequencing & RNA-sequencing

  1. Libraries & Sequencing: one channel of 2 x 100 bp on an Illumina HiSeq 2000 (shotgun library, 400 bp insert) and one SMRT cell of a 10 kb fragment library on a PacBio RS II (P4 DNA polymerase)
  2. Data output: 92 M paired-end reads from the Illumina shotgun sequencing, which were assembled into 5,422 contigs. Using the paired-end RNA-Seq data, the number of contigs was reduced to 5,064. Genome coverage with Illumina sequencing: ~100x.
  3. Results: The total genome size is ~99 Mbp.
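As a rough plausibility check of the ~100x figure, average coverage can be estimated as total sequenced bases divided by genome size. The following is only a sketch of my own and not taken from the publication: it assumes that the 92 M figure counts individual 100 bp reads (if it counted read pairs, the estimate would double), and the function name is hypothetical.

```python
def estimate_coverage(num_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average depth of coverage = total sequenced bases / genome size."""
    return num_reads * read_length_bp / genome_size_bp


# Values quoted in this post. Treating 92 M as individual reads
# (rather than read pairs) is an assumption.
illumina_reads = 92_000_000
read_length_bp = 100
genome_size_bp = 99_000_000

coverage = estimate_coverage(illumina_reads, read_length_bp, genome_size_bp)
print(f"Estimated Illumina coverage: {coverage:.0f}x")
```

Under this assumption the estimate comes out at roughly 93x, which is in line with the reported ~100x.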

For the PacBio sequencing a second larva was used. However, due to the low input of genomic DNA, the PacBio data yielded only a modest improvement in the assembly. This underlines the need for a long-read sequencing technology that works with low amounts of input DNA.

The de novo sequencing of the midge Belgica antarctica revealed that the small genome size is achieved by a reduction in repeats, transposable elements (TEs) and intron size.

Read the complete publication here.

A major upgrade of SAMTools: CRAM format to reduce NGS data load

SAMTools, one of the most popular NGS sequence analysis tools, has recently been upgraded by computer scientists at the Wellcome Trust Sanger Institute. SAMTools is a set of utilities for manipulating alignments in the SAM/BAM format. SAM is the acronym for Sequence Alignment/Map format, whereas BAM is simply the binary form of SAM. SAM can be seen as the worldwide standard for storing large nucleotide sequence alignments.

SAMTools 1.0, the revised version of the free program suite, now gives researchers improved handling of their sequencing data. In addition to the existing SAM and BAM file formats, SAMTools now supports the new CRAM format. Basically, CRAM files are alignment files, just like BAM files, except that their size is reduced by 10-30%. Even greater compression, up to 100-fold, can be achieved in the “lossy” mode, which still preserves the most important information. The storage savings that CRAM offers were achieved by incorporating data compression techniques developed jointly by the Sanger Institute and the EMBL-European Bioinformatics Institute.
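To get a feeling for what these numbers could mean for a single file, here is a small back-of-the-envelope sketch. It only applies the figures quoted above (a 10-30% reduction, and up to 100-fold in lossy mode); the 100 GB file size and the function names are hypothetical, and I am assuming that the 100-fold figure is measured against the BAM file size. Real savings will depend on the data.

```python
from typing import Tuple


def cram_size_range_gb(bam_size_gb: float,
                       min_reduction: float = 0.10,
                       max_reduction: float = 0.30) -> Tuple[float, float]:
    """Return (smallest, largest) expected CRAM size for a given BAM size."""
    smallest = bam_size_gb * (1 - max_reduction)
    largest = bam_size_gb * (1 - min_reduction)
    return smallest, largest


def lossy_cram_size_gb(bam_size_gb: float, fold: float = 100.0) -> float:
    """Best-case size in lossy mode, using the 'up to 100-fold' figure."""
    return bam_size_gb / fold


bam_size_gb = 100.0  # hypothetical BAM file size, not from the article
low, high = cram_size_range_gb(bam_size_gb)
print(f"CRAM: {low:.0f} to {high:.0f} GB")
print(f"Lossy CRAM (best case): {lossy_cram_size_gb(bam_size_gb):.0f} GB")
```

For the hypothetical 100 GB BAM file this gives 70 to 90 GB as CRAM, and about 1 GB in the best lossy case.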

“This major rebuild of SAMTools reflects our commitment to supporting the global use of sequencing data,” says Dr. Richard Durbin, Head of Computational Genomics at the Sanger Institute. “Genome science worldwide relies on fast and efficient data analysis and storage, and SAMTools 1.0 fulfills this need by supporting new sequencing and analysis technologies.” Dr. John Marshall from the Sanger Institute is highly optimistic that widespread uptake of the new format will lead to lower data storage costs on a global scale (complete article).

I am curious how the new format is going to be adopted by the genomics community. By the way, did you know that SAMTools has been downloaded more than 225,000 times?

Quality Before Quantity?

The “$1,000 genome” is, to my knowledge, the buzzword everyone knows when thinking about Next Generation Sequencing.

And quite often I ask myself: What will be the future of NGS? Whole genome sequencing of everyone and everything?

I am confident that this is part of Illumina’s strategy for their new HiSeq X Ten instruments, at least for humans.

On the other hand, all that data still needs to be analysed, and an interview with Lex Nederbragt highlights that data analysis is still a bottleneck. The latest report from Markets&Markets on whole exome sequencing also predicts strong growth in targeted sequencing. They estimate that the whole exome sequencing market will grow from $326.6 million in 2013 to $884.1 million by 2018.
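To put that forecast into perspective, the two numbers can be converted into an implied annual growth rate. This is my own calculation from the figures quoted above, not a number from the report, and it assumes steady compound growth over the five years from 2013 to 2018.

```python
def compound_annual_growth_rate(start_value: float, end_value: float, years: int) -> float:
    """CAGR = (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1


market_2013_musd = 326.6  # million USD, as quoted above
market_2018_musd = 884.1  # million USD, as quoted above

cagr = compound_annual_growth_rate(market_2013_musd, market_2018_musd, 2018 - 2013)
print(f"Implied annual growth: {cagr:.1%}")
```

Under this assumption the forecast corresponds to growth of roughly 22% per year.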

So will quality, such as sequencing distinct regions, outcompete the $1,000 genome? What are your thoughts on that?

Large genome sequencing studies in the USA

The launch of the Illumina HiSeq X Ten has enabled two people of great influence in the United States to put their plans and great visions into practice. This year, Dr. J. Craig Venter, known for being one of the first to sequence the human genome, and Patrick Soon-Shiong, considered the world’s richest doctor, both revealed some details about their large-scale sequencing projects:

  • J. Craig Venter founded the company Human Longevity, which aims to develop treatments for cancer and age-related conditions. To uncover the mechanisms that allow people to live long and healthy lives, the company will become one of the largest DNA sequencing facilities in the world. The plan is to set up a sequencing center capable of sequencing 40,000 human genomes a year.
  • Patrick Soon-Shiong recently announced that his company NantHealth is purchasing sequencers capable of sequencing 22,000 genomes annually. The samples will come from the 22,000 patients diagnosed with cancer each year at the 34 hospitals owned by Providence Health & Services. Consequently, Soon-Shiong’s company will probably become the first to use its sequencing capacity for clinical sequencing on a large scale.

Such huge collections of sequencing data make it possible to uncover the molecular causes of a process as complex as aging, or of a disease as diverse and complex as cancer, in a general approach. Large and very valuable databases will be created, which may contribute to developing new pharmaceuticals or personalized therapies.

The future of miRNA analysis

We asked you in which technology you see the future of miRNA analysis.
Here are the votes of the 102 participants:

[Figure: results of the miRNA analysis poll]

Unexpected Heroes

Several mutations are known to be linked to childhood diseases. This knowledge is already being used, e.g. to analyze the genomes of sick newborns for known diseases, or for prenatal diagnostics. However, a person carrying such a mutation does not necessarily get ill.

Some individuals carry a mutation that should have caused a severe disease in their childhood, but some as yet unknown factors have protected them from getting ill. Even though such people may be very rare, studying them may help us understand more about the diseases, or even find new treatments.

Researchers of the “Resilience Project” are now looking for such individuals, whom they call “unexpected heroes”: adults who are “resilient to a certain rare disease despite carrying genetic mutations that would indicate onset of the disease in childhood.” In order to find those rare individuals, they are asking for volunteers to donate DNA samples for the project. Since they expect only 1 in 20,000 individuals to be such an “unexpected hero”, they need to analyze the genomes of more than 100,000 individuals. Participants can register online and will receive a test kit by mail. In return, each volunteer gets a report indicating whether any of the analyzed mutations have been found in his or her genome.
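The numbers above explain why such a large cohort is needed. As a simple sketch of my own (not the project's actual study design), one can estimate how many “unexpected heroes” to expect in a cohort and how likely it is to find at least one. It assumes that every participant independently has the same 1 in 20,000 chance; the function names are hypothetical.

```python
def expected_heroes(cohort_size: int, frequency: float) -> float:
    """Expected number of resilient individuals in the cohort."""
    return cohort_size * frequency


def prob_at_least_one(cohort_size: int, frequency: float) -> float:
    """Chance of at least one resilient individual, assuming independence."""
    return 1 - (1 - frequency) ** cohort_size


frequency = 1 / 20_000  # "1 in 20,000", as quoted above
cohort_size = 100_000   # the lower bound mentioned above

print(f"Expected heroes: {expected_heroes(cohort_size, frequency):.1f}")
print(f"Chance of at least one: {prob_at_least_one(cohort_size, frequency):.1%}")
```

With 100,000 participants one would expect about five such individuals, and the chance of finding at least one is above 99%. With only 20,000 participants, the chance of finding even one would drop to roughly 63%.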

The researchers hope to identify genes that can “buffer” the effects of the mutations, as well as environmental factors which help people carrying the mutations to stay healthy. The goal is to find new treatments, or even prevent people from getting ill at all.