After the human, chimpanzee and orangutan genomes, another great ape genome was reported last week in Nature as sequenced and assembled: the gorilla genome.
Gorillas, which are in immediate danger of extinction, are humans’ closest living relatives after chimpanzees, followed by orangutans. The gorilla genome is therefore the missing piece of the puzzle for studying the origin and evolution of humans in much more detail.
The comparison did indeed have some surprises in store: it revealed that gorillas and humans are more closely related to each other than previously assumed. The two species separated approx. 10 million years ago. Approx. 4 million years after that, chimpanzees separated from humans.
To obtain a genome assembly with contigs and scaffolds long enough to allow those comparisons, the international research team not only sequenced the genome with Illumina short-read technology (167 Gbp) but also included 5.4 Gbp of long-read Sanger sequencing data. Based on a genome size of approx. 3 Gbp, the Sanger reads correspond to a coverage of 1.8-fold. The initial assembly was produced with a de novo strategy, but in later phases the researchers made use of the human reference genome to improve the assembly.
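The coverage figure above is simply the number of sequenced bases divided by the genome size. Here is a minimal sketch of that arithmetic, using the numbers quoted in this post. The function name `fold_coverage` is my own choice for illustration, and the Illumina coverage is not a number from the paper, just the same calculation applied to the 167 Gbp.

```python
def fold_coverage(sequenced_bp: float, genome_size_bp: float) -> float:
    """Average depth of coverage: total sequenced bases / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return sequenced_bp / genome_size_bp


# Figures quoted in the post (the genome size is an approximation).
GENOME_SIZE_BP = 3e9    # approx. 3 Gbp
SANGER_BP = 5.4e9       # 5.4 Gbp of Sanger reads
ILLUMINA_BP = 167e9     # 167 Gbp of Illumina short reads

print(f"Sanger:   {fold_coverage(SANGER_BP, GENOME_SIZE_BP):.1f}-fold")
print(f"Illumina: {fold_coverage(ILLUMINA_BP, GENOME_SIZE_BP):.1f}-fold")
# Sanger:   1.8-fold
# Illumina: 55.7-fold
```

So the Sanger data alone are far too thin for an assembly, but on top of the deep short-read coverage they add long-range information.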
For me personally, the assembly and scaffolding approach described is really impressive. A variety of software tools were used to integrate sequence data and paired-end information from different technologies, as well as the similarity to the human genome, to make the best use of all the information available. Have a look at it!