The Vertebrate Genome Project has set itself the goal of sequencing the genomes of all of the more than 70,000 vertebrate species completely and with as few errors as possible within the next ten years. Thanks to numerous technological advances, this goal is within reach. In a series of publications, the international research consortium is now presenting its methodological foundations and, at the same time, publishing the first 16 reference genomes. The high-quality genome data promise new insights and research opportunities for biology and medicine and can also help conserve endangered animal species.
The genome of a living being comprises all of its genetic information, encoded in the sequence of the bases in the DNA. Some sections code for proteins, others have regulatory tasks, and the function of still others remains unknown. By deciphering the genetic code of as many species as possible, researchers hope to gain new insights into the architecture of life. Knowing the base sequence does not necessarily mean understanding its function, but reliably sequenced genomes provide an important basis for further research.
Large international research project
An international research consortium led by Erich Jarvis from Rockefeller University in New York has set itself the goal of providing such genome data for all of the world's more than 70,000 vertebrate species. In a series of publications, the researchers are now presenting the methods and findings of the Vertebrate Genome Project to date. On the one hand, they describe which sequencing techniques are particularly suitable for decoding the genomes as completely and error-free as possible. On the other hand, they present the first 16 reference genomes of mammals, birds, reptiles, amphibians and fish.
The researchers laid an important basis for the further progress of the project by refining existing technologies for genome sequencing. “Strictly speaking, it was not just about finding the best sequencing technology, but rather the best combination of technologies,” explains co-author Axel Meyer from the University of Konstanz. To do this, the researchers first tested various methods by sequencing the Anna’s hummingbird genome several times in different ways and comparing which method produced the most accurate results.
Combination of methods for better results
Based on these experiments, the Vertebrate Genome Project recommends combining methods that read long and short stretches of DNA. So-called “long reads” provide an overview of whole genome sections but are inaccurate in detail. To compensate for these errors, very short sections, so-called “short reads”, are also analyzed. These are very precise, but without the general overview they would provide only fragmentary insights. Further procedures and computer calculations then help to assemble the individual sections correctly into chromosomes.
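The idea of letting accurate short reads correct an error-prone long read can be illustrated with a toy example. The sketch below is not the consortium's pipeline: the function name polish, the sequences, and the assumption that each short read already comes with a known start position on the long read are all hypothetical, and only substitution errors are handled. Each position of the long read is replaced by the majority base among the short reads covering it; where no short read covers a position, the long-read base is kept.

```python
from __future__ import annotations

from collections import Counter


def polish(long_read: str, short_reads: list[tuple[int, str]]) -> str:
    """Correct substitution errors in a long read by majority vote.

    short_reads holds (start, sequence) pairs; the start positions are
    assumed to be known already (a real pipeline would compute them
    with an aligner).
    """
    # One vote counter per position of the long read.
    votes = [Counter() for _ in long_read]
    for start, seq in short_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(long_read):
                votes[pos][base] += 1

    polished = []
    for original, counter in zip(long_read, votes):
        if counter:
            # Covered position: take the most frequent short-read base.
            polished.append(counter.most_common(1)[0][0])
        else:
            # Uncovered position: keep the long-read base unchanged.
            polished.append(original)
    return "".join(polished)


# Hypothetical data: the long read has errors at index 4 and index 11.
truth = "ACGTACGTTAGC"
noisy_long_read = "ACGTTCGTTAGG"
short_reads = [(0, "ACGTAC"), (2, "GTACGT"), (3, "TACGTT")]

polished = polish(noisy_long_read, short_reads)
print(polished)
```

In this toy case the error at index 4 is corrected because short reads cover it, while the error at index 11 survives because no short read reaches that far. This mirrors the point made above: short reads are precise but fragmentary, and the long read supplies the frame they are placed on.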
After producing the Anna’s hummingbird genome as the first reference, the researchers used this combination of technologies to sequence 15 additional genomes of vertebrates from all major taxa. “Despite remaining imperfections, our reference genomes are, to the best of our knowledge, the most complete and of the highest quality available for any species sequenced to date,” the authors write.
Promising findings
The researchers are convinced that the Vertebrate Genome Project will provide the impetus and the basis for completely new questions in research. “These studies mark the beginning of a new era in genome sequencing that will accelerate over the next decade, enable genomic applications across the entire tree of life, and transform the way we scientifically deal with the living world,” says co-author Richard Durbin from the University of Cambridge.
The genomes sequenced so far have already produced numerous new and sometimes surprising findings. In the genomes of the zebra finch and the platypus, the researchers found entire chromosomes that had apparently been overlooked in previous analyses. A comparison of the marmoset and human genomes revealed that, while the two species have many genes for brain development and disease in common, marmosets also have genes whose products would be toxic to humans. This is particularly relevant because the monkeys are increasingly being used in research as model organisms for neurological diseases. “This underscores the need to consider genomic context variations when using marmosets as models in human disease research,” the researchers conclude.
Another result of the analyses so far gives hope for endangered species. In the genomes of the kakapo, a flightless parrot from New Zealand, and of the vaquita, a porpoise native to the Gulf of California of which fewer than 20 individuals still exist, the researchers discovered that the animals have accumulated far fewer harmful mutations than would have been expected given their low genetic diversity. “That means there is hope for the species to be preserved,” concludes Jarvis.
Ambitious plans
In the next steps of the large-scale research project, the researchers want to gradually sequence the genomes of all vertebrates. They make their data freely available and invite other research groups to join the project. “In order to realize such a project within ten years, we have to be able to sequence 125 genomes per week without compromising the quality,” the authors say. In view of the progress made so far and expected for the future, they consider this task realistic and promising. “We’re learning a lot more than we expected,” says Jarvis. “This work is a proof of principle for what is to come.”
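The weekly rate quoted above can be checked with a quick back-of-the-envelope calculation. The sketch below assumes the round figures used in this article (70,000 species, ten years) and 52 weeks per year; the authors' exact assumptions behind the figure of 125 are not stated here.

```python
# Round figures taken from the article; 52 weeks per year is an assumption.
SPECIES = 70_000
YEARS = 10
WEEKS_PER_YEAR = 52
QUOTED_RATE = 125  # genomes per week, as quoted by the authors

weeks = YEARS * WEEKS_PER_YEAR

# Rate needed to cover all species within the time frame.
required_per_week = SPECIES / weeks

# Number of genomes completed if the quoted rate is sustained throughout.
total_at_quoted_rate = QUOTED_RATE * weeks

print(f"Weeks available: {weeks}")
print(f"Required rate: {required_per_week:.1f} genomes per week")
print(f"Genomes at the quoted rate: {total_at_quoted_rate}")
```

Under these assumptions the required rate comes to roughly 135 genomes per week, and the quoted 125 per week would yield 65,000 genomes in ten years. The quoted figure is therefore of the right order of magnitude for the stated goal.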
Sources: Arang Rhie (National Institutes of Health, Bethesda, USA) et al., Nature, doi: 10.1038/s41586-021-03451-0; Chentao Yang (University of Copenhagen) et al., Nature, doi: 10.1038/s41586-021-03535-x