The human genome is, at long last, complete

March 31, 2022

Results of automated DNA sequencing, displayed on the screen of a tablet computer. (Turtle Rock Scientific/Science Photo Library)

When scientists declared the Human Genome Project complete two decades ago, their announcement was a tad premature. A milestone achievement had certainly been reached, with researchers around the world gaining access to the DNA sequence of most protein-coding genes in the human genome. But even after 20 years of upgrades, eight percent of our genome still remained unsequenced and unstudied. Derided by some as "junk DNA" with no clear function, roughly 151 million base pairs of sequence data scattered throughout the genome were still a black box.

Now, a large international team led by Adam Phillippy at the National Institutes of Health has revealed the final eight percent of the human genome in a paper published in Science.

These long-missing pieces of our genome contain more than mere junk. Within the new data are mysterious pockets of noncoding DNA that do not make protein, but still play crucial roles in many cellular functions and may lie at the heart of conditions in which cell division runs amok, such as cancer.

"You would think that, with 92 percent of the genome completed long ago, another eight percent wouldn't contribute much," says Rockefeller's Erich D. Jarvis, a coauthor on the study who helped develop a number of techniques central to unlocking the final pieces of the human genome. "But from that missing eight percent, we're now gaining an entirely new understanding of how cells divide, allowing us to study a number of diseases we had not been able to get at before."

On the shoulders of the HGP

The Human Genome Project essentially handed us the keys to euchromatin, the majority of the human genome, which is rich in genes, loosely packaged, and busy making RNA that will later be translated into protein. Left untouched, however, was a labyrinth of tightly wound, repetitive heterochromatin--a smaller portion of the genome, which does not produce protein.

Scientists had good reasons for initially deprioritizing heterochromatin. The euchromatic regions contained more genes and were simpler to sequence. Just as a puzzle with distinct pieces is easier to put together than a puzzle composed of similar ones, the genomics tools of the day found euchromatic DNA easier to parse than its repetitive, heterochromatic cousin.

As a result, geneticists were left with a sizable hole in their knowledge of what drives some basic cellular functions. The heterochromatic sequences behind centromeres, which lie at the cruxes of chromosomes and conduct cell division, were all marked with long runs of N for "unknown base" in the human reference genome. The sequences of the short arms of chromosomes 13, 14, 15, 21, and 22 were likewise omitted.

"Not even all of the euchromatic genome was sequenced properly," Jarvis adds. "Errors, such as false duplications, needed to be fixed."

Then, about ten years ago, scientists began developing new techniques for producing longer sequence reads that filled in gaps in the genomes of humans and other species. One such initiative is the Vertebrate Genomes Project, helmed by Jarvis, which recently produced the first near error-free and near complete reference genomes for 25 animals. "That study was part of an international effort to develop new tools that produce the highest-quality gene assemblies," he says.
"Compared to the methods that were used twenty years ago, modern genomics has high-fidelity long reads that are 99.9 percent accurate, better genome assembly tools, and more powerful algorithms that are better at distinguishing similar-looking puzzle pieces from one another." With updated tools and renewed resolve, Jarvis and other scientists were able to help finish what the Human Genome Project started and describe, at long last, a truly complete human genome--its euchromatic regions revised, and its heterochromatic regions on full display. "It's a big deal," Jarvis says. "Every single base pair of a human genome is now complete." Meeting Merfin The flagship Science study was led by the Telomere-to-Telomere (T2T) Consortium, a group of researchers at various academic institutions and NIH. The Jarvis lab's contribution, published in Nature Methods, involved providing tools to help T2T refine messy genome sequences to produce error-free sequences. One of these tools is Merfin, which they used to clean up some of the most difficult sequences in the human genome. "Genomes that we generate in the lab can have many errors in them," says Giulio Formenti, a postdoc in Jarvis' lab who developed Merfin. "If even just one or a few base pairs are wrong, that can have big consequences for the overall accuracy of the genomic sequence." Merfin makes it possible to test the accuracy of a sequence, sensing code that may be out of place and automatically correcting mistakes. Because the technologies that generate modern sequences are more accurate, Merfin is reserved for only the trickiest cases. "Stretches of identical base pairs, such as AAA, are hard for existing technology to assess," Formenti says. "There are often errors in those sequences, even now. Merfin corrects them." 
Jarvis and Formenti hope that their contribution will not only help tie a bow on the Human Genome Project, but also inform research into diseases linked to the heterochromatic genome--chief among them cancer, which is associated with centromere abnormalities. Cancer cells divide wildly when certain heterochromatic centromere genes are overexpressed, and a complete understanding of the centromere genome may open the door to novel therapies.

"We are finally digging into what we once called junk DNA, because we could not understand it or look at it accurately," Formenti says. "We now know that many diseases are linked to structural repeats in the centromere and, now that these sequences are no longer missing from the human reference genome, we can begin to map the origins of these diseases."

Erich D. Jarvis
Professor
Investigator, Howard Hughes Medical Institute
Laboratory of Neurogenetics of Language

Related publications

Science
The complete sequence of a human genome
Sergey Nurk et al.

Nature Methods
Merfin: improved variant filtering and polishing via k-mer validation
Giulio Formenti, Arang Rhie, Brian P. Walenz, Francoise Thibaud-Nissen, Kishwar Shafin, Sergey Koren, Eugene W. Myers, Erich D. Jarvis, Adam M. Phillippy