[HN Gopher] The human genome is, at long last, complete
       ___________________________________________________________________
        
       The human genome is, at long last, complete
        
       Author : marc__1
       Score  : 316 points
       Date   : 2022-04-02 15:34 UTC (7 hours ago)
        
 (HTM) web link (www.rockefeller.edu)
 (TXT) w3m dump (www.rockefeller.edu)
        
       | csours wrote:
       | Excellent, now we can watch it change!
        
       | colordrops wrote:
       | _The_ human genome, or _a_ human genome?
        
         | [deleted]
        
         | marcosdumay wrote:
         | A single one. But completely mapped.
        
           | amelius wrote:
           | Man/woman?
        
             | Jailbird wrote:
             | Almost certainly.
        
               | mmmrtl wrote:
               | and yet neither "A complete hydatidiform mole (CHM)"
        
             | tomjakubowski wrote:
             | The answer is complicated. The molar pregnancy which CHM13
             | was made from had two copies of one man's X chromosome and,
             | separately, the project sequenced another man's Y
             | chromosome.
             | 
             | https://www.science.org/content/article/most-complete-human-...
             | 
             |  _The genome's Y chromosome came from Peshkin, and the rest
             | of the DNA sequenced by the Telomere-to-Telomere (T2T)
             | Consortium comes from a so-called molar pregnancy, a
             | uterine growth that results on rare occasions when a sperm
             | enters an egg that has no chromosomes. The fertilized cell
             | can copy the sperm's 23 chromosomes, creating two identical
             | sets, and begin to replicate._
             | 
             |  _The question remains open of whether the owner of CHM13's
             | genome could be identified using public DNA sequences in
             | genealogy databases. Phillippy thinks not because CHM13's
             | genome only represents one-half of that person's DNA. Even
             | if it were possible, NHGRI officials argue it would be
             | unethical to reveal him for any reason, including to get
             | consent._
             | 
             |  _Because CHM13 has an X chromosome but no Y, the T2T
             | Consortium added Peshkin's DNA_
        
             | [deleted]
        
       | carbocation wrote:
       | We'll get another batch of "the human genome is complete"
       | articles when we start publishing graph genomes.
        
         | inciampati wrote:
         | I think those will be saying "we have just begun the sequencing
         | of the human pangenome".
         | 
         | The problem is endless. Life on earth is huge.
        
       | syngrog66 wrote:
       | if the planet shuts down today and we all melt into a hydrocarbon
       | haze, then, this will be why. project complete
        
       | cato_the_elder wrote:
       | I feel I've read the same headline quite a few times over the
       | years. Here's one from last year:
       | https://www.theatlantic.com/science/archive/2021/06/the-huma...
        
         | dogline wrote:
         | Yes, and I've had a hard time parsing what's different this
         | time than last time. Anybody?
        
           | ma2t wrote:
           | The Atlantic article is based on a preprint of one of the
           | papers now formally published in Science. They both describe
           | the same T2T-CHM13 assembly.
        
           | ece wrote:
           | There are telomeres and centromeres on the ends and middle of
           | chromosomes that have really long repeating sequences, which
           | until recently were hard to sequence. This sequenced those
           | long sequences, from a fertilized egg that apparently didn't
           | have any female DNA. Next steps are sequencing more
           | individuals. If this sequencing tech gets cheap enough,
           | individualized medicine could take a big leap forward. I hope
           | that's a good summary.
           | 
           | https://www.washingtonpost.com/science/2022/03/31/human-geno...
        
             | ma2t wrote:
             | Has no male DNA (no Y chromosome), and is completely
             | homozygous which simplified the assembly.
             | https://web.expasy.org/cellosaurus/CVCL_VU12
        
               | ece wrote:
               | The cell was female, but due to a quirk, the DNA was all
               | male? Or I'm not understanding what the WaPo article
               | said. The genome does have a Y chromosome:
               | https://www.ncbi.nlm.nih.gov/assembly/GCA_009914755.4#/def
        
               | ma2t wrote:
               | That Y chromosome was added by applying the same analysis
               | workflow to a different biological source
               | (CORIELL:NA24385, a NIST standard material used in the
               | Genome in a Bottle project). The other chromosomes are
               | all from the CHM13htert line (if you click on the
               | individual chromosomes at your link above, you can scroll
               | down to the "/isolate" feature to see what material the
               | sequence was derived from). There's a long tradition of
               | having a reference assembly be a combination of different
               | individuals. Even the "standard" GRCh38/hg38 reference
               | doesn't represent any single individual.
        
         | jyxent wrote:
         | These are both the same paper. The earlier link points to the
         | preprint. The paper has now been published in Science.
        
           | yread wrote:
           | There is still room for improvement though, parts of the Y
           | chromosome are still not there.
           | 
           | Is it gonna be GRCh39 or will they change the naming scheme
           | again?
        
             | mmmrtl wrote:
             | > [We] have decided to indefinitely postpone our next
             | coordinate-changing update (GRCh39) while we evaluate new
             | models and sequence content for the human reference
             | assembly currently in development.
             | 
             | https://www.ncbi.nlm.nih.gov/grc
             | 
             | The complete Y chromosome from HG002 was added with v2
             | (after the paper was written). Probably a patched form of
             | GRCh38 will be made using T2T sequence, but IMO it makes
             | more sense to use T2T-CHM13 as a reference with its single
             | origin instead of a weird chimera, at least until pan-
             | genome graph methods mature.
        
           | cato_the_elder wrote:
           | Touché. There's no shortage of these headlines though; here's
           | one from 2003:
           | https://www.nytimes.com/2003/04/15/science/once-again-scient...
        
             | rootusrootus wrote:
             | To be fair, they address exactly that point in the first
             | few sentences.
        
               | MathCodeLove wrote:
               | I don't think the content which addresses the title is
               | OP's point, just the fact that there have been _numerous_
               | publications with this title or something similar.
        
               | epistasis wrote:
               | I'm not sure what the suggestion is here. Scientists
               | should really work harder at their jobs and simplify the
               | real world to the point that a headline that somebody
               | absent-mindedly read a decade ago doesn't sound repetitive?
        
               | elliotec wrote:
               | I think the suggestion is that we should use the word
               | "complete" when we mean it. Presumably, unlike say a
               | software project, there is an actual state of completion
               | possible in sequencing the human genome. Why has that
               | mark been supposedly met so many times over the past
               | couple of decades, only to be called complete again a few
               | years later? When is it actually complete? Does it even
               | matter anymore?
        
               | [deleted]
        
       | bendbro wrote:
       | It is unfortunate we are still misdirecting funds to fruitless
       | endeavors like genetics. As we know, genetics has little
       | influence on your individual biology or behavior, as race is a
       | social construct and each human shares 99.9% of their DNA with
       | each other. Further, hegemonic tools of assigning assumed traits
       | to people, like gender and IQ, are also social constructs, so any
       | connection they have to genetics is moot. If we are truly
       | interested in having a diverse, equitable understanding of
       | people, we should instead invest in efforts that actually seek to
       | understand them as people: therapy, rehabilitation, and
       | decolonization work.
       | 
       | https://www.discovermagazine.com/planet-earth/race-is-real-b...
       | 
       | https://en.m.wikipedia.org/wiki/Race_and_genetics#Race_and_h...
       | 
       | https://www.independent.co.uk/news/science/iq-tests-are-fund...
        
         | ineedasername wrote:
         | If you do a few basic searches on the applications of this
         | knowledge then you will see that the vast majority of research
         | & benefits that have built off of it have nothing to do with
         | anything you mentioned here.
         | 
         | Information about the benefits of genomic research is
         | trivially easy to find. To get you started in general, check out
         | [1] below. For one of the most prominent examples, the way it
         | fundamentally transformed cancer research, check out [2].
         | 
         | [1]
         | https://www.google.com/search?q=genomic+research+application...
         | 
         | [2] https://www.icr.ac.uk/news-features/latest-features/how-the-...
        
       | WallyFunk wrote:
       | > The Human Genome Project essentially handed us the keys to
       | euchromatin, the majority of the human genome, which is rich in
       | genes, loosely packaged, and busy making RNA
       | 
       | > Jarvis and Formenti hope that their contribution will not only
       | help tie a bow on the Human Genome Project, but also inform
       | research into diseases linked to the heterochromatic genome--
       | chief among them cancer
       | 
       | So the TL;DR or ELI5 version is that this completion can help
       | fight cancer. Had to wade through this article to get at _why_
       | we would want a complete sequencing. Any other non-obvious things
       | we can do after this? Like perhaps life extension or other
       | diseases we can cure?
        
         | jahewson wrote:
         | Cancerous cells can have a rapidly changing genome, with the
         | heterochromatin "glue" between genes playing an important role
         | in this, including altering how much those genes get expressed.
         | 
         | This work has sequenced the "glue" so we know what's supposed
         | to be there and can better understand what's different about
         | cancerous cells beyond the usual gene mutations.
        
         | mmmrtl wrote:
         | Literally any genetic disease (or shortcomings like aging)
         | could have missing facets hidden in these newly-complete
         | regions of the genome. It's kind of the same reason you would
         | want a complete anything, it's not ideal to go hunting for
         | knowledge while blinded to a nonrandom 8% of the territory.
        
         | ncmncm wrote:
         | "Cancer" is the gimme-funding word. It is quite doubtful that
         | this work will enable much better cancer treatment than what we
         | already had.
         | 
         | But it's science. Nobody knows what might come out, which is
         | really the point.
        
       | mint2 wrote:
       | I'm bothered by the description of the history of "junk" DNA.
       | Going by this article, DNA researchers labeled it junk just
       | because they couldn't analyze it well, prioritized the easier
       | 92%, and thus didn't understand it. Calling it junk just seems
       | like trying to compensate for not understanding it, like "I don't
       | understand it but that's fine because it's junk anyway."
       | 
       | And the scientist quote seems so wrong. If you're missing almost
       | 10% of something, and that ~10% is not like the other 90%, then
       | it seems like a very bad assumption that it doesn't show a lot
       | of important features.
       | 
       | The quote: "You would think that, with 92 percent of the genome
       | completed long ago, another eight percent wouldn't contribute
       | much"
        
         | [deleted]
        
         | fabian2k wrote:
         | There is a very clear difference between "junk" DNA and "non-
         | junk" DNA: the latter encodes proteins. That doesn't mean that
         | the DNA parts that don't encode proteins are junk, this is more
         | of an exaggeration or misunderstanding that is often repeated,
         | but not what scientists thought.
         | 
         | There are clearly parts of DNA that are not essential; this is
         | clear if you compare genome sizes between different organisms.
         | They can vary enormously, and not in a way correlated with any
         | complexity of the organism. There are also parts of DNA that
         | are remnants of viruses inserting many copies of their DNA,
         | which are the parts that could be considered junk. Even those
         | might have an effect simply due to their presence, but
         | essentially everything in the cell has some effects if you look
         | closely enough.
        
           | panabee wrote:
           | or the viral DNA might have an effect if demethylated.
        
         | shadowgovt wrote:
         | It would have been wise to declare it "the unknown regions" or
         | "the frontier," I think. Something to more closely indicate
         | that our gap of understanding was wider than indicated by the
         | moniker the project chose.
         | 
         | And I also agree with the bad assumption on the remaining 8%
         | not being significant when we knew that it was structurally
         | different. Less than 8% of an ELF is the header, but boy howdy
         | will that thing not run well if you cut it off.
        
           | TheJoeMan wrote:
           | Calling the "junk DNA" a "header" is closer to what it may
           | do, but still slightly different because in computer software
           | a header is still read with the same codec (bits) as the
           | data, much like DNA is usually read with the protein codec.
           | 
           | Instead, we are learning the "unknown" DNA performs
           | biological functions due to its physical nature, such as
           | physically blocking things from binding.
           | 
           | Imagine if a small section of a hard drive was so strongly
           | magnetized that it repulsed the read head - if you were
           | trying to translate it into binary it would appear to be
           | nonsense.
        
         | oofbey wrote:
         | The term "junk DNA" was coined very early in our understanding
         | of DNA. Even when we had no idea what it was for, very few
         | respectable geneticists actually believed it was "junk" - basic
         | evolutionary theory argues pretty strongly against it. But the
         | name has stuck around for far longer than it deserves to.
        
           | ma2t wrote:
           | Basic evolutionary theory may argue that most of it is "junk"
           | in the sense of being non-functional (even though some may be
           | species-specific or under selection too weak or recent to be
           | detectable). One paper that lays out this argument has title
           | with the memorable beginning "On the Immortality of
           | Television Sets."
           | https://academic.oup.com/gbe/article/5/3/578/583411
        
           | mateo1 wrote:
           | >basic evolutionary theory argues pretty strongly against it.
           | 
           | That's not true. Over millennia, leftover chunks of DNA can
           | accumulate for no good reason: duplication mistakes, viral
           | infections, etc. The term junk DNA originated from the initial
           | assumption that all noncoding DNA was useless. Evolutionary
           | theory has nothing to do with this.
        
             | fabian2k wrote:
             | There is a huge difference between "large parts of it are
             | useless" and "all of it is useless". And large parts of
             | the non-coding DNA are probably useless, unless you're
             | extremely generous with what counts as "function" when
             | examining this.
        
           | [deleted]
        
         | axg11 wrote:
         | Context: I have a PhD in genomics
         | 
         | The label "junk DNA" was one of the biggest mistakes in the
         | history of genetics. A lot of high school textbooks still
         | reference this term and it's worse than misleading.
         | 
         | In many ways, non-coding DNA is just as important as the parts
         | of the genome that code for proteins. Non-coding DNA determines
         | expression levels, genome conformation (shape), and replication
         | efficiency, among other things.
         | 
         | The term junk DNA misleads students into thinking that these
         | sections of DNA play little part in how a cell functions. Quite
         | the opposite, the "junk DNA" is responsible for orchestrating
         | the "non-junk" bits.
        
           | cookiengineer wrote:
           | Maybe you can answer this: what happened to the new bases
           | that papers described around 2011, where new bases were
           | added to the encoding scheme, using 8 instead of 4/6? [1]
           | 
           | Isn't DNA represented by TCGA then not "junk" either way if
           | they're using the wrong classifications for the bases?
           | 
           | [1] https://www.science.org/doi/10.1126/science.1210597
        
           | alfiedotwtf wrote:
           | Question from the peanut gallery: if you were to flip a
           | single bit in this junk DNA, are the outcomes only slightly
           | different or could they be wildly variable depending on which
           | bit was flipped?
        
         | hetspookjee wrote:
         | Ha, they might as well call it high yield DNA as in other
         | nomenclature.
        
         | ncmncm wrote:
         | The usual expression nowadays is "non-coding DNA".
         | 
         | Undoubtedly much of it could be pruned out with no undesirable
         | result, but there does not seem to be any ongoing process to do
         | that, so stuff piles up. As it will.
        
           | [deleted]
        
           | sockpuppet_12 wrote:
           | >Undoubtedly much of it could be pruned out with no
           | undesirable result
           | 
           | Such hubris as this is what led us to:
           | 
           | - define DNA we didn't and still don't understand as useless
           | "junk"
           | 
           | - call the appendix a useless vestigial organ
           | 
           | - declare "silenced" B cells useless
           | 
           | The list goes on and on and on... When will somebody compile
           | a list of how often science is wrong just to slap the
           | arrogance out of people before they cost more time and lives
           | with such reckless and impatient reasoning?
        
             | ncmncm wrote:
             | There is a very large difference between "much of" and
             | "all". And we have at this time no way to distinguish which
             | bits are in the "much of" and which the rest.
             | 
             | There is so very much of it that even were the actually-
             | junk just 5% of that, it would still qualify as "much of".
        
             | epgui wrote:
             | This perspective is true to some extent, but it's counter-
             | productive to think of science as being wrong.
             | 
             | You should think of science as the "least wrong" set of
             | beliefs we have at any point in time. It will never be
             | perfectly right, and every day it's less and less wrong.
             | The reason it's so reliable is because it embraces (and
             | doesn't dismiss) this uncertainty.
        
               | post_below wrote:
               | I don't think it's counter productive at all. There was a
               | period of time, when the appendix was (absurdly)
               | considered vestigial, that surgeons would remove the
               | appendix as a side quest if they happened to have the
               | area opened for some other purpose.
               | 
               | That was a terrible idea, but one that was supported by
               | science at the time. There are practical reasons to be
               | skeptical about scientific assumptions.
               | 
               | Science becomes less wrong faster if we allow history to
               | remind us that a lot of what we believe will likely turn
               | out to be wrong.
        
               | ncmncm wrote:
               | A better example would be irradiating thymus glands.
               | 
               | Appendices are still removed, to this day, and people
               | lacking them make do without. A thymus gland is harder to
               | dispense with.
        
         | aaaronic wrote:
         | Junk DNA is, AFAIK, not actively expressed (used to create
         | proteins). It's important, though, in the sense that spacing
         | between gene expression sites is a control on which genes get
         | expressed under which conditions (so the junk adds necessary
         | spacing between important genes).
         | 
         | I did _some_ research on epigenetics during my MS degree.
         | Spacing between sites was an important factor in our modeling
         | of gene expression.
        
           | chaxor wrote:
           | What research have you seen on modeling gene expression? I'm
           | genuinely curious, as I haven't really seen many convincing
           | _ab initio_ studies towards this. I could see finding certain
           | features like this spacing as predictive of perhaps some
           | other feature, but I haven't seen any research that really
           | tackles generation of gene expression data from first
           | principles and input of DNA sequence. It's my understanding
           | that modeling the kinetics is difficult, as we really haven't
           | tried making the full network of differential equations. Does
           | anyone have a project that points to the 'final solution' to
           | this? I know recently there was a paper in Cell that modeled
           | the cell with the smallest viable genome to predict cell
           | division, but that's a bit further away from complete
           | modeling of our 30k genes' (much less isoforms) dynamics.
        
             | jashephe wrote:
             | Global models of gene expression for an entire cell are
             | fairly distant at this point, but there is quite a bit of
             | work into modeling transcriptional activity from sequence.
             | If you're interested in reading more, a relevant technology
             | to search for would be the "Massively Parallel Reporter
             | Assay", or MPRA, which couples pools of 10^4-10^5+ synthetic
             | DNA sequences with RNA sequencing to measure
             | transcriptional output. Data from MPRA experiments is being
             | used to train models, although these models are not
             | anywhere near a point where you could model the gene
             | expression of all regulatory elements in a cell; they are
             | usually focused on a specific factor or regulatory
             | sequence.
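
As a rough illustration of the MPRA readout described above, here is a minimal sketch that scores each synthetic construct by the log2 ratio of its RNA read count to its DNA read count. The function name `mpra_activity`, the construct names, and the counts are all hypothetical, and the pseudocount is an assumed convenience to avoid dividing by zero; real analyses also normalize for sequencing depth and combine replicates, which is omitted here.

```python
import math


def mpra_activity(rna_counts, dna_counts, pseudocount=1.0):
    """Return a per-construct log2(RNA / DNA) score.

    rna_counts and dna_counts map a construct name to its read count.
    A pseudocount is added to both counts so that constructs with
    zero reads do not cause a division by zero or log of zero.
    """
    activity = {}
    for name, dna in dna_counts.items():
        rna = rna_counts.get(name, 0)
        activity[name] = math.log2((rna + pseudocount) / (dna + pseudocount))
    return activity


if __name__ == "__main__":
    # Made-up toy counts, for illustration only.
    dna = {"enhancer_candidate": 99, "scrambled_control": 99}
    rna = {"enhancer_candidate": 399, "scrambled_control": 99}
    for name, score in mpra_activity(rna, dna).items():
        print(name, score)
```

A score near zero means the construct produced about as much RNA as expected from its abundance in the pool, while a positive score suggests the sequence drives extra transcription.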
        
           | kkylin wrote:
           | Not just spacing. The sequences also matter, as they serve as
           | binding sites for enzymes that can promote or repress the
           | expression of downstream genes. As just one (relatively simple)
           | example of how complicated genetic circuitry can be, I really
           | enjoyed & recommend Mark Ptashne's _A Genetic Switch: Phage
           | Lambda_ for anyone who doesn't mind doing some slightly
           | technical reading.
           | 
           | Disclaimer: not a biologist, and would be interested in
           | hearing from someone more knowledgeable than I, both about the
           | Ptashne book and about recommended reading.
        
             | toper-centage wrote:
             | So junk DNA is like code styling and comments in
             | programming.
        
               | meowkit wrote:
               | It's closer to a config file / internal functions that
               | modify the state variables of a system instead of
               | generating objects. The junk DNA doesn't explicitly get
               | read, but it interacts in nonlinear ways with the
               | executable "text" portion of the DNA.
               | 
               | Also disclaimer: My only knowledge of this is from Nessa
               | Carey's The Epigenetics Revolution and some additional
               | online reading.
        
           | ShroudedNight wrote:
           | > spacing between gene expression sites is a control on which
           | genes get expressed under which conditions
           | 
           | This makes it sound like it represents control flow rather
           | than data. If its presence does / can make a material
           | difference on the output encoding, it strikes my non-expert
           | ears as actively perilous to label such DNA 'junk'
        
             | amne wrote:
             | so DNA is written in Python then. it's settled.
        
               | sterlind wrote:
               | and pythons are written in DNA. a fully bootstrapped
               | system!
        
               | tazjin wrote:
               | * self-hosted system
               | 
               | We still don't know how hard bootstrapping it would be :)
        
           | grishka wrote:
           | I remember reading how some of the "junk" DNA turned out to
           | be important because while it doesn't make proteins, the
           | "non-coding" RNA it gets transcribed into regulates
           | something.
        
         | dudeinjapan wrote:
         | Didn't you watch the movie Twins? Arnold got all the good DNA
         | and Danny DeVito got the junk DNA.
        
           | Lammy wrote:
           | Or play Metal Gear Solid:
           | https://metalgearsaladblog.wordpress.com/2016/10/28/liquid-s...
        
         | tehchromic wrote:
         | I think it's more that the geneticists have the sense of humor.
        
         | swayvil wrote:
         | It's a common trope. Anything that cannot be intellectually
         | digested is labeled "junk", ignored and, eventually, becomes
         | invisible.
         | 
         | You would be astonished at how much of reality falls into that
         | category.
        
       | gerdusvz wrote:
       | now at long last we can be... better
        
       | [deleted]
        
       | lordnacho wrote:
       | Maybe someone can explain what exactly it means. Are all the
       | variants of every allele now mapped? Of course everyone might
       | have a slightly different variant, so what does it mean?
       | 
       | What does complete mean?
        
         | dekhn wrote:
         | No, not all variants of every allele are now mapped. You would
         | have to sequence a significant fraction of the human
         | population, and imho the very idea of mapping all the variants
         | of alleles doesn't quite square with what would be the most
         | useful way to understand human genotype to phenotype variation.
        
         | 323 wrote:
         | Imagine a sequence in the DNA like this:
         | TAAAAAAAAAAAACAAAAAAAAAAG. The way sequencing worked, the DNA
         | was split into small parts that were then aligned back
         | together. But if we split the sequence above we might get these
         | pieces: AAAAAA AAAAAA AAACAA AAAAAG TAAAAA. You know the first
         | and last letters of each piece overlap, but due to the high
         | repetition count of A there is no way to figure out what the
         | proper order is.
         | 
         | With new techniques you generate much longer pieces, so there
         | is much less confusion.
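
The ambiguity described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: reads are idealized as every error-free substring of a fixed length, and `genome_a` and `genome_b` are two hypothetical sequences (not taken from the thread) that differ only in where the single C sits between two runs of A.

```python
from collections import Counter


def read_spectrum(sequence, read_length):
    """Return the multiset of all substrings of the given length.

    This stands in for an idealized, error-free set of reads that
    covers every position of the sequence exactly once per offset.
    """
    last_start = len(sequence) - read_length
    return Counter(sequence[i:i + read_length] for i in range(last_start + 1))


# Two hypothetical genomes: same letters, but the C is in a different place.
genome_a = "T" + "A" * 12 + "C" + "A" * 10 + "G"
genome_b = "T" + "A" * 10 + "C" + "A" * 12 + "G"


def distinguishable(read_length):
    """True if reads of this length could tell the two genomes apart."""
    return read_spectrum(genome_a, read_length) != read_spectrum(genome_b, read_length)


if __name__ == "__main__":
    for length in (6, 11, 12):
        print(length, distinguishable(length))
```

With reads of length 6 or 11, both genomes yield exactly the same collection of reads, so no assembler could choose between them. Once a read is long enough to span the longest run of A's together with the letters on both sides (length 12 here), the two collections differ.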
        
         | awenger wrote:
         | Complete here means the full end-to-end sequence of all
         | chromosomes in a single human cell line named CHM13. The
         | typical human cell has 46 chromosomes, in 23 pairs (one from
         | our mother, one from our father) named chromosome 1, chromosome
         | 2, and so on. This CHM13 cell line is special in that each of
         | its pairs is (nearly) identical. Each chromosome is a long
         | string of A,C,G,T nucleotides. So, this complete genome is a
         | full set of 23 sequences without any "not sure" positions or
         | "gaps" in the sequence.
         | 
         | One common analogy is to consider the genome sequence (a.k.a.
         | assembly) as a map. Since the initial publication of the human
         | genome in the early 2000s, most regions of human DNA have been
         | known in full resolution. Other portions, most prominently the
         | repetitive centromeres that lie at the middle of chromosomes,
         | have remained unmapped. It was known that they exist,
         | approximately how big they were, and which types of sequences
         | lay inside, but the full order of the sequence had never been
         | determined for any human genome until this work.
         | 
         | You could consider the genome like the earth and the
         | centromeres like a dense rainforest. Previously we had detailed
         | maps of most of the earth, and we had mapped the boundaries of
         | the rainforest and had satellite-level images (i.e. we knew
         | they were full of plants). Now we have on-the-ground pictures
         | with full detail.
         | 
         | Having a map of these sequences makes them accessible to study.
         | One of the most valuable uses of the human genome is as a
         | shared coordinate system used by scientists to compare
         | different individuals and identify and name genetic variants
         | that explain human traits. We lacked that coordinate system for
         | a big chunk of the genome until now.
         | 
         | As you say, this paper reports the sequence of a single human
         | cell line named CHM13. Each of us has a slightly different
         | genome sequence (really two of them, one from each parent). Now
         | when scientists sequence the genomes of more individuals, they
         | can look at these regions that were previously ignored.
         | Certainly understanding those regions will improve our
         | understanding of human biology. Exactly how much will remain to
         | be seen.
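         | 
         | To make the "gaps" idea concrete, here is a minimal sketch in
         | Python. It assumes the common convention that unknown bases
         | in an assembly are written as N. The function names and the
         | toy sequences are made up for illustration and are not taken
         | from the T2T project or its tooling.

```python
import re


def find_gaps(sequence: str) -> list[tuple[int, int]]:
    """Return 0-based, half-open (start, end) intervals for each run of N."""
    return [(m.start(), m.end()) for m in re.finditer(r"[Nn]+", sequence)]


def is_gapless(assembly: dict[str, str]) -> bool:
    """True if no sequence in the assembly contains an unknown (N) base."""
    return all(not find_gaps(seq) for seq in assembly.values())


# Toy data only; these are not real genome sequences.
older_assembly = {"chrToy": "ACGTNNNNNNACGTTGCA"}
complete_assembly = {"chrToy": "ACGTAATGGAACGTTGCA"}

for name, assembly in [("older", older_assembly), ("complete", complete_assembly)]:
    for chrom, seq in assembly.items():
        print(name, chrom, find_gaps(seq), is_gapless(assembly))
```

         | The intervals double as coordinates: a variant can only be
         | named by position in regions where the reference has real
         | sequence rather than a run of N.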
        
           | shpx wrote:
           | > The typical human cell has 46 chromosomes, in 23 pairs
           | 
           | Mitochondria have their own DNA, which is also sequenced.
        
           | Eduard wrote:
           | What's a cell line, and do we know anything about who CHM13
           | is?
        
             | tonto wrote:
             | chm13 is from a "complete hydatidiform mole"
             | https://en.wikipedia.org/wiki/Molar_pregnancy and the paper
             | says "Local ancestry analysis shows that most of the CHM13
             | genome is of European origin, including regions of
             | Neanderthal introgression, with some predicted admixture"
             | and fig 1 shows a cool breakdown of the regions of the
             | genome with different ancestries
        
             | sapsan wrote:
             | Seems to be an immortalized (telomerase*-transformed) cell
             | line from a female fetus with near-complete homozygosity
             | (https://sites.google.com/ucsc.edu/t2tworkinggroup/chm13-cell...).
             | 
             | * Telomerase is a reverse transcriptase that allows cells to
             | achieve replicative immortality
             | (https://academic.oup.com/hmg/article/9/3/403/715108).
        
         | busyant wrote:
         | From the Science article:
         | 
         |  _" However, limitations of BAC cloning led to an
         | underrepresentation of repetitive sequences, and the
         | opportunistic assembly of BACs derived from multiple
         | individuals resulted in a mosaic of haplotypes. As a result,
         | several GRC assembly gaps are unsolvable because of
         | incompatible structural polymorphisms on their flanks, and many
         | other repetitive and polymorphic regions were left unfinished
         | or incorrectly assembled (5)."_
         | 
         | Looks like there were "gaps" in the sequence due to technical
         | limitations associated with the original sequencing methods and
         | the authors have filled in those gaps. I haven't read the full
         | paper, though.
        
       | vaylian wrote:
       | > CHM13 lacks a Y chromosome, and homozygous Y-bearing CHMs are
       | nonviable, so a different sample type will be required to
       | complete this last remaining chromosome.
       | 
       | (from the paper itself)
       | 
       | It is a respectable achievement. But the Y chromosome is too
       | important to be left out in order to call this the complete human
       | genome.
        
         | tonto wrote:
         | y-chromosome was added since the preprint was made
         | https://twitter.com/aphillippy/status/1509594880623796226 and
         | was made from the hg002 cell sample (which is heavily analyzed
         | by the genome in a bottle project
         | https://www.nist.gov/programs-projects/genome-bottle)
        
         | [deleted]
        
       | jghn wrote:
       | Now on to the more important challenge. Making our understanding
       | of the human genome more diverse and less specific to certain
       | geographic areas. This is already having an impact in studies,
       | drug development, etc based on genomics.
       | 
       | Investing in organizations such as H3Africa will be important.
        
         | jahewson wrote:
         | I don't think it's "more important", without a reference genome
         | it's impossible to take the next step and the Human Genome
         | Project successfully took us from 0 to 1. Going from 1 to n is
         | much easier. The Human Pangenome Project is working on this and
         | should have 350 diverse genomes sequenced within the next
         | couple of years.
         | 
         | Note that this has nothing to do with collecting variations in
         | individual genes - that's easy and widely available. It's about
         | collecting variations in the actual content and structure of
         | the genome. e.g. Some populations have a bunch of extra DNA
         | that most other humans lack, amazing.
        
       | yoyopa wrote:
       | when can i grow a second set of arms?
        
         | ineedasername wrote:
         | If you're in the US you have the Constitutional right to bear's
         | arms. I'd choose grizzly bear, or maybe panda.
        
         | akira2501 wrote:
         | That's just the Homeobox[1] genes. They're actually incredibly
         | simple given their complex function.
         | 
         | [1]: https://en.wikipedia.org/wiki/Homeobox
        
       | tyjen wrote:
       | Eh, it's referring to base pair variations. The title is on the
       | sensationalistic side when you consider how most lay people will
       | interpret it.
       | 
       | The cool stuff people imagine in response to the title
       | won't happen until researchers finish figuring out regulatory
       | regions in the DNA, and how DNA interacts with itself and its
       | environment, both spatially and temporally. Regulatory regions
       | are promoters, enhancers, silencers, and insulators, and impact
       | gene expression and regulation.
        
         | ak217 wrote:
         | > it's referring to base pair variations
         | 
         | No.
         | 
         | > The title is on the sensationalistic side when you consider
         | how most lay people will interpret it.
         | 
         | The title refers to a large scientific collaboration that has
         | succeeded in utilizing single-molecule sequencing technology
         | that only matured in the last 3-5 years to sequence regions of
         | the human genome that were previously unmapped, bringing the
         | completeness of the mapping to 100%. That doesn't seem
         | sensationalistic.
        
       | xbar wrote:
       | Is it free?
        
         | smoldesu wrote:
         | I think most people ship with a copy from birth.
        
           | bqmjjx0kac wrote:
           | Can't wait for v2 to ship. Maybe it will have drivers for the
           | latent psychic hardware.
        
             | smegger001 wrote:
             | I would settle for a plugin api and a decent man file
        
             | mmmrtl wrote:
             | https://github.com/marbl/CHM13/commit/85644b74e188aa2124943b...
             | 
             | v2, now with a Y chromosome
        
           | paskozdilar wrote:
           | Yeah, but it's in compiled form. Average person does not
           | possess tools or skills to read the code and see what it does
           | or modify its behavior.
           | 
           | We need to stop using proprietary genomes. Free genomes, free
           | society.
        
             | brimble wrote:
             | I'm pretty sure there are a bunch of very popular, high-traffic
             | sites with tons of content that demonstrates the build
             | process.
        
               | dotancohen wrote:
               | This guy's interested in reading the make file. Those
               | sites just show people running make.
        
               | brimble wrote:
               | There are sites to cater to those who prefer reading
               | about the build process, too. Uh, so I hear.
        
         | wonderwonder wrote:
         | The 2013 Supreme Court case Molecular Pathology v. Myriad Genetics,
         | Inc. says yes. Although I would guess the right to read the
         | results of the study is not necessarily free.
        
         | ece wrote:
         | https://www.ncbi.nlm.nih.gov/assembly/GCA_009914755.4#/def
        
         | awenger wrote:
         | The data from the project is released to the public domain
         | (CC0). The research article is also free to access.
         | 
         | See https://github.com/marbl/CHM13 and
         | https://www.science.org/doi/10.1126/science.abj6987.
        
       ___________________________________________________________________
       (page generated 2022-04-02 23:00 UTC)