[HN Gopher] A DNA 'parasite' may have fragmented our genes
       ___________________________________________________________________
        
       A DNA 'parasite' may have fragmented our genes
        
       Author : theafh
       Score  : 99 points
       Date   : 2023-03-30 15:32 UTC (7 hours ago)
        
 (HTM) web link (www.quantamagazine.org)
 (TXT) w3m dump (www.quantamagazine.org)
        
       | dboreham wrote:
       | Sometimes a little frustrating that people writing articles in this
       | field seem to have no exposure to computer science.
       | 
       | This is the gene equivalent of a filesystem: the DNA fragments
       | are like disk blocks. The interposing sections are file metadata.
       | The shearing mechanism is the filesystem reconstructing a stream
       | from the lower layer blocks. There's probably some redundancy and
       | error correction in there too.
       | 
       | It needs a filesystem for the same reason they were invented for
       | computers: to provide an impedance match between the upper layer
       | semantics (a stream of pairs describing a protein) and the lower
       | layer storage (blocks). Using block structured storage is more
       | flexible in terms of being able to insert in the middle of a
       | file, etc etc.
        
         | sorokod wrote:
         | _It needs a filesystem for the same reason they were invented
         | for computers_
         | 
         | Why would that be true? It is a cool analogy but implying that
         | it is the reason requires much more.
        
           | renewiltord wrote:
           | Without taking a stance on whether it's right, I believe that
           | is explained right after your quote cut-off.
        
             | sorokod wrote:
             | That the biology of processing DNA is subject to similar
             | constraints and solutions computers have when dealing with disk
             | storage is remarkable.
             | 
             | Justification is needed.
        
               | bronson wrote:
               | At a coarse level, both are affected by information
               | theory, so some parts may look vaguely similar. Sure,
               | it's plausible you'd find related solutions, especially
               | if you really squint.
               | 
               | But it's like trying to explain how atoms work using a
               | solar system analogy. It might help with the really easy
               | stuff, maybe? (orbitals) But sticking with it makes going
               | any deeper pretty confusing.
        
         | [deleted]
        
         | inciampati wrote:
         | I don't like appeals to authority, but as a computational
         | biologist, I am both a computer scientist and a biological
         | scientist. So understand that I'm responding to you as exactly
         | the kind of person you think should be drawing the kinds of
         | links that you're suggesting.
         | 
         | I'm sorry, but this is just not a reasonable analogy. DNA
         | sequences are not like computer files. The reason that there are
         | distinct modules in these sequences is due to the need for
         | evolution to be feasible. And also for the basic reason that
         | the sequences are linear and so modules tend to appear in these
         | linear sequences. But there are also tremendous nonlinearities as
         | well. Things at very long distances can be importantly related
         | to each other.
         | 
         | The introns are not metadata. There are regions that can be
         | removed selectively and in combinations to cause diversity in
         | the produced proteins. That diversity is advantageous because
         | it allows a single DNA sequence to present many different
         | proteins that are typically related, but can be very different
         | structurally. This splicing capability has evolved apparently
         | from entities that can be seen as endogenous viruses or DNA
         | parasites that have the ability to insert and splice themselves
         | out of DNA and RNA sequences. In many confusing words, that's
         | what the article is pointing at.
         | 
         | The introns do provide a kind of redundancy, but only in the
         | sense that there are areas that can be modified with minimal
         | effect on cellular function, at least relative to modification
         | of the exons, which directly correspond in a one to one way
         | with proteins.
         | 
         | There is error correction. It's called homologous chromosomes.
         | Everyone talks about there being one genome, but in most
         | complex forms of life, you have more than one copy per cell,
         | usually two, and often more. These multiple copies, in addition
         | to allowing for recombination and sexual reproduction, provide
         | templates on which errors which arise during life can be
         | corrected. However, there are no error correcting codes.
         | 
         | If you'd like to learn more about the actual details of these
         | systems, I strongly suggest an undergraduate molecular biology
         | textbook. The best one in existence is called the Molecular
         | Biology of the Cell.
         | 
         | There are indeed many similarities between computing systems
         | and biological systems, but the analogies you are making don't
         | appear to be clear. Read a book like this, deeply and slowly,
         | and it might change your life. At the very least, it'll mean that
         | the world you live in is much less mysterious and much more
         | exciting.
        
           | notfed wrote:
           | "Molecular Biology of the Cell"
           | 
           | Shout out to this book. Not only does it have the most amazing
           | imagery, but you can learn a lot just by skimming the book,
           | because of how well organized it is.
        
           | pazimzadeh wrote:
           | Aren't introns like pieces of code that have been commented
           | out?
           | 
           | But they also change the 3D conformation of the DNA itself,
           | which changes access by transcription factors, etc.
        
             | gus_massa wrote:
             | It's more like someone during the night cut your magnetic
             | tape in the middle of a txt file and glued a picture
             | of a cat between the two parts. The picture of the cat has
             | some special code at both ends, so it automatically
             | disappears when you open the txt file.
             | 
             | There are some weird cases, where the same txt file has two
             | cat pictures, and sometimes instead of removing the two
             | cats, the system removes also the texts between the cats.
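
The cat-picture analogy above can be sketched in a few lines of Python. This is only a toy model, not how a spliceosome works: the segment names (E1, I1, ...), the text contents, and the `splice` helper are all made up for illustration.

```python
# Toy model of splicing: exons are kept, introns are cut out.
# Segment names and contents are hypothetical, chosen to match the analogy.

def splice(segments, skip_exons=()):
    """Join exon segments in order, dropping introns and any skipped exons."""
    return "".join(
        text
        for kind, name, text in segments
        if kind == "exon" and name not in skip_exons
    )

pre_mrna = [
    ("exon", "E1", "THE QUICK "),
    ("intron", "I1", "<cat picture 1>"),
    ("exon", "E2", "BROWN "),
    ("intron", "I2", "<cat picture 2>"),
    ("exon", "E3", "FOX"),
]

normal = splice(pre_mrna)                      # both cat pictures removed
skipped = splice(pre_mrna, skip_exons={"E2"})  # the text between the cats goes too

print(normal)
print(skipped)
```

The second call is the "weird case": the same input yields a different but still readable output, which is roughly what alternative splicing means for a protein.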
        
         | gus_massa wrote:
         | You look interested in the subject, but I recommend reading a
         | few biology books about it. There are many weird low level
         | features of DNA that are not covered much in popular discussions
         | [1] [2]. But I don't remember any that is similar to a
         | filesystem as you propose. Take a look, you will be pleasantly
         | surprised.
         | 
         | [1] One of my favorites is that the bases of DNA are translated
         | in groups of 3 to amino acids, so the code reads like
         | AAABBBCCCDDDEEEFFF
         | 
         | It's very unusual, but there are some viruses that read the same
         | part in two ways, with a different offset, so the same part is
         | interpreted as
         | -JJJKKKLLLMMMNNN--
         | 
         | I don't remember if they use the other offset too
         | --PPPQQQRRRSSSTTT-
         | 
         | [2] Another, not so interesting but relevant. Eukaryotes have
         | linear DNA, so they have some special repetition at the
         | ends. The idea is that the ends are difficult to copy
         | by the usual enzyme that copies the main part that has assorted
         | code. But the ends have a special easy pattern, so the cell
         | can use some specialized enzyme to make them longer.
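
The overlapping reading frames from footnote [1] can be illustrated with a short Python sketch. The sequence below is arbitrary and the `codons` helper is a hypothetical name; the only point is that shifting the starting offset by one or two bases produces a completely different list of triplets from the same letters.

```python
def codons(seq, offset):
    """Split seq into complete triplets, starting at the given offset (0, 1 or 2)."""
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]

# Arbitrary made-up 18-base sequence, not taken from any real genome.
seq = "ATGGCCATTGTAATGGGC"

# One list of triplets per reading frame on this strand.
frames = {offset: codons(seq, offset) for offset in range(3)}

for offset, triplets in frames.items():
    print(offset, " ".join(triplets))
```

Frame 0 uses all 18 bases, while frames 1 and 2 each leave a few unread bases at the edges, which corresponds to the dashes in the diagrams above.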
        
         | AllegedAlec wrote:
         | https://xkcd.com/793/
        
         | bashinator wrote:
         | Nope, at best the filesystem is an analogy. Just like the
         | stretchy rubber mat isn't a perfect description of spacetime.
        
         | rco8786 wrote:
         | That's a weird thing to get frustrated about...that someone in
         | a completely unrelated field didn't have the experience or
         | courtesy to explain something using analogous terms to the
         | thing you happen to be an expert on.
        
           | vkou wrote:
           | Doubly so, when the two domains don't actually map cleanly to
           | each other. DNA is not a computer program, or a file, or
           | storage. There's no real distinction between data, metadata,
           | and 'code' in it, either in structure, or in practice.
        
           | aeonik wrote:
           | I don't think they are unrelated fields. Computer science is
           | the study of computation. DNA, to me, is clearly a quaternary
           | computation system.
           | 
           | I think there is a lot for both fields to learn by studying
           | knowledge from each. Bioinformatics seems to be on that
           | track.
        
             | rco8786 wrote:
             | Cool so what's the last thing you described to your team
             | using terms from DNA research?
        
               | spullara wrote:
               | genetic optimization algorithms?
        
             | anonymouskimmer wrote:
             | We've got DNA which is basically a storage system. RNA can
             | be catalytically active on its own. Typically RNA and
             | proteins, or complexes of such, act on DNA in various
             | manners.
             | 
             | Maybe you're right in some way, but also consider whether
             | using the nomenclature and ideas used to describe
             | processes of DNA repair, transcription, and translation in
             | biology to describe electronic computation works well. If
             | it does work well, then maybe the reverse would also work
             | well. If it leaves much to be desired, then consider the
             | possibility that computer science ideas may be too specific
             | to electronic or mechanical computation.
        
         | kleer001 wrote:
         | https://en.wikipedia.org/wiki/Curse_of_knowledge
        
         | otherme123 wrote:
         | There are organisms with almost no introns, no redundancy, no
         | CRC, no "metadata" and even overlapping genes to save space,
         | like the Giardia genome (
         | https://www.science.org/doi/10.1126/science.1143837 ). Lots of
         | viruses have all their genes encoded without introns, and almost
         | all the genome is encoding something.
         | 
         | I've never seen DNA as close to a filesystem, and our current
         | best bet on intron function is that they are used to create
         | alternative splicing products from the same DNA chunk. I cannot
         | identify this function in a filesystem, where you can obtain
         | two or three different _valid_ files from the same data just by
         | skipping some blocks.
        
           | pazimzadeh wrote:
           | What is CRC?
           | 
           | Metadata is everywhere... histone modifications,
           | glycosylation, etc..
        
             | anonymouskimmer wrote:
             | Cyclic redundancy check? And there is almost always basic
             | redundancy in non-viral organisms (and many viruses) in
             | that DNA is typically double-stranded, and most organisms
             | have repair machinery that can rewrite across single-strand
             | lesions using the opposite strand (this fails at double-
             | strand lesions).
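
As a rough sketch of the redundancy described above, the Python below fills in a damaged position on one strand from the paired base on the other. It is a deliberate simplification: the two strands are written position by position (ignoring that real strands run antiparallel), `?` is a made-up marker for an unreadable base, and `repair` is a hypothetical helper, not a model of any actual repair pathway.

```python
# Watson-Crick base pairing: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def repair(damaged, opposite):
    """Fill unreadable positions ('?') in one strand from the paired base on the other."""
    repaired = []
    for base, partner in zip(damaged, opposite):
        if base != "?":
            repaired.append(base)                 # base is intact, keep it
        elif partner != "?":
            repaired.append(COMPLEMENT[partner])  # rewrite from the opposite strand
        else:
            repaired.append("?")                  # both strands damaged: nothing to copy from
    return "".join(repaired)

single_strand_lesion = repair("ATG?CA", "TACGGT")
double_strand_lesion = repair("AT??CA", "TA?GGT")

print(single_strand_lesion)
print(double_strand_lesion)
```

The second call keeps one `?` in its output, matching the remark that this kind of repair fails where both strands are damaged at the same position.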
        
           | GauntletWizard wrote:
           | I can - It's called COW Snapshotting. Modern filesystems like
           | ZFS and BTRFS don't ever overwrite parts of the file that
           | change. They abstract it away by keeping an ordered list of
           | blocks. Snapshots are simply copies of the old list.
           | 
           | The analogy doesn't go very far, however.
        
             | bronson wrote:
             | COW just dedupes, it doesn't produce alternatives. Maybe a
             | filesystem that figures out how your spreadsheet can be
             | stored partway into an executable, with no loss to either?
             | Yeah, this analogy doesn't seem real helpful.
        
         | monocasa wrote:
         | I don't think that's a great comparison. Exons already have
         | sequences that mark 'block' boundaries; amino acids are encoded
         | in triplets of base pairs, but sort of like you see in 8b/10b
         | encoding, there are sequences that are valid but only used for
         | control purposes and don't correspond to amino acids.
        
         | afavour wrote:
         | I don't want to sound dense here but why is it frustrating that
         | people who write about genetics aren't familiar with computer
         | science?
        
           | agumonkey wrote:
           | I can have similar thoughts at times. People in one field
           | have their own lens to see the world and might miss some
           | structures / patterns that exist in other domains. I felt it
           | was a bit pompous to read a few medical books about the
           | cardiovascular system, a lot of ceremony to describe an
           | organic pump. You'd have to read mathematically inclined
           | papers to start reading about equations and principles rather
           | than latin nomenclature. Which I think is what the
           | grandparent was wishing for.
           | 
           | ps: I absolutely do not put computing above other fields
           | though. I just wish for some pragmatic polymathism sometimes.
        
         | anonymouskimmer wrote:
         | > It needs a filesystem for the same reason they were invented
         | for computers: to provide an impedance match between the upper
         | layer semantics (a stream of pairs describing a protein) and
         | the lower layer storage (blocks). Using block structured
         | storage is more flexible in terms of being able to insert in
         | the middle of a file, etc etc.
         | 
         | As a non-CS person I find this explanation opaque.
        
       | mmmrtl wrote:
       | Misleading title ("Their" Genes). I don't see what introners have
       | to do with the human genome?? They found evidence for introners
       | in 5% of species...
       | 
       | Original paper: https://www.pnas.org/doi/10.1073/pnas.2209766119
        
       | [deleted]
        
       | fnordpiglet wrote:
       | I for one welcome our new introner overlords.
        
       | masswerk wrote:
       | "spliceosomes" - I'm somewhat disappointed. (Not really the true
       | greco-roman spirit.)
        
       | neoyagami wrote:
       | oh. a "descolada"
        
       | koeng wrote:
       | My favorite quote I saw or heard somewhere on the rogue genetic
       | elements in us all:
       | 
       | "We are but a raft of genes in an ocean of retrotransposons"
       | 
       | A little hyperbolic, but dang there are a lot
        
       | MagicMoonlight wrote:
       | That's sneaky. Ironically more like a computer virus than the
       | classical viruses. Taking over the host and modifying its boot
       | partition so that it permanently gets produced by the system.
        
         | akavi wrote:
         | I'd say more like a classical virus with a loop earlier in the
         | central dogma.
         | 
         | Ie, in the DNA => RNA => Protein cycle, viruses are a loop from
         | Protein => DNA or Protein => RNA. Introners are a loop from RNA
         | => DNA.
        
           | gus_massa wrote:
           | Viruses are not Protein => DNA or Protein => RNA. Their
           | information is in DNA or RNA, so they are
           | https://en.wikipedia.org/wiki/Virus#Genome_replication
           | 
           | * DNA => RNA => Protein
           | 
           | * RNA => Protein
           | 
           | * RNA => DNA => RNA => Protein
           | 
           | As far as I know, there is no method to do Protein => DNA or
           | Protein => RNA at the cellular level.
           | 
           | It would be very surprising. RNA and DNA are quite similar
           | and have similar encoding, so RNA <==> DNA is a 1 to 1
           | translation.
           | 
           | The translation from RNA to proteins is not 1 to 1, and the
           | translation table is quite arbitrary, so untranslating at
           | the cellular level looks extremely difficult.
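
The asymmetry described above can be made concrete with a small Python sketch. The table is only an excerpt of the standard genetic code (enough codons for the example), the DNA string is treated as the coding strand so that transcription is just T to U, and the helper names are hypothetical.

```python
from math import prod

# Small excerpt of the standard genetic code (RNA codon -> amino acid).
CODON_TABLE = {
    "AUG": "Met",
    "UGG": "Trp",
    "UUU": "Phe", "UUC": "Phe",
    "AAA": "Lys", "AAG": "Lys",
    "UUA": "Leu", "UUG": "Leu", "CUU": "Leu",
    "CUC": "Leu", "CUA": "Leu", "CUG": "Leu",
}

def dna_to_rna(dna):
    """Coding-strand DNA to RNA: a letter-for-letter substitution."""
    return dna.replace("T", "U")

def rna_to_dna(rna):
    """The inverse substitution, so the round trip loses nothing."""
    return rna.replace("U", "T")

def translate(rna):
    """Read the RNA three bases at a time and look up each codon."""
    return [CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna) - 2, 3)]

def count_preimages(protein):
    """Number of distinct RNA sequences that translate to this protein."""
    return prod(
        sum(1 for aa in CODON_TABLE.values() if aa == residue)
        for residue in protein
    )

dna = "ATGCTGAAATTC"
rna = dna_to_rna(dna)
protein = translate(rna)
candidates = count_preimages(protein)

print(rna, protein, candidates)
```

Going from RNA back to DNA recovers the input exactly, while even this four-residue protein has many possible RNA sequences behind it, so there is no unique answer for a cell to "untranslate" to.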
        
         | sobkas wrote:
         | > That's sneaky. Ironically more like a computer virus than the
         | classical viruses. Taking over the host and modifying its boot
         | partition so that it permanently gets produced by the system.
         | 
         | More like infecting a compiler so every application built using
         | it will include the virus, including building compilers that will
         | add virus code to their output and propagate it.
        
         | anonymouskimmer wrote:
         | There's basically no such thing as a boot partition. The
         | machine of life has been turned on since the beginning. At this
         | point, with all of the changes since the beginning, it's not
         | obvious that there's a "boot partition" left that could
         | reactivate life should it shut off. All cellular progeny is
         | made with already functional and switched on proteins and RNA.
         | 
         | The closest you get to shutting off (without permanent cell death)
         | would be the computer equivalent of hibernation. All the
         | proteins and RNA are still there just waiting for the signal to
         | activate again.
        
           | jamiek88 wrote:
           | Crazy to think we are all here because of that first multi
           | cellular organism splitting over and over and over.
           | 
           | Life doesn't reboot as you say, it is split off from other
           | organisms whether seed, sperm, rhizome or any other method
           | it's just cells dividing and spitting off other cells.
           | 
           | Mind boggling to me.
        
             | anonymouskimmer wrote:
             | Yeah. I've known this for years but it really struck me
             | when I typed it out here.
        
             | __MatrixMan__ wrote:
             | The idea that we all branched from a single ancestral
             | organism has never sat well with me. Whatever started that
             | process, however improbable... Well the universe allowed it
             | to happen.
             | 
             | Why expect that the universe wouldn't subsequently continue
             | to let it happen, again and again?
        
               | anonymouskimmer wrote:
               | Sure, but it would be a completely different tree of
               | life, from a different origin.
               | 
               | It's possible our origin was from a community of
               | ancestral organisms, but at some point all terrestrial
               | life that we have discovered so far intermixed enough to
               | create an effective universal common ancestor that we all
               | appear to branch from.
        
               | Tagbert wrote:
               | Because, once it happens in an environment, there is no
               | more room for an alternate life form to arise. A new
               | instance of life would have to compete against the
               | established line and it is unlikely to survive that
               | process.
        
               | Izkata wrote:
               | My understanding is this is what happened during the
               | Cambrian explosion.
        
               | samus wrote:
               | Life arising many times in parallel ought to have given
               | rise to multiple trees of life with mutually incompatible
               | biochemistry. Yet, overall, life speaks the same genetic
               | language, and most things work very similarly to each
               | other. Life could still have indeed arisen multiple
               | times, however, it probably either merged or got
               | supplanted by its competitors. Life is simply too
               | pervasive to allow for anything else. It would also
               | immediately out-compete any newly arising life.
               | 
               | There is some evidence that things like the genetic code,
               | the choice of RNA/DNA nucleotides, and the set of the 20
               | amino acids aren't really random. That would not rule out
               | life arising multiple times, but the likelihood that it
               | merged with other lineages would be even higher.
               | 
               | Short summary:
               | https://www.science.org/content/blog-post/why-these-amino-ac...
               | A more in-depth paper:
               | https://www.sciencedirect.com/science/article/abs/pii/S03781...
        
               | akiselev wrote:
               | There are a lot of microbes that are unculturable to this
               | day and to my knowledge, no one has really done a proper
               | investigation to see if the universal metabolic molecules
               | like, for example, the hydrogen carriers
               | NAD+/NADP+/NADPH, are truly universal. If we're going to
               | see evidence of multiple trees of life, it'd be in those
               | little details because most of the food chain has to
               | interact with each other. Or fungi and other decomposers
               | can bridge the gap.
               | 
               | I think over the span of billions of years, evolution
               | tends to converge too much for the trees to remain very
               | distinct from each other.
        
               | anonymouskimmer wrote:
               | > There are a lot of microbes that are unculturable to
               | this day
               | 
               | Some of this has been solved by literally allowing the
               | microbes to sit in culture for a year or so in order to
               | either wake up from hibernation, or adapt to the culture
               | composition.
               | 
               | > see if the universal metabolic molecules like, for
               | example, the hydrogen carriers NAD+/NADP+/NADPH, are
               | truly universal.
               | 
               | For anything that's based on DNA or RNA we now do direct
               | sequencing of environmental samples. From this direct
               | sequencing we can pull out individual genes and pathways.
               | 
               | > I think over the span of billions of years, evolution
               | tends to converge too much for the trees to remain very
               | distinct from each other.
               | 
               | We've got over 20 recognized genetic codes already from
               | existing life. These are highly similar, but this
               | probably points to similar origins instead of
               | convergence.
        
           | nobody9999 wrote:
           | >The best you get to shutting off (without permanent cell
           | death) would be the computer equivalent of hibernation. All
           | the proteins and RNA are still there just waiting for the
           | signal to activate again.
           | 
           | Fungal spores[0][1] come to mind.
           | 
           | [0] https://en.wikipedia.org/wiki/Spore#Fungi
           | 
           | [1] https://space.stackexchange.com/questions/37268/can-mushroom...
        
         | marcosdumay wrote:
         | It was well known that a lot of our genome got inserted there
         | by viruses. I think the news this article is reporting is that
         | the defense mechanism is the explanation for that weird
         | behavior.
        
       | stuckinhell wrote:
       | Biology is truly fascinating, the ultimate hardware/software
       | combo of proteins/genes.
       | 
       | It's likely parasites can alter our genetic expression and
       | behavior today as well, like rabies and toxoplasmosis (cats often
       | have it). Rabies causing the fear of water is truly mind bending,
       | how does it do that?!
       | 
       | Toxoplasma infection is classically associated with the frequency
       | of schizophrenia, suicide attempts or "road rage".
       | https://pubmed.ncbi.nlm.nih.gov/31980266/#:~:text=Toxoplasma....
       | 
       | Rabies: As the disease progresses, the person may experience
       | delirium, abnormal behavior, hallucinations, hydrophobia (fear of
       | water), and insomnia.
       | https://www.cdc.gov/rabies/symptoms/index.html#:~:text=As%20....
        
         | whizzter wrote:
         | Even more interesting reverse of that, hairworms that infect
         | grasshoppers will once mature cause the hosts to jump into
         | water and drown where the worm then reproduces before starting
         | the cycle again.
        
           | hypertele-Xii wrote:
           | And cordyceps fungi compel ants to climb to a specific height
           | off ground, at millimeter and 95% accuracy, to a spot of
           | ideal location and humidity for the fungus to spore.
           | 
           | And the craziest thing is, the cordyceps fungus doesn't
           | actually infiltrate the ant's brain! Autopsies found the
           | fungus spreads all over the ant's body, but _not its brain!_
        
         | thaumasiotes wrote:
         | > Rabies causing the fear of water is truly mind bending, how
         | does it do that?!
         | 
         | It doesn't.
         | 
         | > Rabies: As the disease progresses, the person may experience
         | delirium, abnormal behavior, hallucinations, hydrophobia (fear
         | of water)
         | 
         | This is a weird mistake for the CDC to make. The etymological
         | meaning of "hydrophobia" is "fear of water". But the English
         | word is completely disconnected from that; it just means
         | "rabies". Because of this, the disambiguation page for
         | "Hydrophobia" on wikipedia links to rabies as well as to
         | "aquaphobia", an actual fear of water which had to be named
         | badly because the name "hydrophobia" was already taken.
         | 
         | Rabies was named "hydrophobia" because rabies patients will
         | generally refuse water when it's offered to them. They do that
         | because rabies makes it difficult to swallow, not because
         | they're afraid of the water.
        
         | livelielife wrote:
         | is dna hardware? software?
         | 
         | it's both! it's neither! oh, and it's also the runtime!
        
       | fjfaase wrote:
       | Bert Hubert has an interesting idea about the reasons for
       | introns. He explains this in his talk 'DNA: More Greatest Hits
       | (SHA2017)' The interesting bit, with some introduction, starts
       | at: https://youtu.be/rCdhsN--Mdo?t=1440
       | 
       | This is a follow-up talk to his talk: 'DNA: The Code of Life'
       | https://www.youtube.com/watch?v=EcGM_cNzQmE
        
       ___________________________________________________________________
       (page generated 2023-03-30 23:01 UTC)