[HN Gopher] A biological camera that captures and stores images ...
       ___________________________________________________________________
        
       A biological camera that captures and stores images directly into
       DNA
        
       Author : giuliomagnifico
       Score  : 171 points
       Date   : 2023-07-10 08:26 UTC (14 hours ago)
        
 (HTM) web link (www.nature.com)
 (TXT) w3m dump (www.nature.com)
        
       | ryanjamurphy wrote:
       | Related: The Verge's DNA time capsule [0].
       | 
       | [0]: https://www.theverge.com/c/22173998/dna-time-capsule
        
       | gsam wrote:
       | Anyone else think that there is already primitive image data
       | encoded in biological data? Essentially basic shapes and patterns
       | which are passed down semi-generationally.
        
         | dekhn wrote:
         | I do not know of any true "image" data. Most complex patterns
         | in nature are created by generative processes rather than
         | direct encoding.
        
         | stevezsa8 wrote:
         | I guess it's possible if this conferred some survival
         | advantage.
         | 
         | It can be useful to work from the evidence to a conclusion
         | instead of the other way round.
         | 
         | But wondering and philosophising can be fun :]
         | 
         | It would be cool if humans could pass knowledge via their
         | offspring. But I always get worried thinking if I'm the
         | asshole, I wouldn't want my kid to be one too.
        
           | anigbrowl wrote:
           | _I always get worried thinking if I'm the asshole, I
           | wouldn't want my kid to be one too._
           | 
           | If you were, you would.
        
           | yieldcrv wrote:
           | Just has to not be a disadvantage.
           | 
           | Plenty of mutations have no purpose whatsoever and were
           | unrelated to survival or manifest after reproduction so are
           | not selected for or against.
        
         | dghughes wrote:
         | Anton Petrov has a recent video on YouTube. I haven't watched
         | it yet, but its title is "Could Life Be Transmitted Via Radio
         | Waves? Information Panspermia". Just a bit of fun, I'm sure;
         | Anton isn't too wild. He puts out some interesting videos, but
         | not in a way that pushes quackery.
         | 
         | https://www.youtube.com/watch?v=K4Zghdqvxt4
        
           | nico wrote:
           | Fascinating, thank you
           | 
           | Recently here on HN someone posted a quote saying something
           | like "if you shine light at something for a long enough time,
           | don't be surprised if you end up getting a plant"
           | 
           | It was about how the environment seems to reorganize in
           | certain ways to use up energy (the latest Veritasium video
           | about entropy also talks about this)
        
         | f6v wrote:
         | I think it would have very high energy requirements. For this
         | trait to survive over generations there would need to be a
         | tremendous evolutionary benefit. What would that be for a
         | "primitive image data"?
        
           | yetihehe wrote:
           | Maybe things like "long green shape" (cats' fear of cucumbers
           | because they resemble snakes), or "a series of black and
           | yellow stripes", or even "a black blob with many appendages"
           | to watch out for spiders? Encoding some primitive image data
           | so that further generations know what to avoid or pursue
           | seems like a tremendous evolutionary benefit.
        
             | techdragon wrote:
             | Yeah, I expect this isn't going to be how that sort of
             | mechanism works, but it's always been an interesting
             | concept for me, that while "genetic memory" as presented in
             | much fiction is extremely unlikely just from the sheer
             | entropic hill such mechanisms would have to evolutionarily
             | climb to be able to pass on so much information (on top of
             | the baseline necessary information for reproduction, the
             | majority of memory won't on average confer a lot of
             | reproductive advantages, so it's statistically more likely
             | to get optimised out by the random mistakes of evolution,
             | hence entropically "uphill") ...
             | 
             | Yet while this fictional form is unlikely we have quite a
             | lot of good examples and evidence for "inherited
             | information". You have to be careful with it since it's too
             | easy to accidentally include side channels for organisms to
             | learn the information and thus break the test. Such as
             | insects being genetically driven towards food by smell at a
             | molecular chemical interaction level, and the smell
             | becoming associated with the information you wish to test.
             | A bee colony can't be reliably tested unless you raise it
             | from a new queen in an odourless environment if you wish to
             | see if bees genetically know that the shape of a flower is
             | associated with food. It's tough to subtract the potential
             | that a colony will have learned and "programmed" later
             | generations of bees with things like the classic waggle
             | dancing in order to more efficiently gather food.
             | 
             | We do have good ones though like cats and snake shaped
             | objects, it's surprisingly consistent, and pops up in some
             | other animal species. It's wired into our brains a bit to
             | watch out for such threats. There's a significant bias
             | towards pareidolia in human brains and it's telling how
             | deeply wired we have some of these things, but it is there
             | and study shows it seems to form well before our cognitive
             | abilities do... these all have some obvious reproductive
             | advantages however so it makes sense that the "instinct"
             | would be preserved over generations as it confers an
             | advantage. But it's still impressive that it can encode
             | moderately complex information like "looks like the face of
             | my species" or "cylindrical looking objects on the ground
             | might be dangerous"... even if it's encoded in a lossy
             | subconscious instinctual level.
        
               | TeMPOraL wrote:
               | > _But it's still impressive that it can encode
               | moderately complex information like "looks like the face
               | of my species" or "cylindrical looking objects on the
               | ground might be dangerous"... even if it's encoded in a
               | lossy subconscious instinctual level._
               | 
               | I think it helps that the encoding does not have to be
               | transferable in any way. This kind of "memory" has no
               | need for portability between individuals or species - it
               | doesn't even need to be factored out as a thing in any
               | meaningful sense. I.e. we may not be able to isolate
               | where exactly the "snake-shaped object" bit of instinct
               | is stored, and even if we could, copy-pasting it from a
               | cat to a dog wouldn't likely lead the (offspring of the)
               | latter to develop the same instinct. The instinct
               | encoding has to only ever be compatible with one's direct
               | offspring, which is a nearly-identical copy, and so the
               | encoding can be optimized down to some minimum tweaks -
               | instructions that wouldn't work in another species, or
               | even if copy-pasted a couple of generations down one's
               | line of offspring.
               | 
               | (In a way, it's similar to natural language, which
               | rapidly (but not instantly) loses meaning with distance,
               | both spatial/social and temporal.)
               | 
               | In discussing this topic, one has to also remember the
               | insight from "Reflections on Trusting Trust" - the
               | data/behavior you're looking for may not even be in the
               | source code. DNA, after all, isn't a _universal, abstract
               | descriptor of life_. It's code executed by a complex
               | machine that, as part of its function, copies itself
               | along with the code. There is lots of "hidden"
               | information capacity in organisms' reproduction
               | machinery, being silently passed on and subject to
               | evolutionary pressures as much as DNA itself is.
        
               | techdragon wrote:
               | Oh absolutely... and that's a great analogy for the more
               | computer oriented, "Reflections on Trusting Trust"
               | highlights how it can be the supporting infrastructure of
               | replication that passes on the relevant information... a
               | compiler attack like that is equivalent to things like
               | epigenetic information transfer... and for fun bonus
               | measure since it came to mind... the short story Coding
               | Machines goes well for really helping to never forget the
               | idea behind "Reflections on Trusting Trust"
               | https://www.teamten.com/lawrence/writings/coding-machines/
               | 
               | It definitely would be minimised data transfer, be it via
               | an epigenetic nudge that just happens to work by sheer
               | dumb luck because of some other existing mechanism or a
               | sophisticated DNA driven growth of some very specific
               | part of the mammalian connectome that we do not yet
               | understand because we've barely got the full connectome
               | maps of worms and insects, mammals are a mile away at the
               | moment... no matter the mechanism evolution will have
               | optimised it pretty heavily for simply information
               | robustness reasons, fragile genetic/reproductive
               | information transfer mistakes that work, break and get
               | optimised out in favour of the more robust ones that
               | don't break and more reliably pass on their advantage.
        
             | f6v wrote:
             | You need to compare that with an alternative solution where
             | this information is learned by each generation and then
             | assess the survival advantage of having it encoded in DNA.
             | This is outside my field and I don't have a strong opinion.
        
       | xk_id wrote:
       | Once upon a time, the Wikipedia article about Hacking provided
       | the following as sort of "canonical" example of hacking: using an
       | optical mouse as barcode scanner. In some ways, this incredible
       | paper feels like an iteration of that example.
        
       | throwawaymaths wrote:
       | Man what a confusing title. It's not a single strand of DNA that
       | gets the image information. You get a pool of DNA, which
       | collectively holds the image's information.
       | 
       | This is done in a pretty obvious way: each "pixel" is a well in,
       | e.g., a 96-well plate, and you expose the bacteria in these wells to
       | different light and then the DNA transformation is triggered by
       | the light, then you harvest the DNA from the bacteria and get
       | your image pool library.
        
         | PaulHoule wrote:
         | Would be neat if you could somehow splice them into a sequence
         | but I think you'd have some alternating sequences that
         | determine position and sequences that really code information.
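To make the "position sequences plus information sequences" idea concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the 2-bits-per-base mapping, the 4-base address tag, the 1-base payload, and the function names are hypothetical and are not taken from the paper. It only shows why an unordered pool can still be decoded when each strand carries its own address.

```python
import random

# Hypothetical mapping: two bits per base (an assumption, not the paper's scheme).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

ADDRESS_BASES = 4  # 4 bases give 256 addresses, enough for a 96-well plate


def to_dna(value, n_bases):
    """Encode a non-negative integer as a fixed-length base string."""
    bits = format(value, f"0{2 * n_bases}b")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def from_dna(seq):
    """Decode a base string back into an integer."""
    return int("".join(BASE_TO_BITS[base] for base in seq), 2)


def encode_pool(pixels):
    """One strand per well: address tag followed by a 1-base payload (0..3)."""
    pool = [to_dna(i, ADDRESS_BASES) + to_dna(p, 1) for i, p in enumerate(pixels)]
    random.shuffle(pool)  # a pooled sample has no inherent order
    return pool


def decode_pool(pool):
    """Rebuild the pixel list by reading each strand's address tag."""
    pixels = [None] * len(pool)
    for strand in pool:
        address = from_dna(strand[:ADDRESS_BASES])
        pixels[address] = from_dna(strand[ADDRESS_BASES:])
    return pixels
```

For example, `decode_pool(encode_pool([0, 3, 1, 2]))` gives back `[0, 3, 1, 2]` regardless of how the pool was shuffled.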
        
           | aclatuts wrote:
           | Assassins Creed is becoming reality soon.
        
           | throwawaymaths wrote:
           | yes, I was very impressed by the title, but once I dug into
           | this it was sort of like "well we could probably have
           | accomplished this ~10 years ago when optogenetics first came
           | out". Definitely a situation where branding something "so
           | silly that no one did it before" in the title got it
           | noticed.
        
       | JumpCrisscross wrote:
       | Something like this was posited in Banks' _Excession_ , and I
       | remember thinking how advanced passing messages via DNA embedded
       | in bacteria seemed.
        
         | stevenwoo wrote:
         | It's also a key plot point in Tchaikovsky's recent Children of
         | ... trilogy.
        
       | boffinAudio wrote:
       | [flagged]
        
         | muzani wrote:
         | [flagged]
        
       | hallihax wrote:
       | This is great and all, but the merge process is still messy as
       | hell
        
       | [deleted]
        
       | jvanderbot wrote:
       | Can't wait until they do a worldwide investigation to find
       | patient zero from the selfie encoded in the next pandemic.
        
         | codetrotter wrote:
         | > Can't wait until they do a worldwide investigation to find
         | patient zero from the selfie encoded in the next pandemic.
         | 
         | Then they realise the picture is this dude:
         | https://blogs.loc.gov/loc/2022/07/robert-cornelius-and-the-f...
        
       | asimpletune wrote:
       | Something that I always found fascinating is how DNA is a base 4
       | information format. There's this thing called radix economy,
       | which is basically an expression of how efficient a number system
       | is. Base e is the theoretical maximum, and so base 3 is the
       | closest integer.
       | 
       | Obviously if you have a special use case, then that may dominate
       | your radix economy (like hex, b64, etc...), but for general
       | information purposes, the order is base 3 first, then base 2
       | and base 4, which tie.
       | 
       | This presents a lot of interesting questions to me. Like, why
       | didn't DNA end up as base 3? (probably because 4 naturally lends
       | itself to pairs of 2).
       | 
       | Also, this idea of radix economy goes beyond just the encoding of
       | information and is represented in logical economy as well. So for
       | example, ternary logic is (much) more efficient than binary
       | logic. Having that 3rd state just makes problem solving much more
       | elegant.
       | 
       | To that end, I have always wondered how nature has exploited this
       | 4-state number system logically. Like, are there all sorts of
       | exotic logic gates that come from a 4 state system?
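The radix economy argument can be checked numerically. A minimal sketch, assuming the usual asymptotic cost measure b / ln(b) (digits needed times symbols per digit, lower is better); the function name `radix_cost` is made up for this example.

```python
import math


def radix_cost(base):
    """Asymptotic radix economy: base / ln(base). Lower means more economical."""
    return base / math.log(base)


# Compare the bases discussed above, plus e as the continuous optimum.
for b in (2, 3, 4, math.e, 10):
    print(f"base {b:.3g}: cost {radix_cost(b):.4f}")
```

Under this measure base 3 is the best integer base, and base 2 and base 4 cost exactly the same, since 4 / ln 4 = 4 / (2 ln 2) = 2 / ln 2. So by this metric alone a 4-letter alphabet is no worse than binary.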
        
         | dahfizz wrote:
         | Why do my eyelashes, meant to protect my eyes, fall into my
         | eyes? Why do my cheeks/tongue sometimes get in the way of my
         | teeth so that I bite them? And why do they then get inflamed so
         | that I continually bite them for the next few days?
         | 
         | We are all a bunch of biological goop resulting from random
         | processes. Don't expect optimal solutions from evolution. There
         | is no "why".
        
         | lurknot wrote:
         | https://en.wikipedia.org/wiki/Chargaff%27s_rules
        
         | dylan604 wrote:
         | >So for example, ternary logic is (much) more efficient than
         | binary logic. Having that 3rd state just makes problem solving
         | much more elegant.
         | 
         | Binary for electronics is obvious because there are 2 states in
         | electric components: on or off. There is no 3rd option.
        
           | nicoburns wrote:
           | I believe "on" and "off" in electronics typically correspond
           | to different voltage levels. So you absolutely could have a
           | third intermediate state if you wanted to. Flash memory does
           | this (and even sometimes has 4 states). I guess designing
           | switches (transistors) that could take advantage of and
           | propagate these extra states could be tricky though.
        
             | ohwellhere wrote:
             | https://en.wikipedia.org/wiki/Ternary_computer
        
         | guerrilla wrote:
         | > (probably because 4 naturally lends itself to pairs of 2).
         | 
         | Why would pairs of two be favorable?
         | 
         | > So for example, ternary logic is (much) more efficient than
         | binary logic. Having that 3rd state just makes problem solving
         | much more elegant.
         | 
         | What do you have in mind here?
         | 
         | > Like, are there all sorts of exotic logic gates that come
         | from a 4 state system?
         | 
         | I don't know but you may be interested in this [1].
         | 
         | 1. https://en.wikipedia.org/wiki/Catu%E1%B9%A3ko%E1%B9%ADi
        
         | PaulHoule wrote:
         | DNA has the same limitation that many serial protocols have: if
         | you repeat the same base pairs (e.g. "AAAAAAAAAAAAAAAAAAAAAA")
         | you will have trouble w/ the DNA not spiraling correctly. Some
         | sequences of 2-6 repeated base pairs seem to "deliberately"
         | cause variant behavior in DNA and RNA, see
         | 
         | https://en.wikipedia.org/wiki/Repeated_sequence_(DNA)
         | 
         | Many real wire protocols have mechanisms to prevent repeated
         | sequences entirely
         | 
         | https://en.wikipedia.org/wiki/8b/10b_encoding
         | 
         | DNA coding for real proteins is unlikely to be too terribly
         | repetitive, but I imagine a long alpha helix could have a repetitive
         | amino acid sequence. Many amino acids can be coded with variant
         | codons, I guess if repetition were a problem in a particular
         | gene natural selection could step in.
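The run-length point can be illustrated with a small Python sketch. It measures the longest single-base run in a sequence and shows how choosing a synonymous codon breaks up a repeat. The helper name `longest_run` is made up for this example; the only biological fact relied on is that AAA and AAG both code for lysine in the standard genetic code.

```python
from itertools import groupby


def longest_run(seq):
    """Return (base, length) of the longest single-base run in a non-empty seq."""
    runs = ((base, len(list(group))) for base, group in groupby(seq))
    return max(runs, key=lambda pair: pair[1])


# Four lysines written with the same codon give one long run of A...
poly_aaa = "AAA" * 4
# ...while the synonymous codon AAG encodes the same peptide with short runs.
poly_aag = "AAG" * 4

print(longest_run(poly_aaa))
print(longest_run(poly_aag))
```

This is loosely analogous to what line codes such as 8b/10b do on a wire: spend some redundancy to avoid long runs of the same symbol.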
        
           | [deleted]
        
         | jszymborski wrote:
         | There's a lot of interesting things to consider.
         | 
         | One, is that base 4 makes a lot of sense for the stability of
         | DNA structures. You have two purines, two pyrimidines.
         | 
         | Another is that partly because codons are degenerate, the
         | distribution is way off a uniform distribution. For chemistry
         | and mol bio reasons, the distribution of AGTC is very skewed.
         | 
         | When I fully wake up, this might be a fun blog post to draft.
        
         | Llamamoe wrote:
         | Radix is important for digit-efficiency, but in a biological
         | system that's not necessarily related to molecule size
         | efficiency.
        
           | pythonguython wrote:
           | I'm also failing to see how digit efficiency would be
           | important in DNA. In fact, it seems that a high base system
           | would be more efficient. If you had 80 nucleobases instead of
           | 4, each base pair would contain far more information
        
             | asimpletune wrote:
             | The efficiency comes from the ratio of the alphabet to the
             | number of character places needed to express them.
             | Otherwise why not base a million? Or a billion?
             | 
             | This ratio is what leads to base e being the theoretical
             | maximum.
        
             | JumpCrisscross wrote:
             | > _If you had 80 nucleobases instead of 4, each base pair
             | would contain far more information_
             | 
             | Which is a problem given DNA is a lossy format.
        
         | toufka wrote:
         | You (at least) have 3 systems that are optimized in concert in
         | a (our) DNA/Protein world.
         | 
         | DNA base set, Amino acid set, Translation layer between
         | DNA/Proteins.
         | 
         | Currently, we've got: 4 DNA bases, 3 bases/AA, 20 AAs; 4^3 =>
         | 20
         | 
         | If you change one of those numbers, you'll need to rejigger the
         | rest, and you'd need to reoptimize. And there are competing
         | goals which at least include:
         | - maximize access to biophysical/chemical diversity
         | - minimize energy expenditure to produce each component,
         |   chemically
         | - minimize energy expenditure to both copy instructions &
         |   produce products
         | - maximize information fidelity
         | - minimize or at least degrade gracefully in the context of
         |   errors
         | 
         | In the context of a 3-base system, you very well could throw
         | off those optimizations given the consequences for the other 2
         | parameters (#AA & nt/AA). 3^3 = 27, which is very close to the
         | maximum of 20 amino acids. Which means you'd probably need a
         | 4nt->AA translation layer to keep the same number of AAs, and
         | that alone would add 30% more energy expenditure. If you kept
         | the 3nt->AA system you'd BOTH need to reduce the number of
         | accessible amino acids AND you'd lose some of the error
         | correction mechanisms of having degenerate codons code for the
         | same amino acid.
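The counting behind this trade-off is easy to sketch. The snippet below assumes 21 symbols must be encoded (20 amino acids plus one stop signal, which is an assumption of this example) and asks, for a given alphabet size, what the shortest codon is and how many spare codons remain for redundancy. The function names are hypothetical.

```python
def min_codon_length(n_bases, n_symbols=21):
    """Smallest codon length L such that n_bases**L >= n_symbols."""
    if n_bases < 2:
        raise ValueError("need at least two distinct bases")
    length = 1
    while n_bases ** length < n_symbols:
        length += 1
    return length


def spare_codons(n_bases, codon_length, n_symbols=21):
    """Codons left over for redundancy once every symbol has one codon."""
    return n_bases ** codon_length - n_symbols


for n in (2, 3, 4, 6):
    length = min_codon_length(n)
    print(n, length, n ** length, spare_codons(n, length))
```

With 4 bases and 3-base codons there are 64 codons and plenty of slack; with 3 bases, 3-base codons give only 27, which barely covers the symbols. Going from 3 to 4 nucleotides per amino acid to regain that slack is a one-third increase in nucleotides, in the same ballpark as the roughly 30% quoted above.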
        
         | icoder wrote:
         | DNA is not really processed like that, afaik. Mostly, each 3
         | bases code for an amino acid, which are glued together to a
         | string (protein), which folds in a 3D structure based on the
         | characteristics of all amino acids.
         | 
         | Some DNA is used to attract other proteins, or even interact
         | with DNA elsewhere on the strand, or is translated to RNA (one-
         | on-one) which can then have a function based on its sequence or
         | the structure it folds into.
         | 
         | Any 'logic' there is, is built _on top_ of this.
        
           | jacquesm wrote:
           | Any logic that we are currently aware of. DNA contains many
           | unsolved mysteries and I expect that gift to keep on giving
           | for a long time to come.
        
         | go_elmo wrote:
         | Might error-correction play a role? Having a lightly
         | inefficient base 4 system might provide capacity for the
         | surplus error correcting code information capacity?
        
           | Faaak wrote:
           | One mutation in a base pair can lead to a totally different
           | amino acid (cf. the genetic code), so I doubt it?
        
             | toufka wrote:
             | BUT, if you look at the codon table, precisely because it's
             | base-4 and not base-3, many base flips are silent when
             | coded.
             | 
             | By using base-4, there's enough space to permit lossiness
             | of the coding itself - given the number of amino acids and
             | the 3-NT encoding.
             | 
             | So you really aren't optimizing JUST for nucleotide
             | encoding, but you're also optimizing in concert with
             | 3-nt/AA, and 20AA codes.
             | 
             | So if you have to optimize for information density and
             | fidelity, given X-nucleotides, Y nucleotides/AA, and Z AAs,
             | and sample as much chemical and physical diversity in those
             | AAs life has settled upon: X=4, Y=3, Z=20.
             | 
             | If we went with X=3, you might need Y=4 to get the same
             | kind of fidelity, but that cranks up your energy costs by
             | 30% (from 3 to 4 NT per AA).
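The claim that many base flips are silent can be counted directly from the codon table. This is a minimal sketch using the standard genetic code written as the usual 64-letter string in T, C, A, G order (`*` marks stop codons). The function name `silent_counts` is made up, and treating stop-to-stop changes as silent is a choice of this example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def silent_counts():
    """Count single-base substitutions that keep the encoded symbol, per position."""
    counts = [0, 0, 0]
    for codon, aa in CODON_TABLE.items():
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == aa:
                    counts[pos] += 1
    return counts


counts = silent_counts()
total = 64 * 3 * 3  # every codon, every position, three alternative bases
print(counts, sum(counts) / total)
```

Nearly all of the silent substitutions fall on the third codon position, which is where the redundancy of a 64-codon table for about 20 amino acids mostly sits.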
        
             | go_elmo wrote:
             | True, iff the error persists through the correction cycles
             | that are present; how exactly they work, or whether they're
             | comparable with ECCs, I don't know.
        
           | icoder wrote:
           | DNA mostly relies on the fact that there's 2 strands that are
           | (logically speaking) a mirror copy of each other (a C is
           | paired with a G and vice versa, an A to a T and vice versa),
           | it's like RAID 3 with only 2 disks (one being parity).
           | 
           | Apart from repairing structural damage such as missing bonds,
           | the cell can even repair missing bases or non-straight breaks
           | without loss. This mechanism is also used for replication:
           | the entire strand is split and each half is completed with
           | its mirror counterpart.
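The mirror-copy idea can be sketched in a few lines of Python. This is a deliberately simplified model: strands are compared position by position, strand orientation (the two strands run antiparallel in real DNA) is ignored, and an unreadable base is marked with `N`. The function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Return the base-paired partner of a strand (orientation ignored)."""
    return "".join(COMPLEMENT[base] for base in strand)


def repair(damaged, partner):
    """Fill unreadable positions ('N') in one strand using the paired strand."""
    return "".join(
        COMPLEMENT[p] if d == "N" else d
        for d, p in zip(damaged, partner)
    )


original = "ATGCCGTA"
partner = complement(original)
print(repair("ATNCCNTA", partner))
```

The limitation is visible in the model as well: if both strands are damaged at the same position, there is nothing left to copy from.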
        
             | go_elmo wrote:
             | I'm aware of that, but was rather thinking about ECCs like
             | Hamming codes, which are able to correct single sequences of
             | info based on surplus info in that same string.
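For readers who have not met Hamming codes, here is a minimal Hamming(7,4) sketch in Python: four data bits plus three parity bits in the same word, enough to locate and fix any single flipped bit. It illustrates the kind of in-string redundancy meant here and makes no claim that DNA implements anything like it. The function names are made up for this example.

```python
def hamming74_encode(data):
    """Encode 4 data bits as 7 bits laid out as p1, p2, d1, p4, d2, d3, d4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(code):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4  # 1-based index of the bad bit, 0 if none
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


word = hamming74_encode([1, 0, 1, 1])
word[5] ^= 1  # simulate a single-bit error
print(hamming74_decode(word))
```

The syndrome bits spell out the position of the error in binary, which is why the parity bits sit at positions 1, 2 and 4.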
        
               | dekhn wrote:
               | Nothing algorithmically sophisticated, but DNA repair
               | enzymes already do this.
        
         | usrbinbash wrote:
         | > Like, why didn't DNA end up as base 3?
         | 
         | Why did we end up with only 20 proteinogenic amino acids? Why
         | are vertebrate neural architectures inverted (cell bodies on
         | the inside, connections on the outside), even though the other
         | way round (e.g. how a squid's brain is organised) is easier
         | and less inhibitive to growth?
         | 
         | 2 Reasons:
         | 
         | a) Because nature and evolution cannot engineer. Random
         | mutation, recombination and natural selection are the only
         | mechanisms available. Things get selected if they outcompete
         | existing alternatives, they don't need to be the best
         | solutions.
         | 
         | b) All solutions have to be built by modifying what already
         | exists. Evolution doesn't get to do greenfield projects,
         | because anything that has to start from scratch is so
         | disadvantaged in natural selection compared to already evolved
         | complex life, it will fail.
         | 
         | This leads to systems that, from an engineering point of view,
         | don't always make a lot of sense.
         | 
         | Eg. the architecture of the vertebrate neural system creates a
         | lot of issues (eg. our light sensitive cells point in the wrong
         | direction). The only way this makes any sense is when one looks
         | at how the neural tube (the precursor to the backbone) is
         | formed by the ectoderm folding in on itself. This process is
         | so deeply at the root of the Chordata, and so many other things
         | depend on it, that it simply cannot change any more.
         | 
         | Many many biological systems are "legacy systems" in the truest
         | sense of the word: Solutions produced a long time ago that may
         | have many problems, but are simply too deeply enmeshed with
         | everything that came after, that they are now impossible to
         | change.
        
           | function_seven wrote:
           | > _(eg. our light sensitive cells point in the wrong
           | direction)_
           | 
           | Can you expand on that? Are you talking about front-facing
           | eyes vs. birds' eyes? Or something else like retinal
           | structure?
        
             | dtgriscom wrote:
             | Retinal structure:
             | 
             | https://en.wikipedia.org/wiki/Retina#Inverted_versus_non-
             | inv...
        
             | metabagel wrote:
             | I had to look this up, and I guess what usrbinbash was
             | referring to was the layout of the retina, which places the
             | rods and cones behind layers of transparent neurons.
             | 
             | https://en.wikipedia.org/wiki/Retina#/media/File:Retina-
             | diag...
             | 
             | Edit: ninja'd
        
               | dekhn wrote:
               | Yet, it doesn't really have a strong impact as it's been
               | determined that humans can see individual photons and we
               | aren't dependent on night vision for hunting.
        
           | t_serpico wrote:
           | A classic armchair response. DNA has complementary
           | nucleotides (AT, GC) that facilitate its pairing. Base 3
           | wouldn't work in that sense. Also, you can't forget about the
           | genetic code. See https://arxiv.org/pdf/q-bio/0605036.pdf for
           | interesting thoughts. Remember, evolutionary biology is a
           | field and people think about these questions!
        
             | usrbinbash wrote:
             | > Base 3 wouldn't work in that sense.
             | 
             | That's true, but a) not the point I am making, and b) I am
             | pretty sure it says nowhere in my post that it would.
        
             | idiotsecant wrote:
             | This is pretty smug for someone who seems to have managed
             | to miss the point entirely. Yes, DNA has certain features
             | that require a base 4 system. That is not necessarily true
             | of all possible systems with DNA-equivalent function, which
             | is the point this whole thread is making.
        
               | hotstickyballs wrote:
               | If you steelman the argument, then it's an error-correction
               | argument, in that this simple ECC method can be what
               | favours a base-4 encoding instead.
        
               | t_serpico wrote:
               | How have I missed the point? The answer that nature
               | cannot engineer and can't start de novo are trivially
               | true statements that provide no actual insight into the
               | question. I fully agree the original question itself is a
               | deep one. A quick literature search is more productive
               | than pontificating with weak analogies. See
               | https://www.math.unl.edu/~bdeng1/Papers/DengDNAreplication.p...
               | for what seems to be an interesting analysis regarding base
               | number and DNA replication rate.
        
               | usrbinbash wrote:
               | > that provide no actual insight into the question
               | 
               | Mind elaborating on that?
               | 
               | Because there is no biochemical reason why DNA could not
               | have incorporated, say, a third pairing pair, so while
               | base-3 (which I don't specifically mention in my post
               | btw.) wouldn't work, base 6 or 8 would have been
               | possible. "Unnatural Base Pairs" are even known to work
               | in laboratory settings.
               | 
               | There is also no biochemical reason why base2 life
               | wouldn't work. Expand the reading frame of the
               | translation machinery to 5 instead of three, and you have
               | enough coding space for polypeptides.
               | 
               | My answer addresses the question completely, because the
               | only reason behind these "decisions" is an ancient system
               | that simply got "frozen", and now cannot change any more.
        
               | sterlind wrote:
               | _> There is also no biochemical reason why base2 life
               | wouldn't work._
               | 
               | are you sure about that? are you sure there's no weird
               | effects that might destabilize very long sequences of
               | 2-nucleotide DNA? or on how wide DNA-binding domains have
               | to be to cope with reduced information density, and how
               | that might sterically hinder smaller arrangements of
               | proteins?
               | 
               |  _> My answer addresses the question completely, because
               | the only reason behind these "decisions" is an ancient
               | system that simply got "frozen", and now cannot change
               | any more._
               | 
               | your answer is just a hypothesis, not a proof. these
               | things can be studied (by studying abiogenesis in-vitro),
               | and it's not certain these decisions were "flash frozen"
               | like you describe. 2-, 4-, and 6- nucleotide coding
               | systems might have coexisted in the RNA world, and 4-
               | could have won out for some reason.
        
           | sparrowInHand wrote:
           | Short answer: Likelihood of noise (brownian motion) producing
           | the element and keeping it interacting. Then once it gets
           | going, likelihood of keeping state, while interacting.
        
         | throwawaymaths wrote:
         | It's probably not base four because you have to stretch out
         | more pairs to match up four pairs and that's entropically
         | disfavored. However ribosomes can accommodate a four pair
         | matching, though at a very reduced yield (unless you think
         | Schultz's postdoc fabricated those data)
        
         | asdff wrote:
         | On paper this might be an interesting game, but you have to
         | think of things in terms of crystal structure, what is able to
         | form hydrogen bonds, what ends up being sterically hindered and
         | what that means for the molecule. This is why Watson, Crick,
         | and Franklin's work was so seminal: it showed how genetic
         | information was inherited through mechanical logic of these
         | molecules alone. Before the structure of DNA was solved, there
         | were a lot of competing theories over what molecule was the
         | source of heritable information, and how this information was
         | exactly passed down between generations.
        
       | p0w3n3d wrote:
       | Exploit idea: create an image which, when photographed, would
       | be written to DNA as a virus.
       | 
       | (I know viruses are RNA)
        
         | pyinstallwoes wrote:
         | Do a Quine now!
        
         | luckystarr wrote:
         | And create a never to be deleted record of images across the
         | infected population?
        
           | dormento wrote:
           | Cue "tasteless porn in bitcoin blockchain forever".
        
             | TeMPOraL wrote:
             | Luckily, DNA is mutable over generations, so all such noise
             | can and will eventually be filtered out.
        
         | dillydogg wrote:
         | There are plenty of DNA viruses. They aren't limited to RNA at
         | all
        
       | swamp40 wrote:
       | I've always wondered if the plant/animal shapes and sizes were
       | represented literally in a 3D mapping of the DNA. Like we could
       | already have a picture of what it will become, if we could just
       | decode the DNA sequences properly.
       | 
       | Like we have the sequence of numbers for a jpg, but we've never
       | seen the picture.
        
       | K0balt wrote:
       | Am I alone in thinking that using regular DNA is a terrible idea
       | for data storage?
       | 
       | I mean, that would make your storage medium a potential
       | biohazard. Although it probably would all be cool until someone
       | put smallpox.bin on a major torrent tracker.
       | 
       | If it's really that good, we should come up with a variant using
       | slightly different chemistry so that biocontamination is not a
       | factor.
        
         | throwawaymaths wrote:
         | > that would make your storage medium a potential biohazard
         | 
         | generally no. If you're worried about random DNA being a
         | biohazard there are way worse things to worry about, like how
         | your immune system uses random stretches of biologically primed
         | dna to create antibody diversity.
         | 
         | The real reasons why it's terrible is that write speed is
         | atrocious and read speed is bad (on the order of 2-3x that of
         | amazon glacier's robotic tape handlers, with WAY MORE expensive
         | robots, and way more expensive cost to read -- you're bulk
         | polluting rivers in china to make the reagents).
         | 
         | The only use case I can think of is deep generational archival
         | (like the svalbard seed bank, but for information). Where cost
         | to store by volume is at a premium, and where you'd like to
         | have many many many copies, and you don't mind the cost to read
         | because you won't be reading it but for every 10 or so years,
         | if even.
         | 
         | Store your logs in DNA. You're never going to read them
         | anyways.
        
           | asdff wrote:
           | Having DNA as a storage medium is the best way to store
           | actual biological data. Currently, we do things like having
           | seedbanks, which need periodic replacing as seeds grow to be
           | nonviable. A library of genomes is a much smaller physical
           | footprint than a seedbank. It doesn't need periodic
           | replacing, provided it's not getting bombarded by radiation or
           | anything unusual like that. DNA doesn't even have to be
           | stored frozen; you can freeze dry it and store it at room
           | temp for a very long time before any significant degradation.
           | You can also just have the sequence stored digitally, and
           | synthetically build out the dna molecule as you need it (I
           | think this is still pretty costly though and not that
           | efficient). With the right molecular biological tooling, one
           | could conceivably introduce these genomes into a plant cell
           | line and grow them up in tissue culture, you don't have to
           | for example grow a tree and let it mature and go to seed
           | since plant cells are pluripotent, everything can be done in
           | a lab much faster.
        
             | throwawaymaths wrote:
             | > Having DNA as a storage medium is the best way to store
             | actual biological data
             | 
             | 100% agreed. I thought that was obvious, was mostly sniping
             | at "DNA based digital storage startups", thanks for
             | clarifying for me
        
         | fartsucker69 wrote:
         | imagine downloading a tv show and it turns out its shit
        
       | justsocrateasin wrote:
       | Reminiscent of a very interesting company I interviewed for last
       | year called Cache DNA
       | 
       | https://www.cache-dna.com/
       | 
       | This is the future. I don't think it will look exactly like this,
       | and I don't think it will be here any time soon, but I'm excited
       | to see these advancements.
       | 
       | What Cache is doing presently is trying to do archival storage in
       | DNA - it has a lot of potential to be cheaper, more energy
       | efficient, and more redundant. But some of the processes still
       | aren't there yet.
        
         | ray__ wrote:
         | Even just storing family photos would require DNA sequences
         | that are orders of magnitude larger than the human genome, so
         | you're going to be looking at very expensive or very time
         | consuming read/write (and certainly no instant read write at
         | any cost; the turnaround time can't be less than hours, even
         | for small files, even with high-end HTS or nanopore approaches
         | afaik). What is the plan for getting around this?
        
       | LearningToWalk wrote:
       | I've been up close to one project working in this space. The
       | obstacles are obviously many, but fascinating to see that
       | progress is made nonetheless. Clearly a piece of
       | the-future-puzzle.
        
       | js8 wrote:
       | It reminds me of the children's story
       | https://en.m.wikipedia.org/wiki/The_Mystery_of_the_Third_Pla...,
       | which had flowers that captured the surroundings in layers, like
       | a film camera.
        
       | stainablesteel wrote:
       | this is like spy technology, super cool
        
       | OscarTheGrinch wrote:
       | Man, I have a hard enough time trying to keep track of a micro SD
       | card, imagine misplacing your DNA based files?
       | 
       | Seriously tho, using DNA as an information storage medium is a
       | pretty neat concept.
        
         | Borrible wrote:
         | >using DNA as an information storage medium is a pretty neat
         | concept
         | 
         | And billions of years old.
        
           | DeathArrow wrote:
           | > And billions of years old.
           | 
           | With not quite good backup strategies.
        
             | dylan604 wrote:
             | Could make for some interesting decoding errors as your
             | original data mutates
        
             | asdff wrote:
             | Having an error rate means a small chance of gaining an
             | edge that makes up for having it.
        
             | Borrible wrote:
             | DNA/RNA looks to be more like storing heuristics, landmarks
             | and clues, not data.
        
               | snitty wrote:
               | Can you elaborate on that? A significant portion of DNA
               | in organisms literally encodes for protein sequences. It
               | also has functional parts (binding sites for proteins,
               | promoter sequences). Some RNAs are not translated because
               | the RNA itself has function, but I don't see that same
               | argument for DNA.
        
               | asdff wrote:
               | Only like 1.5% of the human genome is protein coding.
        
             | Scarblac wrote:
             | If you create your data right, _the actual data_ can make
             | backups of itself. There are even built-in ways for it to
             | improve itself over time using genetic algorithms.
        
               | TeMPOraL wrote:
               | A kind of implied meaning of the term "data", especially
               | in context of storage and archiving, is that we _do not_
               | want it to "improve itself".
        
               | urfullofsht wrote:
               | [dead]
        
           | Daub wrote:
           | It's not often that HN makes me laugh.
        
             | jdsalaro wrote:
             | You could even say we're talking about a legacy storage
             | medium ;)
        
               | jacquesm wrote:
               | Ok, who ate the family movie archive?
        
       | candiodari wrote:
       | Just so we're clear, this is ONE pixel per, I don't know, 10000
       | cells or so. So one bit per DNA chain, with that bit repeated
       | thousands of times to get redundancy. Still an incredible
       | achievement.
        
         | fxtentacle wrote:
         | The super neat thing is that they tag each DNA chain with the
         | pixel coordinates, so you can afterwards mix those 10,000 DNA
         | strands each for all 96 pixels into one 1-mio-DNA-strand-soup
         | and still recover the image successfully.
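         | 
         | A toy model of why the coordinate tag makes pooling safe.
         | This is not the paper's actual barcoding scheme: the
         | (pixel_index, bit) strand representation, the 100 copies per
         | pixel and the 5% error rate are all assumptions chosen for
         | illustration.

```python
import random
from collections import Counter, defaultdict


def make_pool(pixels, copies=100, error_rate=0.05, seed=0):
    """Return a shuffled pool of (pixel_index, bit) strands with random bit errors."""
    rng = random.Random(seed)
    pool = []
    for index, bit in enumerate(pixels):
        for _ in range(copies):
            # Occasionally flip the stored bit to mimic a bad strand.
            noisy = bit ^ 1 if rng.random() < error_rate else bit
            pool.append((index, noisy))
    rng.shuffle(pool)  # mix every strand into one "soup"
    return pool


def decode_pool(pool, n_pixels):
    """Group strands by their coordinate tag and majority-vote each pixel."""
    votes = defaultdict(Counter)
    for index, bit in pool:
        votes[index][bit] += 1
    return [votes[i].most_common(1)[0][0] for i in range(n_pixels)]


if __name__ == "__main__":
    rng = random.Random(1)
    image = [rng.randint(0, 1) for _ in range(96)]  # 96 one-bit pixels
    pool = make_pool(image)
    print("recovered:", decode_pool(pool, len(image)) == image)
```

         | Because every strand carries its own address, the order of the
         | pool is irrelevant, and the redundancy absorbs the occasional
         | bad strand.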
        
           | jojobas wrote:
           | >successfully
           | 
           | Except when it recombines weirdly and gets mixed up as per
           | the article.
        
             | tough wrote:
             | Those are just mutations in the image
        
               | TeMPOraL wrote:
               | "Hallucinations" seems to be the modern term.
               | 
               | /s, but only slightly.
        
       | usrbinbash wrote:
       | > DNA synthesis remains a bottleneck in the adoption of DNA as a
       | data storage medium.
       | 
       | Yes, one of many.
       | 
       | Another one is a simple question: _What exactly is the use case
       | again?_ Because, storage isn't something we lack. Especially
       | when talking about storage where, obviously, fast random access
       | isn't a requirement, aka. data archiving.
       | 
       | We have good solutions for that; an LTO-9 tape can hold 18TiB of
       | data native and up to 45 TiB of data compressed, with denser
       | capacities planned:
       | https://en.wikipedia.org/wiki/Linear_Tape-Open
        
         | pcrh wrote:
         | Encode wikipedia into DNA, then insert it into a horseshoe
         | crab. In a few million years it may still be around to be
         | decoded.
         | 
         | >The fossil record of Xiphosura goes back over 440 million
         | years to the Ordovician period, with the oldest representatives
         | of the modern family Limulidae dating to approximately 250
         | million years ago during the Early Triassic. As such, the
         | extant forms have been described as "living fossils".[9]
         | https://en.wikipedia.org/wiki/Horseshoe_crab
        
         | Aardwolf wrote:
         | Some number I found online, while trying to multiply the 30
         | trillion human cells with the data storage of DNA per cell:
         | 
         | "one gram of dried DNA can store 455 exabytes of data"
         | 
         | Seems like a pretty sweet use case to me!
         | 
         | I definitely do lack storage by the way. Say I want to download
         | the common crawl data set, 380 TiB. And for redundancy I'd need
         | multiple copies of the data too. That's a lot of disks to keep
         | at home. "18TiB ought to be enough for everyone" really
         | doesn't cut it.
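         | 
         | A back-of-the-envelope sketch of what that density would mean
         | for the 380 TiB data set. It takes the quoted 455 exabytes
         | per gram at face value and assumes 1 EB = 10**18 bytes; the
         | function name is made up for illustration.

```python
BYTES_PER_GRAM = 455e18   # "455 exabytes per gram", assuming 1 EB = 10**18 bytes
BYTES_PER_TIB = 2 ** 40


def grams_of_dna(size_tib: float, copies: int = 1) -> float:
    """Theoretical DNA mass needed to hold `copies` copies of `size_tib` TiB."""
    return size_tib * BYTES_PER_TIB * copies / BYTES_PER_GRAM


if __name__ == "__main__":
    print(f"{grams_of_dna(380) * 1e6:.2f} micrograms for one copy")
    print(f"{grams_of_dna(380, copies=3) * 1e6:.2f} micrograms for three copies")
```

         | At that theoretical density a single copy would weigh a bit
         | under a microgram. It says nothing about read or write cost.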
        
           | usrbinbash wrote:
           | > one gram of dried DNA can store 455 exabytes of data
           | 
           | Yes, and half a gram of Hydrogen could produce ~500 Megawatts
           | of power in a fusion reactor. However, that theoretical value
           | will remain irrelevant, as long as we cannot build a
           | practically useful fusion reactor. And even if we could build
           | one, it still has to compete with all other forms of
           | producing power for scalability, reliability, efficiency and
           | cost.
           | 
           | The fact that there is a very high theoretical number that
           | seems really impressive, isn't a use case.
           | 
           | So, with that being said: how long does it take to write
           | these 455EiB? How long does it take to read them? How error
           | prone are both processes? And how much does it cost to
           | write/read them?
           | 
           | > "18TiB ought to be enough for everyone" really doesn't cut
           | it.
           | 
           | Pretty sure I never said that.
           | 
           | Also pretty sure common crawl can be compressed. Even
           | assuming only a 2:1 compression rate, that means it fits
           | comfortably on 11 LTO-9's. Now, a quick google-search churned
           | out tape prices of about 110-140 $ per LTO-9. Let's say ~150$
           | per tape, that means the whole thing fits on 1650 $ worth of
           | storage. About 5000 bucks with 2 backups included. Double
           | that for uncompressed storage.
           | 
           | Alright, so how does that compare to DNA storage?
           | 
           | https://www.nanalyze.com/2023/03/dna-data-storage-solution/
           | 
           | quote:
           | 
           |  _These days, it costs $600 to sequence a complete genome
           | which contains around 200 gigabytes of data or about $3 per
           | gig. Today, magnetic tape technology offers the lowest
           | purchase price of raw storage capacity at around two cents
           | per gigabyte_
           | 
           | end quote.
           | 
           | So just _reading_ the 380 TiB back from uncompressed storage
           | _ONCE_, would cost ~1,140,000 dollars.
           | 
           | And that's just for reading. At a price differential that is
           | measured in multiple orders of magnitude, a technology better
           | offer some REALLY good, REALLY tangible advantages to
           | compete.
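           | 
           | The arithmetic above, as a small sketch so the numbers can
           | be re-run with other assumptions. The constants are the
           | figures quoted in this comment; the function names are
           | made up, and 1 TiB is rounded to 1000 GB as in the estimate
           | above.

```python
import math

TAPE_TIB = 18               # LTO-9 native capacity as quoted upthread
TAPE_PRICE_USD = 150        # rounded-up price per tape from the comment
SEQUENCING_USD_PER_GB = 3   # from the quoted sequencing figure


def tape_cost(dataset_tib, compression=2.0, copies=1):
    """Return (tapes per copy, total price) for `copies` full copies on tape."""
    tapes = math.ceil(dataset_tib / compression / TAPE_TIB)
    return tapes, tapes * copies * TAPE_PRICE_USD


def dna_read_cost(dataset_tib, gb_per_tib=1000):
    """Price of sequencing the whole data set once at the quoted per-GB rate."""
    return dataset_tib * gb_per_tib * SEQUENCING_USD_PER_GB


if __name__ == "__main__":
    print("tape, 1 copy:   ", tape_cost(380))
    print("tape, 3 copies: ", tape_cost(380, copies=3))
    print("DNA, one read:  ", dna_read_cost(380))
```

           | With three tape copies at 2:1 compression the total is
           | 4950 $, against 1,140,000 $ for a single DNA read.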
        
             | Aardwolf wrote:
             | I of course wouldn't want to store my data in there today,
             | I wouldn't even trust that I get it back reliably because
             | DNA reading comes with a relatively big error rate for
             | storage purposes (of course error correction can mitigate
             | that). But it would be cool if the technology progresses.
             | All technology, including disks, magnetic tapes, and new
             | alternatives. Whether DNA is viable in the end or not, I
             | don't know. I do know that tech always has been progressing
             | and new alternatives are sometimes found, and that I do see
             | a use for more storage.
             | 
             | But an argument whether DNA is a viable option in the
             | future or not would have to say technically what the issue
             | of DNA is with future tech.
             | 
             | Whether it's more expensive today, or that there's no need
             | for more data today, are not really arguments against it.
             | 
             | I do not intend to be arguing for snake oil or anything
             | here though. If "DNA storage" is in a similar category of
             | "perpetual motion machines" and "cars that run on tap
             | water" then count me out.
             | 
             | I don't even know how our comments ended up being like
             | arguing against each other. The only thing really I didn't
             | agree with in the original comment was "Because, storage
             | isn't something we lack", because I do find it lacking,
             | both at home and in the cloud.
        
         | speed_spread wrote:
         | Let's take a moment to appreciate how the classic "bandwidth of
         | station wagon filled with tapes" scales with tape technology.
         | Too bad there aren't many station wagon choices nowadays but I
         | guess any minivan would do in a pinch.
        
         | piyh wrote:
         | Can your tapes self replicate? /s
         | 
         | The goal of replacing memory cards is dumb, the tech that
         | enables the storage is a foundational step forward in bio
         | engineering.
        
       ___________________________________________________________________
       (page generated 2023-07-10 23:01 UTC)