[HN Gopher] A biological camera that captures and stores images ...
___________________________________________________________________
A biological camera that captures and stores images directly into
DNA
Author : giuliomagnifico
Score : 171 points
Date : 2023-07-10 08:26 UTC (14 hours ago)
(HTM) web link (www.nature.com)
(TXT) w3m dump (www.nature.com)
| ryanjamurphy wrote:
| Related: The Verge's DNA time capsule [0].
|
| [0]: https://www.theverge.com/c/22173998/dna-time-capsule
| gsam wrote:
| Anyone else think that there is already primitive image data
| encoded in biological data? Essentially basic shapes and patterns
| which are passed down semi-generationally.
| dekhn wrote:
| I do not know of any true "image" data. Most complex patterns
| in nature are created by generative processes rather than
| direct encoding.
| stevezsa8 wrote:
| I guess it's possible if this conferred some survival
| advantage.
|
| It can be useful to work from the evidence to a conclusion
| instead of the other way round.
|
| But wondering and philosophising can be fun :]
|
| It would be cool if humans could pass knowledge via their
| offspring. But I always get worried thinking if I'm the
| asshole, I wouldn't want my kid to be one too.
| anigbrowl wrote:
| _I always get worried thinking if I'm the asshole, I
| wouldn't want my kid to be one too._
|
| If you were, you would.
| yieldcrv wrote:
| Just has to not be a disadvantage.
|
| Plenty of mutations have no purpose whatsoever and were
| unrelated to survival or manifest after reproduction so are
| not selected for or against.
| dghughes wrote:
| Anton Petrov has a recent video on YouTube. I haven't watched it
| yet, but its title is "Could Life Be Transmitted Via Radio
| Waves? Information Panspermia". Just a bit of fun, I'm sure;
| Anton isn't too wild. He puts out some interesting videos, but
| not in a way that pushes quackery.
|
| https://www.youtube.com/watch?v=K4Zghdqvxt4
| nico wrote:
| Fascinating, thank you
|
| Recently here on HN someone posted a quote saying something
| like "if you shine light at something for a long enough time,
| don't be surprised if you end up getting a plant"
|
| It was about how the environment seems to reorganize in
| certain ways to use up energy (the latest Veritasium video
| about entropy also talks about this)
| f6v wrote:
| I think it would have very high energy requirements. For this
| trait to survive over generations there would need to be a
| tremendous evolutionary benefit. What would that be for a
| "primitive image data"?
| yetihehe wrote:
| Maybe things like "long green shape" (cats' fear of cucumbers
| because they resemble snakes), or "a series of black and
| yellow stripes", or even "a black blob with many appendages"
| to watch out for spiders? Encoding some primitive image data
| so that further generations know what to avoid or pursue
| seems like a very tremendous evolutionary benefit.
| techdragon wrote:
| Yeah, I expect this isn't going to be how that sort of
| mechanism works, but it's always been an interesting
| concept for me, that while "genetic memory" as presented in
| much fiction is extremely unlikely just from the sheer
| entropic hill such mechanisms would have to evolutionarily
| climb to be able to pass on so much information (on top of
| the baseline necessary information for reproduction, the
| majority of memory won't on average confer a lot of
| reproductive advantages, so it's statistically more likely
| to get optimised out by the random mistakes of evolution,
| hence entropically "uphill") ...
|
| Yet while this fictional form is unlikely we have quite a
| lot of good examples and evidence for "inherited
| information". You have to be careful with it since it's too
| easy to accidentally include side channels for organisms to
| learn the information and thus break the test. Such as
| insects being genetically driven towards food by smell at a
| molecular chemical interaction level, and the smell
| becoming associated with the information you wish to test.
| A bee colony can't be reliably tested unless you raise it
| from a new queen in an odourless environment if you wish to
| see if bees genetically know that the shape of a flower is
| associated with food. It's tough to subtract the potential
| that a colony will have learned and "programmed" later
| generations of bees with things like the classic waggle
| dancing in order to more efficiently gather food.
|
| We do have good ones though like cats and snake shaped
| objects, it's surprisingly consistent, and pops up in some
| other animal species. It's wired into our brains a bit to
| watch out for such threats. There's a significant bias
| towards pareidolia in human brains and it's telling how
| deeply wired we have some of these things, but it is there
| and study shows it seems to form well before our cognitive
| abilities do... these all have some obvious reproductive
| advantages however so it makes sense that the "instinct"
| would be preserved over generations as it confers an
| advantage. But it's still impressive that it can encode
| moderately complex information like "looks like the face of
| my species" or "cylindrical looking objects on the ground
| might be dangerous"... even if it's encoded in a lossy
| subconscious instinctual level.
| TeMPOraL wrote:
| > _But it's still impressive that it can encode
| moderately complex information like "looks like the face
| of my species" or "cylindrical looking objects on the
| ground might be dangerous"... even if it's encoded in a
| lossy subconscious instinctual level._
|
| I think it helps that the encoding does not have to be
| transferable in any way. This kind of "memory" has no
| need for portability between individuals or species - it
| doesn't even need to be factored out as a thing in any
| meaningful sense. I.e. we may not be able to isolate
| where exactly the "snake-shaped object" bit of instinct
| is stored, and even if we could, copy-pasting it from a
| cat to a dog wouldn't likely lead the (offspring of the)
| latter to develop the same instinct. The instinct
| encoding has to only ever be compatible with one's direct
| offspring, which is a nearly-identical copy, and so the
| encoding can be optimized down to some minimum tweaks -
| instructions that wouldn't work in another species, or
| even if copy-pasted down a couple of generations of one's
| offspring.
|
| (In a way, it's similar to natural language, which
| rapidly (but not instantly) loses meaning with distance,
| both spatial/social and temporal.)
|
| In discussing this topic, one has to also remember the
| insight from "Reflections on Trusting Trust" - the
| data/behavior you're looking for may not even be in the
| source code. DNA, after all, isn't a _universal, abstract
| descriptor of life_. It's code executed by a complex
| machine that, as part of its function, copies itself
| along with the code. There is lots of "hidden"
| information capacity in organisms' reproduction
| machinery, being silently passed on and subject to
| evolutionary pressures as much as DNA itself is.
| techdragon wrote:
| Oh absolutely... and that's a great analogy for the more
| computer oriented, "Reflections on Trusting Trust"
| highlights how it can be the supporting infrastructure of
| replication that passes on the relevant information... a
| compiler attack like that is equivalent to things like
| epigenetic information transfer... and for fun bonus
| measure since it came to mind... the short story Coding
| Machines goes well for really helping to never forget the
| idea behind "Reflections on Trusting Trust"
| https://www.teamten.com/lawrence/writings/coding-machines/
|
| It definitely would be minimised data transfer, be it via
| an epigenetic nudge that just happens to work by sheer
| dumb luck because of some other existing mechanism or a
| sophisticated DNA driven growth of some very specific
| part of the mammalian connectome that we do not yet
| understand because we've barely got the full connectome
| maps of worms and insects, mammals are a mile away at the
| moment... no matter the mechanism evolution will have
| optimised it pretty heavily, simply for information
| robustness reasons, fragile genetic/reproductive
| information transfer mistakes that work, break and get
| optimised out in favour of the more robust ones that
| don't break and more reliably pass on their advantage.
| f6v wrote:
| You need to compare that with an alternative solution where
| this information is learned by each generation and then
| assess the survival advantage of having it encoded in DNA.
| This is outside my field and I don't have a strong opinion.
| xk_id wrote:
| Once upon a time, the Wikipedia article about Hacking provided
| the following as sort of "canonical" example of hacking: using an
| optical mouse as barcode scanner. In some ways, this incredible
| paper feels like an iteration of that example.
| throwawaymaths wrote:
| Man what a confusing title. It's not a single strand of DNA that
| gets the image information. You get a pool of DNA, which
| collectively holds the image information.
|
| This is done in a pretty obvious way: each "pixel" is a well in,
| e.g., a 96-well plate. You expose the bacteria in these wells to
| different light, the DNA transformation is triggered by
| the light, and then you harvest the DNA from the bacteria and get
| your image pool library.
| PaulHoule wrote:
| Would be neat if you could somehow splice them into a sequence
| but I think you'd have some alternating sequences that
| determine position and sequences that really code information.
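A minimal sketch of the position-plus-payload idea above, assuming each well gets a short address sequence followed by a short data sequence, with two bits per base. The names `tag_pixel` and `read_pixel`, the 4-base address, and the 1-base payload are all hypothetical choices for illustration; this is not the encoding used in the paper.

```python
BASES = "ACGT"  # digit 0..3 -> base (an arbitrary, assumed mapping)


def int_to_dna(value: int, length: int) -> str:
    """Write a non-negative integer as a fixed-length base-4 string of bases."""
    digits = []
    for _ in range(length):
        value, digit = divmod(value, 4)
        digits.append(BASES[digit])
    return "".join(reversed(digits))


def dna_to_int(seq: str) -> int:
    """Inverse of int_to_dna."""
    value = 0
    for base in seq:
        value = value * 4 + BASES.index(base)
    return value


def tag_pixel(well_index: int, intensity: int) -> str:
    """Hypothetical layout: 4 address bases (0..255) + 1 payload base (0..3)."""
    return int_to_dna(well_index, 4) + int_to_dna(intensity, 1)


def read_pixel(seq: str) -> tuple[int, int]:
    """Split a tagged sequence back into (well_index, intensity)."""
    return dna_to_int(seq[:4]), dna_to_int(seq[4:])
```

Four address bases give 256 positions, which is more than enough for a 96-well plate; pooled reads could then be sorted back into an image by address.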
| aclatuts wrote:
| Assassins Creed is becoming reality soon.
| throwawaymaths wrote:
| yes, I was very impressed by the title, but once I dug into
| this it was sort of like "well we could probably have
| accomplished this ~10 years ago when optogenetics first came
| out". Definitely a situation where branding the title with
| something "so silly that no one did it before" got it
| noticed.
| JumpCrisscross wrote:
| Something like this was posited in Banks' _Excession_ , and I
| remember thinking how advanced passing messages via DNA embedded
| in bacteria seemed.
| stevenwoo wrote:
| It's also a key plot point in Tchaikovsky's recent Children of
| ... trilogy.
| boffinAudio wrote:
| [flagged]
| muzani wrote:
| [flagged]
| hallihax wrote:
| This is great and all, but the merge process is still messy as
| hell
| [deleted]
| jvanderbot wrote:
| Can't wait until they do a worldwide investigation to find
| patient zero from the selfie encoded in the next pandemic.
| codetrotter wrote:
| > Can't wait until they do a worldwide investigation to find
| patient zero from the selfie encoded in the next pandemic.
|
| Then they realise the picture is this dude:
| https://blogs.loc.gov/loc/2022/07/robert-cornelius-and-the-f...
| asimpletune wrote:
| Something that I always found fascinating is how DNA is a base 4
| information format. There's this thing called radix economy,
| which is basically an expression of how efficient a number system
| is. Base e is the theoretical maximum, and so base 3 is the
| closest integer.
|
| Obviously if you have a special use case, then that may dominate
| your radix economy (like hex, b64, etc...), but for general
| information purposes, the order is base 3, base 4, then base
| 2.
|
| This presents a lot of interesting questions to me. Like, why
| didn't DNA end up as base 3? (probably because 4 naturally lends
| itself to pairs of 2).
|
| Also, this idea of radix economy goes beyond just the encoding of
| information and is represented in logical economy as well. So for
| example, ternary logic is (much) more efficient than binary
| logic. Having that 3rd state just makes problem solving much more
| elegant.
|
| To that end, I have always wondered how nature has exploited this
| 4-state number system logically. Like, are there all sorts of
| exotic logic gates that come from a 4 state system?
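A small sketch of the radix economy comparison discussed above, assuming the usual asymptotic cost measure b / ln(b) (lower is better, minimised at b = e). The function name `radix_economy` is made up for this example. One thing the numbers show is that base 2 and base 4 come out exactly tied under this measure, so base 4 is not strictly ahead of base 2.

```python
import math


def radix_economy(base: int) -> float:
    """Asymptotic cost of representing numbers in `base`: b / ln(b)."""
    return base / math.log(base)


# Rank a few integer bases from most to least economical.
ranking = sorted(range(2, 11), key=radix_economy)

for b in (2, 3, 4, 10):
    print(f"base {b}: {radix_economy(b):.4f}")
```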
| dahfizz wrote:
| Why do my eyelashes, meant to protect my eyes, fall into my
| eyes? Why do my cheeks/tongue sometimes get in the way of my
| teeth so that I bite them? And why do they then get inflamed so
| that I continually bite them for the next few days?
|
| We are all a bunch of biological goop resulting from random
| processes. Don't expect optimal solutions from evolution. There
| is no "why".
| lurknot wrote:
| https://en.wikipedia.org/wiki/Chargaff%27s_rules
| dylan604 wrote:
| >So for example, ternary logic is (much) more efficient than
| binary logic. Having that 3rd state just makes problem solving
| much more elegant.
|
| Binary for electronics is obvious because there are 2 states in
| electric components: on or off. There is no 3rd option.
| nicoburns wrote:
| I believe "on" and "off" in electronics typically correspond
| to different voltage levels. So you absolutely could have a
| third intermediate state if you wanted to. Flash memory does
| this (and even sometimes has 4 states). I guess designing
| switches (transistors) that could take advantage of and
| propagate these extra states could be tricky though.
| ohwellhere wrote:
| https://en.wikipedia.org/wiki/Ternary_computer
| guerrilla wrote:
| > (probably because 4 naturally lends itself to pairs of 2).
|
| Why would pairs of two be favorable?
|
| > So for example, ternary logic is (much) more efficient than
| binary logic. Having that 3rd state just makes problem solving
| much more elegant.
|
| What do you have in mind here?
|
| > Like, are there all sorts of exotic logic gates that come
| from a 4 state system?
|
| I don't know but you may be interested in this [1].
|
| 1. https://en.wikipedia.org/wiki/Catu%E1%B9%A3ko%E1%B9%ADi
| PaulHoule wrote:
| DNA has the same limitation that many serial protocols have: if
| you repeat the same base pairs (e.g. "AAAAAAAAAAAAAAAAAAAAAA")
| you will have trouble w/ the DNA not spiraling correctly. Some
| sequences of 2-6 repeated base pairs seem to "deliberately"
| cause variant behavior in DNA and RNA, see
|
| https://en.wikipedia.org/wiki/Repeated_sequence_(DNA)
|
| Many real wire protocols have mechanisms to prevent repeated
| sequences entirely
|
| https://en.wikipedia.org/wiki/8b/10b_encoding
|
| DNA coding for real proteins is unlikely to be too terribly
| repetitive but I imagine a long alpha helix could have a repetitive
| amino acid sequence. Many amino acids can be coded with variant
| codons, I guess if repetition were a problem in a particular
| gene natural selection could step in.
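A minimal sketch of the kind of check a DNA storage encoder might run to avoid the long single-base runs mentioned above. The function names and the run limit of 3 are assumptions for illustration, not values from the thread or the paper.

```python
from itertools import groupby


def longest_run(seq: str) -> int:
    """Length of the longest stretch of one repeated base (0 for an empty string)."""
    return max((len(list(group)) for _, group in groupby(seq)), default=0)


def within_run_limit(seq: str, max_run: int = 3) -> bool:
    """True if no base repeats more than `max_run` times in a row (limit is assumed)."""
    return longest_run(seq) <= max_run
```

This is only the detection half; line codes such as 8b/10b go further and choose the encoding so that long runs cannot occur in the first place.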
| [deleted]
| jszymborski wrote:
| There's a lot of interesting things to consider.
|
| One, is that base 4 makes a lot of sense for the stability of
| DNA structures. You have two purines, two pyrimidines.
|
| Another is that partly because codons are degenerate, the
| distribution is way off a uniform distribution. For chemistry
| and mol bio reasons, the distribution of AGTC is very skewed.
|
| When I fully wake up, this might be a fun blog post to draft.
| Llamamoe wrote:
| Radix is important for digit-efficiency, but in a biological
| system that's not necessarily related to molecule size
| efficiency.
| pythonguython wrote:
| I'm also failing to see how digit efficiency would be
| important in DNA. In fact, it seems that a high base system
| would be more efficient. If you had 80 nucleobases instead of
| 4, each base pair would contain far more information
| asimpletune wrote:
| The efficiency comes from the ratio of the alphabet to the
| number of character places needed to express them.
| Otherwise why not base a million? Or a billion?
|
| This ratio is what leads to base e being the theoretical
| maximum.
| JumpCrisscross wrote:
| > _If you had 80 nucleobases instead of 4, each base pair
| would contain far more information_
|
| Which is a problem given DNA is a lossy format.
| toufka wrote:
| You (at least) have 3 systems that are optimized in concert in
| a (our) DNA/Protein world.
|
| DNA base set, Amino acid set, Translation layer between
| DNA/Proteins.
|
| Currently, we've got: 4 DNA bases, 3 bases/AA, 20 AAs; 4^3 =>
| 20
|
| If you change one of those numbers, you'll need to rejigger the
| rest, and you'd need to reoptimize. And there are competing
| goals which at least include: - maximize access to
| biophysical/chemical diversity - minimize energy expenditure to
| produce each component, chemically - minimize energy
| expenditure to both copy instructions & produce products -
| maximize information fidelity - minimize or at least degrade
| gracefully in the context of errors
|
| In the context of a 3-base system, you very well could throw
| off those optimizations given the consequences for the other 2
| parameters (#AA & nt/AA). 3^3 = 27, which is very close to the
| maximum of 20 amino acids. Which means you'd probably need a
| 4nt->AA translation layer to keep the same number of AAs, and
| that alone would add 30% more energy expenditure. If you kept
| the 3nt->AA system you'd BOTH need to reduce the number of
| accessible amino acids AND you'd lose some of the error
| correction mechanisms of having degenerate codons code for the
| same amino acid.
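The arithmetic behind the trade-off above can be sketched as follows, assuming the code must distinguish 21 symbols (20 amino acids plus a stop signal). The function name `min_codon_length` is hypothetical.

```python
def min_codon_length(n_bases: int, n_symbols: int = 21) -> int:
    """Smallest codon length L such that n_bases ** L >= n_symbols."""
    length = 1
    while n_bases ** length < n_symbols:
        length += 1
    return length


for n_bases in (2, 3, 4, 6):
    length = min_codon_length(n_bases)
    capacity = n_bases ** length
    print(f"{n_bases} bases: codon length {length}, "
          f"{capacity} codons, {capacity - 21} spare")
```

With 3 bases a 3-letter codon still fits (27 codons) but leaves only 6 spare for redundancy, versus 43 spare with 4 bases. Moving to 4-letter codons (81 codons) restores the slack at 4/3 the length, roughly a 33% increase per amino acid.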
| icoder wrote:
| DNA is not really processed like that, afaik. Mostly, each 3
| bases code for an amino acid, which are glued together to a
| string (protein), which folds in a 3D structure based on the
| characteristics of all amino acids.
|
| Some DNA is used to attract other proteins, or even interact
| with DNA elsewhere on the strand, or is translated to RNA (one-
| on-one) which can then have a function based on its sequence or
| the structure it folds into.
|
| Any 'logic' there is, is built _on top_ of this.
| jacquesm wrote:
| Any logic that we are currently aware of. DNA contains many
| unsolved mysteries and I expect that gift to keep on giving
| for a long time to come.
| go_elmo wrote:
| Might error-correction play a role? Having a lightly
| inefficient base 4 system might provide capacity for the
| surplus error correcting code information capacity?
| Faaak wrote:
| One mutation in a base pair can lead to a totally different
| amino acid (cf. the genetic code), so I doubt it?
| toufka wrote:
| BUT, if you look at the codon table, precisely because it's
| base-4 and not base-3, many base flips are silent when
| coded.
|
| By using base-4, there's enough space to permit lossiness
| of the coding itself - given the number of amino acids and
| the 3-NT encoding.
|
| So you really aren't optimizing JUST for nucleotide
| encoding, but you're also optimizing in concert with
| 3-nt/AA, and 20AA codes.
|
| So if you have to optimize for information density and
| fidelity, given X-nucleotides, Y nucleotides/AA, and Z AAs,
| and sample as much chemical and physical diversity in those
| AAs life has settled upon: X=4, Y=3, Z=20.
|
| If we went with X=3, you might need Y=4 to get the same
| kind of fidelity, but that cranks up your energy costs by
| 30% (from 3 to 4 NT per AA).
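To put a number on "many base flips are silent", here is a sketch that counts single-base substitutions leaving the encoded amino acid unchanged, using the standard genetic code written in the conventional TCAG order. Stop-to-stop changes are counted as silent here, which is an assumption of this sketch; the function name is made up.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_by_position() -> list[int]:
    """Count silent single-base substitutions at codon positions 1, 2 and 3."""
    counts = [0, 0, 0]
    for codon, amino in CODON_TABLE.items():
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == amino:
                    counts[pos] += 1
    return counts


counts = synonymous_by_position()
total = 64 * 9  # every codon has 9 possible single-base substitutions
print(counts, f"{sum(counts) / total:.1%} silent overall")
```

Almost all of the silent changes fall on the third codon position, which is where the degeneracy of the code is concentrated.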
| go_elmo wrote:
| True, iff the error persists through the correction cycles that
| are present. How exactly they work, or whether they're comparable
| with ECCs, I don't know.
| icoder wrote:
| DNA mostly relies on the fact that there's 2 strands that are
| (logically speaking) a mirror copy of each other (a C is
| paired with a G and vice versa, an A to a T and vice versa),
| it's like RAID 3 with only 2 disks (one being parity).
|
| Apart from repairing structural damage such as missing bonds,
| the cell can even repair missing bases or non-straight breaks
| without loss. This mechanism is also used for replication:
| the entire strand is split and each half is completed with
| its mirror counterpart.
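A toy sketch of the "mirror copy" redundancy described above, assuming damaged positions are marked with `N` and ignoring the antiparallel orientation of real strands. The function names are hypothetical, and real repair enzymes are of course nothing like a string lookup.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Base-by-base complement (orientation ignored for simplicity)."""
    return "".join(COMPLEMENT[base] for base in strand)


def repair(damaged: str, partner: str) -> str:
    """Fill positions marked 'N' in `damaged` using the paired partner strand."""
    return "".join(
        COMPLEMENT[p] if d == "N" else d
        for d, p in zip(damaged, partner)
    )
```

The limitation is visible in the sketch: if both strands are damaged at the same position, there is nothing left to copy from, which is why this is closer to mirroring than to an error-correcting code.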
| go_elmo wrote:
| I'm aware of that, but was rather thinking about ECCs like
| Hamming codes, which are able to correct single sequences of
| info based on surplus info in that same string.
| dekhn wrote:
| Nothing algorithmically sophisticated, but DNA repair
| enzymes already do this.
| usrbinbash wrote:
| > Like, why didn't DNA end up as base 3?
|
| Why did we end up with only 20 proteinogenic amino acids? Why
| are vertebrate neural architectures inverted (cell bodies on
| the inside, connections on the outside), even though the other
| way round (e.g. how a squid's brain is organised) is easier
| and less inhibitive to growth?
|
| 2 Reasons:
|
| a) Because nature and evolution cannot engineer. Random
| mutation, recombination and natural selection are the only
| mechanisms available. Things get selected if they outcompete
| existing alternatives, they don't need to be the best
| solutions.
|
| b) All solutions have to be built by modifying what already
| exists. Evolution doesn't get to do greenfield projects,
| because anything that has to start from scratch is so
| disadvantaged in natural selection compared to already evolved
| complex life, it will fail.
|
| This leads to systems that, from an engineering point of view,
| don't always make a lot of sense.
|
| Eg. the architecture of the vertebrate neural system creates a
| lot of issues (eg. our light sensitive cells point in the wrong
| direction). The only way this makes any sense is when one looks
| at how the neural tube (the precursor to the backbone) is
| formed by the endodermis folding in on itself. This process is
| so deeply at the root of the Chordata, and so many other things
| depend on it, that it simply cannot change any more.
|
| Many many biological systems are "legacy systems" in the truest
| sense of the word: Solutions produced a long time ago that may
| have many problems, but are simply too deeply enmeshed with
| everything that came after, that they are now impossible to
| change.
| function_seven wrote:
| > _(eg. our light sensitive cells point in the wrong
| direction)_
|
| Can you expand on that? Are you talking about front-facing
| eyes vs. birds' eyes? Or something else like retinal
| structure?
| dtgriscom wrote:
| Retinal structure:
|
| https://en.wikipedia.org/wiki/Retina#Inverted_versus_non-inv...
| metabagel wrote:
| I had to look this up, and I guess what usrbinbash was
| referring to was the layout of the retina, which places the
| rods and cones behind layers of transparent neurons.
|
| https://en.wikipedia.org/wiki/Retina#/media/File:Retina-diag...
|
| Edit: ninja'd
| dekhn wrote:
| Yet, it doesn't really have a strong impact as it's been
| determined that humans can see individual photons and we
| aren't dependent on night vision for hunting.
| t_serpico wrote:
| A classic armchair response. DNA has complementary
| nucleotides (AT, GC) that facilitate its pairing. Base 3
| wouldn't work in that sense. Also, you can't forget about the
| genetic code. See https://arxiv.org/pdf/q-bio/0605036.pdf for
| interesting thoughts. Remember, evolutionary biology is a
| field and people think about these questions!
| usrbinbash wrote:
| > Base 3 wouldn't work in that sense.
|
| That's true, but a) not the point I am making, and b) I am
| pretty sure it says nowhere in my post that it would.
| idiotsecant wrote:
| This is pretty smug for someone who seems to have managed
| to miss the point entirely. Yes, DNA has certain features
| that require a base 4 system. That is not necessarily true
| of all possible systems with DNA-equivalent function, which
| is the point this whole thread is making.
| hotstickyballs wrote:
| If you steelman the argument then it's an error
| correction argument, in that this simple ECC method can be
| what favours a base-4 encoding instead
| t_serpico wrote:
| How have I missed the point? The answer that nature
| cannot engineer and can't start de novo are trivially
| true statements that provide no actual insight into the
| question. I fully agree the original question itself is a
| deep one. A quick literature search is more productive
| than pontificating with weak analogies. See
| https://www.math.unl.edu/~bdeng1/Papers/DengDNAreplication.p... for
| what seems to be an interesting analysis regarding base
| number and DNA replication rate.
| usrbinbash wrote:
| > that provide no actual insight into the question
|
| Mind elaborating on that?
|
| Because there is no biochemical reason why DNA could not
| have incorporated, say, a third pairing pair, so while
| base-3 (which I don't specifically mention in my post
| btw.) wouldn't work, base 6 or 8 would have been
| possible. "Unnatural Base Pairs" are even known to work
| in laboratory settings.
|
| There is also no biochemical reason why base2 life
| wouldn't work. Expand the reading frame of the
| translation machinery to 5 instead of three, and you have
| enough coding space for polypeptides.
|
| My answer addresses the question completely, because the
| only reason behind these "decisions" is an ancient system
| that simply got "frozen", and now cannot change any more.
| sterlind wrote:
| _> There is also no biochemical reason why base2 life
| wouldn't work._
|
| are you sure about that? are you sure there's no weird
| effects that might destabilize very long sequences of
| 2-nucleotide DNA? or on how wide DNA-binding domains have
| to be to cope with reduced information density, and how
| that might sterically hinder smaller arrangements of
| proteins?
|
| _> My answer addresses the question completely, because
| the only reason behind these "decisions" is an ancient
| system that simply got "frozen", and now cannot change
| any more._
|
| your answer is just a hypothesis, not a proof. these
| things can be studied (by studying abiogenesis in-vitro),
| and it's not certain these decisions were "flash frozen"
| like you describe. 2-, 4-, and 6- nucleotide coding
| systems might have coexisted in the RNA world, and 4-
| could have won out for some reason.
| sparrowInHand wrote:
| Short answer: Likelihood of noise (brownian motion) producing
| the element and keeping it interacting. Then once it gets
| going, likelihood of keeping state, while interacting.
| throwawaymaths wrote:
| It's probably not base four because you have to stretch out
| more pairs to match up four pairs and that's entropically
| disfavored. However ribosomes can accommodate a four pair
| matching, though at a very reduced yield (unless you think
| Schultz's postdoc fabricated those data)
| asdff wrote:
| On paper this might be an interesting game, but you have to
| think of things in terms of crystal structure, what is able to
| form hydrogen bonds, what ends up being sterically hindered and
| what that means for the molecule. This is why Watson and Crick
| and Franklin's work was so seminal: it showed how genetic
| information was inherited through mechanical logic of these
| molecules alone. Before the structure of DNA was solved, there
| were a lot of competing theories over what molecule was the
| source of heritable information, and how this information was
| exactly passed down between generations.
| p0w3n3d wrote:
| Exploit idea: create an image which, when photographed, would
| be written to DNA as a virus.
|
| (I know viruses are RNA)
| pyinstallwoes wrote:
| Do a Quine now!
| luckystarr wrote:
| And create a never to be deleted record of images across the
| infected population?
| dormento wrote:
| Cue "tasteless porn in bitcoin blockchain forever".
| TeMPOraL wrote:
| Luckily, DNA is mutable over generations, so all such noise
| can and will eventually be filtered out.
| dillydogg wrote:
| There are plenty of DNA viruses. They aren't limited to RNA at
| all
| swamp40 wrote:
| I've always wondered if the plant/animal shapes and sizes were
| represented literally in a 3D mapping of the DNA. Like we could
| already have a picture of what it will become, if we could just
| decode the DNA sequences properly.
|
| Like we have the sequence of numbers for a jpg, but we've never
| seen the picture.
| K0balt wrote:
| Am I alone in thinking that using regular DNA is a terrible idea
| for data storage?
|
| I mean, that would make your storage medium a potential
| biohazard. Although it probably would all be cool until someone
| put smallpox.bin on a major torrent tracker.
|
| If it's really that good, we should come up with a variant using
| slightly different chemistry so that biocontamination is not a
| factor.
| throwawaymaths wrote:
| > that would make your storage medium a potential biohazard
|
| generally no. If you're worried about random DNA being a
| biohazard there are way worse things to worry about, like how
| your immune system uses random stretches of biologically primed
| dna to create antibody diversity.
|
| The real reason why it's terrible is that write speed is
| atrocious and read speed is bad (on the order of 2-3x that of
| amazon glacier's robotic tape handlers, with WAY MORE expensive
| robots, and way more expensive cost to read -- you're bulk
| polluting rivers in china to make the reagents).
|
| The only use case I can think of is deep generational archival
| (like the svalbard seed bank, but for information). Where cost
| to store by volume is at a premium, and where you'd like to
| have many many many copies, and you don't mind the cost to read
| because you won't be reading it but for every 10 or so years,
| if even.
|
| Store your logs in DNA. You're never going to read them
| anyways.
| asdff wrote:
| Having DNA as a storage medium is the best way to store
| actual biological data. Currently, we do things like having
| seedbanks, which need periodic replacing as seeds grow to be
| nonviable. A library of genomes is a much smaller physical
| footprint than a seedbank. It doesn't need periodic
| replacing, provided it's not getting bombarded by radiation or
| anything unusual like that. DNA doesn't even have to be
| stored frozen; you can freeze dry it and store it at room
| temp for a very long time before any significant degradation.
| You can also just have the sequence stored digitally, and
| synthetically build out the dna molecule as you need it (I
| think this is still pretty costly though and not that
| efficient). With the right molecular biological tooling, one
| could conceivably introduce these genomes into a plant cell
| line and grow them up in tissue culture, you don't have to
| for example grow a tree and let it mature and go to seed
| since plant cells are pluripotent, everything can be done in
| a lab much faster.
| throwawaymaths wrote:
| > Having DNA as a storage medium is the best way to store
| actual biological data
|
| 100% agreed. I thought that was obvious, was mostly sniping
| at "DNA based digital storage startups", thanks for
| clarifying for me
| fartsucker69 wrote:
| imagine downloading a tv show and it turns out it's shit
| justsocrateasin wrote:
| Reminiscent of a very interesting company I interviewed for last
| year called Cache DNA
|
| https://www.cache-dna.com/
|
| This is the future. I don't think it will look exactly like this,
| and I don't think it will be here any time soon, but I'm excited
| to see these advancements.
|
| What Cache is doing presently is trying to do archival storage in
| DNA - it has a lot of potential to be cheaper, more energy
| efficient, and more redundant. But some of the processes still
| aren't there yet.
| ray__ wrote:
| Even just storing family photos would require DNA sequences
| that are orders of magnitude larger than the human genome, so
| you're going to be looking at very expensive or very time
| consuming read/write (and certainly no instant read write at
| any cost-the turn around time can't be less than hours, even
| for small files, even with high-end HTS or nanopore approaches
| afaik). What is the plan for getting around this?
| LearningToWalk wrote:
| I've been up close to one project working in this space. The
| obstacles are obviously many, but fascinating to see that
| progress is made nonetheless. Clearly a piece of the-future-
| puzzle.
| js8 wrote:
| It reminds me of the children's story
| https://en.m.wikipedia.org/wiki/The_Mystery_of_the_Third_Pla...,
| which had flowers that captured the surroundings in layers, like
| a film camera.
| stainablesteel wrote:
| this is like spy technology, super cool
| OscarTheGrinch wrote:
| Man, I have a hard enough time trying to keep track of a micro SD
| card, imagine misplacing your DNA based files?
|
| Seriously tho, using DNA as an information storage medium is a
| pretty neat concept.
| Borrible wrote:
| >using DNA as an information storage medium is a pretty neat
| concept
|
| And billions of years old.
| DeathArrow wrote:
| > And billions of years old.
|
| With not quite good backup strategies.
| dylan604 wrote:
| Could make for some interesting decoding errors as your
| original data mutates
| asdff wrote:
| Having an error rate means a small chance of gaining an
| edge that makes up for having it.
| Borrible wrote:
| DNA/RNA looks to be more like storing heuristics, landmarks
| and clues, not data.
| snitty wrote:
| Can you elaborate on that? A significant portion of DNA
| in organisms literally encodes for protein sequences. It
| also has functional parts (binding sites for proteins,
| promoter sequences). Some RNAs are not translated because
| the RNA itself has function, but I don't see that same
| argument for DNA.
| asdff wrote:
| Only like 1.5% of the human genome is protein coding.
| Scarblac wrote:
| If you create your data right, _the actual data_ can make
| backups of itself. There are even built-in ways for it to
| improve itself over time using genetic algorithms.
| TeMPOraL wrote:
| A kind of implied meaning of the term "data", especially
| in the context of storage and archiving, is that we _do not_
| want it to "improve itself".
| urfullofsht wrote:
| [dead]
| Daub wrote:
| It's not often that HN makes me laugh.
| jdsalaro wrote:
| You could even say we're talking about a legacy storage
| medium ;)
| jacquesm wrote:
| Ok, who ate the family movie archive?
| candiodari wrote:
| Just so we're clear, this is ONE pixel per, I don't know, 10000
| cells or so. So one bit per DNA chain, with that bit repeated
| thousands of times to get redundancy. Still an incredible
| achievement.
| fxtentacle wrote:
| The super neat thing is that they tag each DNA chain with the
| pixel coordinates, so you can afterwards mix those 10,000 DNA
| strands each for all 96 pixels into one 1-mio-DNA-strand-soup
| and still recover the image successfully.
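|
| A minimal sketch of the coordinate-tagging idea described above,
| assuming each strand can be modelled as a (coordinate, bit)
| record. The function names, the 8x12 layout and the copy count
| are illustrative assumptions, not the paper's actual encoding;
| real strands would carry the address as a nucleotide barcode.

```python
import random
from collections import Counter, defaultdict


def encode(pixels, copies=5):
    """Turn a {(row, col): bit} image into a shuffled pool of tagged records."""
    pool = [(coord, bit) for coord, bit in pixels.items() for _ in range(copies)]
    random.shuffle(pool)  # pooling the strands destroys any positional order
    return pool


def decode(pool):
    """Group records by their coordinate tag and majority-vote each pixel."""
    votes = defaultdict(Counter)
    for coord, bit in pool:
        votes[coord][bit] += 1
    return {coord: counts.most_common(1)[0][0] for coord, counts in votes.items()}


if __name__ == "__main__":
    # Hypothetical 96-pixel checkerboard (8 rows x 12 columns).
    image = {(r, c): (r + c) % 2 for r in range(8) for c in range(12)}
    assert decode(encode(image)) == image
```

| Because every record carries its own address, the order of the
| pool is irrelevant, and the redundant copies let a majority vote
| absorb a minority of corrupted reads per pixel.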
| jojobas wrote:
| >successfully
|
| Except when it recombines weirdly and gets mixed up as per
| the article.
| tough wrote:
| Those are just mutations in the image
| TeMPOraL wrote:
| "Hallucinations" seems to be the modern term.
|
| /s, but only slightly.
| usrbinbash wrote:
| > DNA synthesis remains a bottleneck in the adoption of DNA as a
| data storage medium.
|
| Yes, one of many.
|
| Another one is a simple question: _What exactly is the use case
| again?_ Because, storage isn't something we lack. Especially
| when talking about storage where, obviously, fast random access
| isn't a requirement, aka. data archiving.
|
| We have good solutions for that; an LTO-9 tape can hold 18TiB of
| data native and up to 45 TiB of data compressed, with denser
| capacities planned:
| https://en.wikipedia.org/wiki/Linear_Tape-Open
| pcrh wrote:
| Encode wikipedia into DNA, then insert it into a horseshoe
| crab. In a few million years it may still be around to be
| decoded.
|
| >The fossil record of Xiphosura goes back over 440 million
| years to the Ordovician period, with the oldest representatives
| of the modern family Limulidae dating to approximately 250
| million years ago during the Early Triassic. As such, the
| extant forms have been described as "living fossils".[9]
| https://en.wikipedia.org/wiki/Horseshoe_crab
| Aardwolf wrote:
| Some number I found online, while trying to multiply the 30
| trillion human cells with the data storage of DNA per cell:
|
| "one gram of dried DNA can store 455 exabytes of data"
|
| Seems like a pretty sweet use case to me!
|
| I definitely do lack storage by the way. Say I want to download
| the common crawl data set, 380 TiB. And for redundancy I'd need
| multiple copies of the data too. That's a lot of disks for in
| the home. "18TiB ought to be enough for everyone" really
| doesn't cut it.
| usrbinbash wrote:
| > one gram of dried DNA can store 455 exabytes of data
|
| Yes, and half a gram of Hydrogen could produce ~500 Megawatts
| of power in a fusion reactor. However, that theoretical value
| will remain irrelevant, as long as we cannot build a
| practically useful fusion reactor. And even if we could build
| one, it still has to compete with all other forms of
| producing power for scalability, reliability, efficiency and
| cost.
|
| The fact that there is a very high theoretical number that
| seems really impressive, isn't a use case.
|
| So, with that being said: how long does it take to write
| these 455EiB? How long does it take to read them? How error
| prone are both processes? And how much does it cost to
| write/read them?
|
| > "18TiB ought to be enough for everyone" really doesn't cut
| it.
|
| Pretty sure I never said that.
|
| Also pretty sure common crawl can be compressed. Even
| assuming only a 2:1 compression ratio, that means it fits
| comfortably on 11 LTO-9 tapes. Now, a quick Google search turned
| up tape prices of about $110-140 per LTO-9. Let's say ~$150
| per tape, that means the whole thing fits on $1650 worth of
| storage. About 5000 bucks with 2 backups included. Double
| that for uncompressed storage.
|
| Alright, so how does that compare to DNA storage?
|
| https://www.nanalyze.com/2023/03/dna-data-storage-solution/
|
| quote:
|
| _These days, it costs $600 to sequence a complete genome
| which contains around 200 gigabytes of data or about $3 per
| gig. Today, magnetic tape technology offers the lowest
| purchase price of raw storage capacity at around two cents
| per gigabyte_
|
| end quote.
|
| So just _reading_ the 380 TiB back from uncompressed storage
| _ONCE_ would cost ~1,140,000 dollars.
|
| And that's just for reading. At a price differential that is
| measured in multiple orders of magnitude, a technology better
| offer some REALLY good, REALLY tangible advantages to
| compete.
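|
| The back-of-the-envelope numbers above can be reproduced with a
| short sketch. All figures are the ones quoted in this comment
| (18 TiB per tape, 2:1 compression, ~$150 per tape, $3 per GB
| sequenced); the function names are illustrative, and the DNA
| figure uses the same shorthand of 1 TiB ~ 1000 GB.

```python
import math


def tapes_needed(data_tib, tape_capacity_tib=18.0, compression_ratio=2.0):
    """Whole tapes needed to hold the data after compression."""
    return math.ceil(data_tib / compression_ratio / tape_capacity_tib)


def tape_cost(data_tib, price_per_tape=150.0, copies=3, **kwargs):
    """Media cost for `copies` full sets of tapes (original plus backups)."""
    return tapes_needed(data_tib, **kwargs) * price_per_tape * copies


def dna_read_cost(data_tib, price_per_gb=3.0):
    """Cost of sequencing the data once, treating 1 TiB as ~1000 GB."""
    return data_tib * 1000 * price_per_gb


if __name__ == "__main__":
    crawl_tib = 380
    print("tapes:", tapes_needed(crawl_tib))
    print("tape cost, 3 copies:", tape_cost(crawl_tib))
    print("DNA read, once:", dna_read_cost(crawl_tib))
```

| With these inputs a single DNA read-back costs a few hundred
| times more than three compressed copies on tape, which is the
| gap the comment is pointing at.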
| Aardwolf wrote:
| I of course wouldn't want to store my data in there today,
| I wouldn't even trust that I get it back reliably because
| DNA reading comes with a relatively big error rate for
| storage purposes (of course error correction can mitigate
| that). But it would be cool if the technology progresses.
| All technology, including disks, magnetic tapes, and new
| alternatives. Whether DNA is viable in the end or not, I
| don't know. I do know that tech always has been progressing
| and new alternatives are sometimes found, and that I do see
| a use for more storage.
|
| But an argument about whether DNA is a viable option in the
| future would have to say, technically, what the issue with
| DNA would still be given future tech.
|
| Whether it's more expensive today, or that there's no need
| for more data today, are not really arguments against it.
|
| I do not intend to be arguing for snake oil or anything
| here though. If "DNA storage" is in a similar category of
| "perpetual motion machines" and "cars that run on tap
| water" then count me out.
|
| I don't even know how our comments ended up being like
| arguing against each other. The only thing really I didn't
| agree with in the original comment was "Because, storage
| isn't something we lack", because I do find it lacking,
| both at home and in the cloud.
| speed_spread wrote:
| Let's take a moment to appreciate how the classic "bandwidth of
| a station wagon filled with tapes" scales with tape technology.
| Too bad there aren't many station wagon choices nowadays but I
| guess any minivan would do in a pinch.
| piyh wrote:
| Can your tapes self replicate? /s
|
| The goal of replacing memory cards is dumb; the tech that
| enables the storage is a foundational step forward in
| bioengineering.
___________________________________________________________________
(page generated 2023-07-10 23:01 UTC)