[HN Gopher] Sequencing your DNA with a USB dongle and open sourc...
___________________________________________________________________
Sequencing your DNA with a USB dongle and open source code
Author : TangerineDream
Score : 194 points
Date : 2021-02-03 15:25 UTC (7 hours ago)
(HTM) web link (stackoverflow.blog)
(TXT) w3m dump (stackoverflow.blog)
| yters wrote:
| How do we get a dongle?
| jmiskovic wrote:
| The source README mentions supporting MinION ($1000)
| https://nanoporetech.com/products/minion
|
| Nice video here https://www.youtube.com/watch?v=1_mER5qmaVk
|
| Found some previous discussion of HW:
|
| https://news.ycombinator.com/item?id=16262719
|
| https://news.ycombinator.com/item?id=7893158
| yters wrote:
| Incredible, thanks! Cheaper than a shotgun sequencer :)
| danpalmer wrote:
| Bear in mind that it'll only work once. Additional
| consumable flow cells are a similar price.
| snypher wrote:
| I think it's misleading for the article to call this a dongle.
| It's a $1k USB device with expensive consumables, more like a
| printer.
| [deleted]
| kneel wrote:
| Nanopores have unacceptably high error rates. Around 10%
| sannee wrote:
| Is this an accuracy or precision issue? I am imagining that if
| you actually have access to the device, you could do as many
| runs as you want, getting to arbitrarily low error rates.
| brofallon wrote:
| This is a common misconception - "averaging out" errors only
| works if the errors are pretty rare at any given site. This
| is true for some types of errors & sequencing technologies,
| but not universally true. Some types of DNA sequences (most
| notably homopolymers and other simple repeats) are very
| difficult to sequence correctly, and X% of the reads there
| will be incorrect. If X>20% or so, then it may look like real
| germline variation no matter how many reads are sequenced
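| A minimal sketch of that point, under toy assumptions that are not from
| the thread: one site whose true base is G, a 30% per-read error rate,
| and a hypothetical helper `top_alt_fraction`. With a systematic error
| the wrong base keeps roughly the same share of reads at any depth, so
| it can resemble a heterozygous variant; with random errors the mistakes
| spread over three bases and the leading wrong base stays much lower.

```python
import random
from collections import Counter


def top_alt_fraction(depth, error_rate, systematic, seed=0):
    """Simulate `depth` reads of one site whose true base is 'G' and return
    the fraction of reads supporting the most common wrong base."""
    rng = random.Random(seed)
    calls = Counter()
    for _ in range(depth):
        if rng.random() < error_rate:
            # Systematic: the basecaller always makes the same mistake here.
            # Random: the wrong base is drawn uniformly from the other three.
            calls["A" if systematic else rng.choice("ACT")] += 1
        else:
            calls["G"] += 1
    calls.pop("G", None)  # keep only the wrong calls
    return max(calls.values(), default=0) / depth


if __name__ == "__main__":
    for depth in (20, 200, 2000):
        print(depth,
              top_alt_fraction(depth, 0.3, systematic=True),
              top_alt_fraction(depth, 0.3, systematic=False))
```

| More depth tightens both estimates, but only in the random case does
| the wrong base settle at a fraction a variant caller could dismiss.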
| koeng wrote:
| The errors are non-random. That's why they use machine
| learning to figure out those errors. You could, of course,
| also just do traditional statistics on sequences that you
| want to sequence all the time. I've done that with plasmids
| before, and it works pretty good. I think there are a few
| papers on it too.
| searune wrote:
| > The errors are non-random.
|
| Could you elaborate / give an example? Are the errors
| deterministic? Is it like ISI (Inter-Symbol
| Interference[1]) in signal processing, where some symbols
| interfere with the reception of the next symbol(s)? Are
| there short range errors (one letter) or long continuous
| errors?
|
| [1] https://en.wikipedia.org/wiki/Intersymbol_interference
| marsdentech wrote:
| It's a complicated issue; I tend to think of the error
| component of any one MinION observation as being a
| function of the k-mer in the pore at the time (i.e. the
| subject of the observation) and, with some decaying
| dependence, the sequences (i.e. in both directions) that
| extend out from either side of the target k-mer. You
| might say that MinION error is a function of the target
| k-mer and its immediate environment. It gets even messier
| when you try to imagine the form of that function; for
| one, it's not _completely_ good enough to remain in
| sequence space alone: among other things, the "shape"
| (i.e. the conformation) of that (DNA or RNA) molecule
| around the target k-mer will influence how the shape of
| the pore will change in response to the target k-mer,
| which, in turn, will influence the observed current
| signal (i.e. manifest as a deviation from the "expected"
| or "ideal" current signal for that k-mer!). As I
| understand it, Nanopore don't spend too much time
| actually modelling k-mer-in-pore dwell-mechanics; instead
| their best base callers use machine learning to
| generalise across the swathes of available sequencing
| data for known targets (and give really quite impressive
| results, all things considered).
| koeng wrote:
| https://gist.github.com/Koeng101/abc674e1acd575646748afcbcc7...
|
| There is a real example I ran a few months ago. How to
| read it is here
| https://en.m.wikipedia.org/wiki/Pileup_format
|
| Positions like 172 have errors more often than not
| because the basecaller is wrong sometimes (note: this is
| from a sequence verified sample).
|
| The errors come up more often in some sequences than they
| do in others. I'm not really sure about symbol
| processing, but if you have any beginner resources for
| that I'd appreciate them!
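| For readers who have not met pileup format, a rough sketch of how the
| read-bases column can be tallied per position. The function names and
| the example line are made up for illustration (the line is not taken
| from the gist), and only the common symbols are handled: `.`/`,` for
| matches, letters for mismatches, `*` for a deleted base, `^x` and `$`
| for read start/end, and `+n`/`-n` followed by n bases for indels.

```python
import re


def count_pileup_bases(bases):
    """Tally matches, mismatches and deletions in one read-bases column."""
    counts = {"match": 0, "mismatch": 0, "deletion": 0}
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            # Read-start marker; the next character is a mapping quality.
            i += 2
            continue
        if c in "+-":
            # Indel: sign, decimal length, then that many inserted/deleted bases.
            length = re.match(r"\d+", bases[i + 1:]).group()
            i += 1 + len(length) + int(length)
            continue
        if c in ".,":
            counts["match"] += 1
        elif c in "ACGTNacgtn":
            counts["mismatch"] += 1
        elif c == "*":
            counts["deletion"] += 1
        i += 1  # '$' (read end) and any other symbol is skipped
    return counts


def error_fraction(line):
    """Return (position, fraction of reads disagreeing with the reference)."""
    _chrom, pos, _ref, _depth, bases = line.rstrip("\n").split("\t")[:5]
    counts = count_pileup_bases(bases)
    total = sum(counts.values())
    wrong = counts["mismatch"] + counts["deletion"]
    fraction = wrong / total if total else 0.0
    return int(pos), fraction


# Hypothetical line: six reads at position 172, three of which disagree.
EXAMPLE_LINE = "plasmid\t172\tA\t6\t..,Gg*\tIIIIII"
```

| Running `error_fraction` over every line of a pileup file and sorting
| by the fraction is one simple way to find the positions that go wrong
| "more often than not".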
| dnautics wrote:
| don't know why this was downvoted. If I'm not mistaken, there
| is generally a high error rate per pore fundamentally because
| it's a single molecule experiment. These get averaged out,
| but may be difficult to align as it might not necessarily be
| a straightforward averaging. There are also segments that are
| fundamentally generally difficult to sequence correctly
| (single nucleotide runs, not even a super high n) that will
| probably never get satisfyingly resolved no matter how many
| times you sequence.
| searine wrote:
| It should be noted that the "errors" in this case are gaps in
| sequence. Sometimes the DNA strand slips through the pore and
| some bases aren't called.
|
| The actual base calling is on par with Hi-seq in my experience.
| In software terms, you are missing chunks of code, but aren't
| flipping bits.
|
| This is important because in certain experiments, you care less
| about those gaps (scaffolding for example). So you can get a
| lot of cheap utility out of nanopore sequencing.
| chrisamiller wrote:
| That all depends what you want to do with the data. For
| assembling new genomes, they produce very long reads that are
| essential for "scaffolding". They're also great for structural
| variant detection (large rearrangements of DNA). DNA sequencing
| is not a monolith and there's room for lots of different
| complementary technologies.
| koeng wrote:
| Are you sure about that? My last consensus run worked with
| complete coverage of ~410 bp region. Here is a gist of the raw
| pileup without consensus -
| https://gist.github.com/Koeng101/abc674e1acd575646748afcbcc7...
|
| Visually, I think, you can see that it isn't THAT bad (low
| coverage at the ends is because of how I barcoded the
| sequences).
|
| I hate to be that guy, but have you actually used the
| technology? And if so, approximately what year? Unacceptable
| for what procedure? Do you have any raw reads that have been
| troubling you?
| searine wrote:
| They mean at genome-wide scales. If you are just doing a
| 410bp region, the sequence is short enough that the signal is
| going to crush any noise you get from strands slipping in the
| pores.
|
| The errors nanopores get are gaps, not base pair
| substitutions. So with things like viral or bacterial
| sequencing you don't really have huge issues.
|
| When you are doing large eukaryotic sequences with lower
| coverage on average, you start picking up a lot of deletion
| artifacts. Which isn't a huge deal if you have a very well
| annotated genome like human, but if you are doing pioneer
| genomics it can create some difficulties. Often if the genome
| isn't well annotated, it's best to pair nanopore with short
| reads.
| koeng wrote:
| The gaps are usually homopolymers and such, which should
| get helped by R10 pores. But true, at low coverage, things
| can get tougher!
| marsdentech wrote:
| This is a common, and often justified, though not always fair,
| criticism. MinIONs have an error rate of around 10% for _any
| given base_. Moreover, these errors aren't entirely independent
| of one another, so if you struggle to sequence a given base the
| first time, you're likely also to struggle if you try again.
| That said, if your experiment is such that you're only
| sequencing a guaranteed single target (e.g. one, isolated
| coronavirus genome), in that one sequencing run (on that one
| flow cell), you'll "re-sequence" any given region many
| times and, unless you're looking at "problematic" (i.e. low-
| complexity) regions, you _will_ be able to "average out" the
| errors to reveal the true target sequence. On the other hand,
| if you're trying to co-sequence a mixture of closely-related
| targets, that's when the headache starts...
| samchorlton wrote:
| So happy to see this here. While sequencing is quite old, mass
| adoption still has not come. The benefits are clear - faster
| infectious disease diagnosis, personalized treatment, tracking
| the spread of infection, identifying food contamination - the
| use-cases are endless. However before nanopore sequencing came,
| it was always out of reach of the masses.
|
| We've actually started BugSeq[0] to help labs get into nanopore
| sequencing - improving these open source tools and also writing
| our own. Orgs like FDA, USDA, big food co's, CDC, etc are now all
| adopting nanopore sequencing. Happy to see the industry taking
| off, this will be a step function improvement for public health
| in general.
|
| (disclaimer: founder of BugSeq) 0: https://bugseq.com
| dekhn wrote:
| personalized treatment is still best handled by gene panels.
| nobody has made a compelling argument for WGS for personalized
| med. Right now it's a huge waste of investment until we
| understand the multigenicity of diseases better (which is a
| research problem best solved by sequencing millions of
| individuals and using high quality WGS sequencers).
| nextos wrote:
| I think typing your HLA class I and II genes is the single
| most valuable thing you can get now from your genome. It's
| also pretty likely to remain extraordinarily valuable even if
| whole-genome sequencing prices drop to nearly zero.
|
| HLA associations with autoimmune disorders are
| extraordinarily strong. Same applies to infectious diseases,
| vaccine efficiency and checkpoint inhibitor efficiency.
|
| While you can type HLA with classical techniques, the only
| really reliable way is really to use long reads.
|
| Same applies to CYP enzyme superfamily, where variation is
| linked to some rare drug toxicity events for example.
|
| We should all know our HLA and our CYP genotypes. Why 23andme
| does not even attempt to impute HLA is beyond my
| understanding.
| teekert wrote:
| One example: Homologous Recombination Deficiency, the
| signature it leaves genome-wide and the associated
| sensitivity to PARP inhibitors.
|
| But agreed, it is about time we start to understand
| regulatory regions better. But that will require gathering
| more WGS data, and indeed most data is Whole Exome or Panel.
| dekhn wrote:
| Research project, not actionable human health. I fully
| support large-scale WGS projects and hope that some day one
| of them will have a recognizable impact.
| samchorlton wrote:
| I don't know about this specific example, but DNA
| sequencing is already routinely used for personalized
| oncology therapeutics outside of clinical trials, so not
| really research project.
|
| Source: Am MD and practice laboratory medicine.
| dekhn wrote:
| Sure. Doctors love to try new technologies. most of the
| reports of success are happy narratives, not evidence
| based medicine.
| samchorlton wrote:
| We work within the infectious disease space, so I'll give an
| example from our work that is still personalized medicine:
| Faster detection of antimicrobial resistance. Every infection
| will be resistant to different
| antibacterials/antivirals/antifungals/antiparasitics. What if
| we could get the patient on the right antimicrobial for their
| specific infection faster? There's strong evidence that
| timely administration of correct antimicrobials in septic
| shock results in improved mortality.
|
| Nanopore sequencing very much has the potential to deliver
| this personalized treatment, without looking at any human
| genes or panels. If we could rapidly sequence bacteria in the
| bloodstream and predict their antimicrobial susceptibilities,
| we can make a difference.
| dekhn wrote:
| What you're describing is a very reasonable research topic
| with some supporting evidence.
|
| What I'm saying is that nobody has delivered on any of the
| huge claims about the genome which genomicists made for the
| last 20 years, specifically in terms of actionable human
| health.
|
| it's time to start calling the bluff.
| samchorlton wrote:
| I'm not exactly sure how you can say that.
|
| The following have been revolutionized by the human
| genome project and subsequent technological innovation in
| sequencing:
|
| -Non-invasive prenatal diagnostics
|
| -Screening for cancer with cell-free DNA
|
| -Rapid and accurate diagnostics for children with
| suspected genetic disorders
|
| -Targeted cancer therapeutics
|
| Many of these are already in routine clinical use in high
| income countries and result in significant improvement in
| human health.
| [deleted]
| dekhn wrote:
| The impact is minor and most of the progress did NOT come
| from HGP data.
|
| I worked in genomics for 20 years. I have deep knowledge
| of biology and medicine. And the reality is, for the
| amount of money invested, the actionable medical returns
| have been relatively tiny and industry continues to not
| invest in sequencers for a good reason.
| searine wrote:
| >What I'm saying is that nobody has delivered on any of
| the huge claims about the genome which genomicists made
| for the last 20 years, specifically in terms of
| actionable human health.
|
| I mean. Sure, sequencing the human genome didn't solve
| our problem overnight, and you can't sequence a genome at
| a vending machine for a nickel to tell your future, but I
| think there has been an avalanche of medical data derived
| from the genome and that is only going to continue to get bigger.
|
| Now that we are really starting to figure out the
| polygenic risks and the single deleterious variants and
| their links with phenotype, people will have a much
| better picture of what their future might hold (and how
| to prevent it).
|
| I don't think it was ever a bluff. The problem just
| turned out harder than we thought it was going to be.
| dekhn wrote:
| it didn't turn out to be harder than _I_ thought it was
| going to be. I came into this in the 90s fully prepared
| for the idea of polygenic risk. In my opinion, most
| people who did molecular biology first think that way,
| while most people who learned mendelian genetics don't.
|
| I had my genome sequenced a few years ago by Illumina.
| They had a big slick presentation, blah blah blah, ApoE1,
| etc. When the genetic counsellors came to my genome they
| said "huh. you don't have any risk factors". I checked
| and each of their risks was from an existing gene panel,
| so the WGS wasn't valuable (it's on PGP, if you want to
| work with it https://my.pgp-hms.org/profile/hu80855C).
|
| I talked in more detail with the counsellors. Turns out,
| whenever they saw a novel variant that wasn't covered by
| a gene panel they were googling the variant and skimming
| the abstracts of papers.
|
| It was at that point I realized the difference between
| research, PR, and actionable medical data.
| ngcc_hk wrote:
| All great until this was used for people control. Collecting
| dna which you cannot control and even can trace your race or
| relatives.
|
| We have internet. Great. But look at the dark side. DNA is
| great like target medicine but you have totalitarian regime
| which might use it.
|
| Need some sort of awareness. How to deal with the two sides,
| let us discuss once you know there is a very dark side to it.
| samchorlton wrote:
| Thanks for your concern. All technologies come with benefits
| and risks. Of course, DNA sequencing can be used for harmful
| purposes, eg. tracking individuals. We should be very
| cautious of these risks as the technology develops, and take
| well thought out steps to mitigate them. A similar analogy
| can be made to the internet and tracking people. Overall,
| however, the benefits of DNA sequencing to society already
| far outweigh these risks.
| ordu wrote:
| _> If you try to commercialize it, that takes a while to start a
| company, and it can take so long that by the time you go to the
| mechanics of that, the next thing has already emerged._
|
| Technological singularity is here! :)
|
| [1] https://en.wikipedia.org/wiki/Technological_singularity
| Ovah wrote:
| Anyone with hands on experience using NanoPore? I've been
| thinking about buying one of these to play around with. But
| anecdotally I've heard that they lack utility. Or are my concerns
| just myths? a) they are designed to handle many batched samples
| at once rather than many runs of few samples over time. So in
| practice they don't really last for many individual samples. b)
| the computational requirements are high. So while a NanoPore can
| be plugged into a laptop in the field it would take forever to
| run the data processing on said computer.
| bioinformatics wrote:
| Computational requirements are quite high, but OK if you have
| good GPUs on hand. A coronavirus sequenced sample on the fast
| mode without GPUs would take 3-4 hours to complete, while on
| the high accuracy mode days. GPU access would speed up
| performance considerably.
|
| Error rate for MinIONs is still quite high (10-15%), so a human
| genome sequencing would be quite inaccurate in some regions.
|
| Sequencer is quite cheap, reagents and flow cells are a little
| bit more expensive.
| gnramires wrote:
| Is the error rate per base pair?
| Ovah wrote:
| Thank you. The upfront cost of the sequencer sure makes it
| tempting at first sight.
|
| My desired hobbyist use case is to key out plants, lichens
| and mushrooms that I find in the field. I have the
| bioinformatics knowhow just need the hardware. 3-4h seems
| like a long time for a genome that is <30k nucleotides long.
| Mushrooms on average seem to have almost as many genes as
| coronaviruses have nucleotides. I guess partial sequences (and
| thus reduced comp time?) might do the trick but it's probably
| hard to target those partial reference sequences with a long-
| read method like NanoPore.
| alwaysdoit wrote:
| If you repeat the process many times will it reduce that
| error rate, or are the errors non-independent?
| staplung wrote:
| Unfortunately, with nanopore the errors are biased so you
| tend to get errors in the same places. All sequencing
| techniques also have error rates but some are unbiased so
| running a single sample through (which will usually have
| many, many copies of any sequence) will average out to a
| good read of the sequence.
|
| Some good info on next-gen sequencing techniques:
| https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3841808/
| tdido wrote:
| Still, some of the errors can be compensated for with
| more coverage. So if you can manage 20-30X you're left
| with the homopolymer problem (nanopores can't tell how
| long a stretch of the same repeated nucleotide is,
| because you can't control how long the sensed kmer stays
| in the pore), but lots of other types can be improved
| quite a lot.
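| As a back-of-the-envelope aid for the "20-30X" figure: mean coverage
| is just total sequenced bases divided by genome length. The helper
| names below are illustrative, and the 30,000-base genome is a round
| number in the range mentioned upthread for coronaviruses, not a
| measured value.

```python
def mean_coverage(read_lengths, genome_length):
    """Average number of times each genome position is covered by a read."""
    return sum(read_lengths) / genome_length


def bases_needed(target_coverage, genome_length):
    """Total sequenced bases required to reach a target mean coverage."""
    return target_coverage * genome_length


if __name__ == "__main__":
    genome = 30_000  # assumed genome length in bases
    print(bases_needed(30, genome))              # bases for 30X
    print(mean_coverage([10_000] * 90, genome))  # ninety 10 kb reads
```

| Mean coverage says nothing about how evenly reads are spread, so
| individual regions can still end up well below the average.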
| alextheparrot wrote:
| Last time I looked into Nanopore the cost wasn't that much
| better where you'd even consider this experiment.
|
| On the other hand, when doing a genome assembly, the
| Nanopore reads are good for a draft sequence and then the
| Illumina reads can be used to polish the sequence.
| z991 wrote:
| Here's my write-up of buying one for fun:
| https://abarry.org/dna-sequencing-in-our-extra-bedroom/
| carlsborg wrote:
| From your post the thing uploads the sequenced data and their
| service generates the report. Is the raw data available?
|
| Also: truly remarkable phd thesis!
| ipsum2 wrote:
| Nice write up! How much did it cost to get an Oxford Nanopore?
| carlsborg wrote:
| $1k prices are on the website.
| danpalmer wrote:
| A close friend of mine has worked there for many years. We've
| spoken a lot about the tech.
|
| I don't know the answers to all your questions, but I do know
| that the emphasis is on research, not consumer (or hobbyist)
| use. I believe the devices are ~free, but each run requires
| using a consumable part that has to be either disposed of or
| returned for refurbishment, and I believe these are hundreds of
| dollars each.
|
| The big advancement is the size and cost of the devices, the
| fact that a lab can have one on every desk rather than a
| communal machine that you have to queue your samples up for, or
| a device you can transport in field kit.
|
| They do have cloud services that do much of the processing for
| you, but I suspect you'd want to be able to manipulate the data
| so you'd need your own data processing tools locally. It's not
| going to give you a 23andMe style report, it's more likely to
| say "yep, that's a human" vs "you're ecoli". I believe they do
| have training for how to do this data analysis, but I suspect
| this is targeted at customers on large contracts.
| Ovah wrote:
| Thank you for the practical insight. I suspected that
| NanoPores are not just yet geared towards hobbyists. I happen
| to have some bioinformatics knowhow so it's mainly a matter
| of hardware for me. As both you, u/bioinformatics and
| u/searine mention it is the overhead cost of flow cells etc
| that worries me from a hobbyist point of view.
| marsdentech wrote:
| I used to run a department at a biotech where ~50% of our data
| came from MinIONs (although, that said, I'm a bioinformatician,
| rather than a molecular biologist), so I can answer your
| questions. For (a.), you can for sure "batch" samples. The term
| of art you're looking for is "multiplexing". Nanopore provide
| prep kits that allow you to "barcode" different samples (i.e.
| tag all the molecules in a given sample with a unique,
| synthetic sequence, which allows them to be distinguished by
| software downstream), but note that (as with all DNA prep kits,
| but some more than others) you'll need access to a fair whack
| of lab equipment and consumables to use it (these kits aren't
| "all-in"). For (b.), for one anecdata point, I used to process
| a whole flow cell's data on an M4800 with a 4th Gen i7 and 32
| GB of RAM in a few hours. Most of the "high" computational
| requirements you hear about relate to either assembly or
| variant calling (both of which are downstream of just
| retrieving "usable" sequencing data); and even both of those
| I've managed on that same laptop overnight. Actually acquiring
| the data (you can delay base calling if you like, although you
| probably wouldn't need to) is real-time and only needs very
| modest hardware (IMHO the Nanopore "system requirements" are
| very much on the "safe-side".) "In the field", your challenge
| would be physically preparing the samples!
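| To make the "barcode" idea concrete, here is a toy demultiplexer. It
| is a sketch only: the barcode sequences, sample names and the
| `demultiplex` function are invented for illustration, and it compares
| read prefixes by Hamming distance, whereas real nanopore
| demultiplexers have to tolerate insertions and deletions as well.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def demultiplex(reads, barcodes, max_mismatches=1):
    """Bin each read under the sample whose barcode best matches its prefix,
    trimming the barcode off; reads with no close match stay unclassified."""
    bins = {name: [] for name in barcodes}
    bins["unclassified"] = []
    for read in reads:
        best_name, best_dist = "unclassified", max_mismatches + 1
        for name, barcode in barcodes.items():
            if len(read) < len(barcode):
                continue  # read too short to carry this barcode
            dist = hamming(read[:len(barcode)], barcode)
            if dist < best_dist:
                best_name, best_dist = name, dist
        if best_name == "unclassified":
            bins[best_name].append(read)
        else:
            bins[best_name].append(read[len(barcodes[best_name]):])
    return bins


# Hypothetical 8-base barcodes and reads, for illustration only.
BARCODES = {"sample_a": "ACGTACGT", "sample_b": "TTGGCCAA"}
READS = ["ACGTACGTGGGATTT", "TTGGCCTAGATTACA", "GGGGGGGGGGGG"]

if __name__ == "__main__":
    print(demultiplex(READS, BARCODES))
```

| The second read carries one error inside its barcode and is still
| assigned, which is why barcodes are designed to differ from each other
| at many positions.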
| searine wrote:
| They are a fun tool, great for doing molecular work in the
| field. The error rate is still very high compared to short
| reads, but if you know this and plan for it going in you should
| be fine.
|
| Flowcells last for one sample. The machine should last
| indefinitely. You can sometimes add more of the same DNA to a
| flowcell after one use to get a bit more out of it, but the
| quality degrades quickly. 500-1000 dollars each for flowcells,
| depending on how much you order.
|
| My experience in field use: I was using Oxford Nanopore's
| software, which does processing remotely, and was able to run
| the platform on just a regular 2015-era laptop.
| twobitshifter wrote:
| What is a flow cell made out of and why is the cost so high?
| searine wrote:
| It's made of plastic, glass, and the special protein pores
| which split the strand and read the DNA. Reagents and
| sample are applied to it to make the reaction happen.
|
| The flowcell gets contaminated with your sample after one
| run so they are 'one time use'. The nanopore protein
| eventually stops working also.
|
| They are expensive because doing molecular biology is
| expensive. It requires expensive machines and expensive
| reagents at atomic scales to create. Thus money is
| required.
| tdido wrote:
| Actually, one of the main features of this tech apart
| from the obvious size-factor is that it's a streaming
| process. You can analyse data on the fly and decide when
| to stop the run. Wash the flowcell, and use it for
| another sample. Eventually the pores die, yes, how fast
| depends on the sample type. I think they guarantee 48
| hours or something of the sort.
|
| The expensive part is not the chemistry. Each flowcell
| has a very expensive piece of metal that senses the very
| small current variations that each kmer causes when going
| through each pore. They've actually come up with a device
| (horribly named "flongle") that has the same shape of a
| flowcell but no pores, and the mini flowcell it uses is
| ~90USD (against ~900USD for a full flowcell). Of course,
| yield is much lower.
| nojokes wrote:
| Is the price a question of scale? If this technology would
| become commonplace, would the price go down? Are there
| patents that would prevent cheaper chemical production?
| hobofan wrote:
| I assume scale and more R&D on how to produce nanopores
| more cheaply would be the main ways to drive price down. As
| for patents, Oxford Nanopore has a pretty big portfolio for
| all things nanopore, so a direct competitor based on
| nanopores that would drive the price down seems unlikely
| (though they obviously have to compete on price with other
| sequencing methods to some degree).
| phkahler wrote:
| How does it handle repeats? I can understand reading AACCCT...
| since they say the signal depends on several letters. But what
| about 12 Gs? Or longer runs of the same letter. Is there some way
| to clock one nucleotide at a time?
| tdido wrote:
| Nope. You're working with kmers. I think it's 6mers in the
| current models. It's good because you get redundancy as you
| move, but coupled with the fact that you can't control dwelling
| time it makes repetition hard to handle.
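| A small illustration of why the sliding window gives redundancy on
| mixed sequence but nothing to hold on to inside a repeat. The `kmers`
| helper is illustrative and k=5 is an assumption (the thread mentions
| both 5 and 6).

```python
def kmers(seq, k):
    """All overlapping substrings of length k, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


if __name__ == "__main__":
    for seq in ("AACCCT", "G" * 12):
        windows = kmers(seq, 5)
        print(seq, len(windows), "windows,", len(set(windows)), "distinct")
```

| In the run of 12 Gs every window is the same k-mer, so the only clue
| to the run's length is how long that one signal lasts.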
| marsdentech wrote:
| As others have said, you're reading a sliding window of k-mers
| over the target sequence; I think for the MinION k is presently
| 5. To answer your question directly, it struggles with
| homopolymer runs, not inherently because they're low
| complexity, but actually because it's tricky to "clock" how
| many like, contiguous k-mers have passed through the pore after
| a given period of time. That is to say, for example, if your
| target sequence is "GGGGGGG" (i.e. a homopolymer run of 7 Gs),
| you'd expect to observe three like, contiguous signals (i.e. in
| current space) for the all-G 5-mer, one signal each per "clock
| cycle" (which corresponds to the dwell time of the k-mer in the
| pore). If these "clock cycles" were always constant, it's
| merely a case of dividing the "time spent on the observed all-G
| 5-mer" signal by the "time spent on one clock cycle".
| Sadly, for our purposes, there's enough wobble in any one such
| "clock cycle" that that calculation won't always yield a
| reliable result. The upshot: your "GGGGGGG" (7 Gs) target
| sequence may be registered as "GGGGGG" (6 Gs) or "GGGGGGGG" (8
| Gs), or even something else. Now, for distinguishing two
| alleles where the difference between them is, say, a doubling
| in length of an already-very-long homopolymer run, even with
| the aforementioned "clock wobble", you'd likely be able to see
| that in MinION data quite clearly. As with all things DNA
| sequencing (for the time being, at least!), your precise
| biological question will determine which (one or more)
| sequencing techniques are best for the job!
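| The "clock wobble" argument can be sketched as a toy simulation. The
| assumptions here are mine, not a model of the real pore: each k-mer's
| dwell time is drawn from a Gaussian around a mean "clock cycle"
| (clipped at zero), and the run length is estimated by dividing the
| total time on the all-G signal by that mean. `estimated_run_lengths`
| and its parameters are hypothetical names.

```python
import random
from collections import Counter


def estimated_run_lengths(true_length, k=5, mean_dwell=1.0, jitter=0.5,
                          trials=10000, seed=0):
    """Return a Counter of homopolymer lengths inferred from noisy dwell times."""
    rng = random.Random(seed)
    n_kmers = true_length - k + 1  # e.g. 7 Gs -> three all-G 5-mers
    estimates = Counter()
    for _ in range(trials):
        # Total time the identical k-mer signal is observed in the pore.
        total = sum(max(0.0, rng.gauss(mean_dwell, jitter))
                    for _ in range(n_kmers))
        # Divide by the nominal clock cycle to guess how many k-mers passed.
        n_est = max(1, round(total / mean_dwell))
        estimates[n_est + k - 1] += 1
    return estimates


if __name__ == "__main__":
    print(sorted(estimated_run_lengths(7, jitter=0.0).items()))
    print(sorted(estimated_run_lengths(7, jitter=0.5).items()))
```

| With no jitter every trial recovers 7 Gs; with jitter the estimates
| spread over neighbouring lengths, which is the 6-or-8-Gs failure
| described above.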
| phkahler wrote:
| Just a thought. If the DNA were run through 2 such holes, you
| could use a nearby non-uniform sequence to clock the reading
| of the other one. Not a magic bullet, but maybe an
| improvement. Assumes the readers can be close enough to bound
| the amount of slack between them, and that they don't
| interfere with each other.
| RocketSyntax wrote:
| Maybe if you ran the test 100 times and did some pileups by
| position it would be usable in comparison to WGS
| koeng wrote:
| If you want to see what a real run looks like, here is a little
| gist of my last Nanopore run, raw basecall -> alignment (no
| consensus)
|
| https://gist.github.com/Koeng101/abc674e1acd575646748afcbcc7...
| RocketSyntax wrote:
| Also, this is not new. It's been around for yrs
| JabavuAdams wrote:
| Maybe, but it was a good summary for me and I've been in
| biophysics for 3 years or so. Also, lots of good keywords and
| discussion generated here to follow up on. Overall, very useful
| article and discussion.
| dekhn wrote:
| schatz periodically dumps PR for attention
| u678u wrote:
| Wow I never thought of this. I understand all the controversy
| over 23&me and DNA secrecy, but it seems pretty soon it'll be
| trivial to run DNA anywhere anytime.
| garettmd wrote:
| I'm wondering about the impacts of cheap/accessible DNA
| sequencing in the future. Not just impacts to existing
| businesses, but what does it mean from a privacy perspective?
| If someone could take a strand of your hair and then get your
| genome sequence from it - what would be the implications?
| pishpash wrote:
| In the long future: total loss of privacy and identity as
| meaningful concepts.
| devops000 wrote:
| Could DNA sequence be used as a private key / seed for a Bitcoin
| wallet? Does it make sense?
| koeng wrote:
| At the 2014 DEFCON Biohacking village I did exactly that. I
| gave out like 50 tubes of plasmid, all you had to do is go
| sequence em to extract the private key, and boom, you get like
| $200 (or like 15K today...)
|
| Literally nobody did it for a couple years, so I ended up
| taking out the bitcoin to pay for more DNA synthesis a few
| years ago. I actually did delete the bitcoin private key
| though, so I had to pay for sequencing it back out...
| a-dub wrote:
| what was your encoding scheme? hash of some character
| representation was the key?
| koeng wrote:
| 2 base pairs per byte mapping. Super simple.
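| The exact mapping isn't spelled out above, so the following is only a
| guess at what a "super simple" scheme can look like: two bits per
| base (A=00, C=01, G=10, T=11), which works out to two bases per hex
| digit and four per byte. The function names are illustrative.

```python
BASES = "ACGT"  # index in this string is the 2-bit value of each base


def bytes_to_dna(data: bytes) -> str:
    """Encode bytes as DNA, most significant bit pair first."""
    return "".join(BASES[(byte >> shift) & 0b11]
                   for byte in data
                   for shift in (6, 4, 2, 0))


def dna_to_bytes(seq: str) -> bytes:
    """Decode a DNA string produced by bytes_to_dna back into bytes."""
    if len(seq) % 4:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    encoded = bytes_to_dna(bytes.fromhex("1b00ff"))
    print(encoded, dna_to_bytes(encoded).hex())
```

| One caveat with a naive mapping like this: bytes such as 0x00 or 0xff
| turn into homopolymer runs, which are the hardest thing for a nanopore
| to read back.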
| blamestross wrote:
| Same problem as all biometrics. Data about you makes for a bad
| password. It can make an ok username tho.
| WanderPanda wrote:
| Memorising the seedwords of one key + a backup key in a 1 of 2
| multisig setup seem to be a good alternative.
| CapitalistCartr wrote:
| Any password-like object has to be changeable. And easily.
| Abishek_Muthian wrote:
| The advancement in DNA sequencing tech for humans has been a
| boon for fighting extinction of other animals too. Sequencing
| bird DNA from feathers to determine their migration and check
| population was envisioned decades ago and has only been made
| possible recently due to the advancement of the tech.
|
| The Bird Genoscape Project[1] was also showcased in this
| excellent Nat Geo video[2].
|
| [1]https://www.birdgenoscape.org/
|
| [2]https://www.youtube.com/watch?v=_p43ksRgIlk
| tingletech wrote:
| seems pretty impressive. Here is the code linked in the article
| that does the signal processing to decode the sensor data into
| DNA sequences. https://github.com/skovaka/UNCALLED
| lifeisstillgood wrote:
| My first reaction after reaching the halfway point in the article
| was to check it was not April 1st already.
|
| But even on a site like Stackoverflow (hey I can trust Joel
| right?), and even after coming here and reading "hey yes we build
| / use those too" I am struggling to believe this.
|
| What else don't I know about in biotech? How far ahead is the
| industry compared to where the average man on the clapham omnibus
| thinks it is.
|
| Please stop the world I want to get off.
| koeng wrote:
| Until you also realize you need a Qubit and the library preps and
| oh now you need NEB next gen enzymes and wow turns out pipette
| technique really matters.
|
| That said, I love Nanopores, I use them in my business, and those
| error rates you can hack around if you know what's going on under
| the hood.
| tdido wrote:
| I don't think you need the Qubit with the rapid prep.
| koeng wrote:
| it works but your efficiency drops by quite a bit
| Florin_Andrei wrote:
| > _those error rates_
|
| Do a thousand readings, fix the parts that don't match across
| the board?
| dekhn wrote:
| "wow turns out pipette technique really matters" <- one of the
| most underrated comments of all time.
| jacquesm wrote:
| Boris Johnson gives a nice demonstration here:
|
| https://twitter.com/neilhall_uk/status/1355088791220985857
| dekhn wrote:
| The worst for me was coming in early and setting up gels.
| I'd drink a bunch of coffee, have shaky hands, and then
| break the gel with the pipette tip repeatedly while trying
| to jam the dna into the well.
|
| there's a reason I went into automated biological robots.
| andi999 wrote:
| Pipette skills improve rapidly if you practice with a
| microscale.
| dekhn wrote:
| that's how we calibrated ours. turns out: most pipettes in
| the lab were miscalibrated, with 50+% error. Then it turned
| out our scale wasn't properly calibrated, so we had to
| replace that too.
| samchorlton wrote:
| Exactly. Better analytics can enable this technology to produce
| better results than competing technologies in less time. Once
| automated/easy/rapid sample prep comes, there will be mass
| adoption in the space.
|
| Disclaimer: Co-Founder of BugSeq[0] 0: https://bugseq.com
| matthew_stone wrote:
| > Once automated/easy/rapid sample prep comes, there will be
| mass adoption in the space.
|
| Sounds like Elon calling biology a "software problem".
|
| Not saying that you're wrong, just saying that the
| computational folk tend to discount the challenges and skills
| required in the wet lab.
| samchorlton wrote:
| Agreed - Definitely a different class of problem than
| "software". There are large barriers, eg. lab
| contamination, biocontainment, low input protocols, etc;
| however, technological innovation _will_ help with these.
|
| That being said, we see a future where someone without
| advanced molecular training can put a sample (whether
| that's a nasal swab, concerning white powder received in
| the mail or lab-grown meat) in a black box and get out a
| meaningful report.
| phkahler wrote:
| >> Not saying that you're wrong, just saying that the
| computational folk tend to discount the challenges and
| skills required in the wet lab.
|
| It's time to bring in the industrial automation folks. They
| probably won't invent a fancy new algorithm to reduce the
| time to splice the pieces together, but they'll fine tune
| and automate your reader to the 9's.
| koeng wrote:
| Yea automated sample preps are key for me. The main thing
| that is overlooked in synthetic biology about nanopore is it
| has the capability to dramatically lower cost of indexing,
| which turns out to be one of the main prohibiting costs for
| dropping the cost of plasmid production.
___________________________________________________________________
(page generated 2021-02-03 23:01 UTC)