[HN Gopher] AlphaFold won't revolutionise drug discovery
___________________________________________________________________
AlphaFold won't revolutionise drug discovery
Author : panabee
Score : 79 points
Date : 2022-08-06 19:35 UTC (3 hours ago)
(HTM) web link (www.chemistryworld.com)
(TXT) w3m dump (www.chemistryworld.com)
| pcrh wrote:
| This article makes some good points, but is incorrect on some
| others.
|
| In particular the statement "It is very, very rare for knowledge
| of a protein's structure to be any sort of rate-limiting step in
| a drug discovery project!" does not reflect the realities of drug
| discovery.
|
| Knowing a protein's structure, and its structure when complexed
| with ligands/drugs, is a massively important bit of data in the
| armoury of medicinal chemists, which Derek Lowe knows all too
| well.
|
| Of course, it may be that knowing which protein to target is more
| important, and that problem isn't affected by AlphaFold.
| aurizon wrote:
| I think Alphafold will be of immense value. The article is too
| pessimistic. Alphafold will reveal a huge number of structural
| parameters that can be exploited by careful synthetic chemistry.
| You can know what base pair to change and see if that changes or
| blocks function. Things like Paxlovid can be tweaked towards an
| optimum. It is an analog to the way the Rosetta Stone unlocked a
| few ancient languages. With this tool a huge number of testable
| structures will be amenable to tweaking, since we now have online
| ordering of almost any sequence. The tedious step will be the wet
| testing, but that has been solved by the multiple well test
| slides. We can now sequence any protein via dye/pore/emf methods.
| https://en.wikipedia.org/wiki/Nanopore_sequencing
| zmmmmm wrote:
| This take seems to be somewhat over sceptical and slightly over-
| reaching to me.
|
| > when your entire computational technique is built on finding
| analogies to known structures, what can you do when there's no
| structure to compare to
|
| Lots of people seem focused on the idea that deep networks can't
| do anything novel and are just like fancy search engines that
| find a similar example and copy it. This is _not true_. They do
| learn from much deeper low level structures in the domain they
| are exposed to. They can be aware of implicit correlations and
| constraints that are totally outside what may be recognised in
| the scientific understanding. Hence AlphaFold is quite capable of
| predicting a structure for which there is no previous direct
| "analogy". As long as the protein has to follow the laws of
| physics then AlphaFold has at least a basis to work from in
| successfully predicting the structure.
|
| > It is very, very rare for knowledge of a protein's structure to
| be any sort of rate-limiting step in a drug discovery project!
|
| This and the following text are very reductive. It's like saying,
| back in 1945 that nuclear weapons would not be any sort of
| advantage in WW2 because it is very rare for weapons of mass
| destruction to win a war. Well yes it was rare, because they
| didn't exist. And so too did we not have a meaningfully accurate
| way to predict protein structures until AlphaFold. We've barely
| even begun to exploit the possible new opportunities for how to
| use that. And people have barely scratched the surface in
| adapting AlphaFold to tackle the related challenges downstream
| from straight up structure prediction. Predicting formation of
| complexes and interactions is the obvious next step and it's
| exactly what people are doing.
|
| It's not to say that it _will_ revolutionise drug development,
| but the author's argument here is that he is confident it _will
| not_ and he really doesn't assert much evidence of that.
| ramraj07 wrote:
| If you're gonna get mad and quote a sentence to rail on the
| author, at the least quote the full sentence: the author ends
| it with "and there never will be." Because among other things
| he's talking about intrinsically disordered proteins[1]. What
| can the best prediction model do to predict the truly
| unpredictable? Just tell us that it's unpredictable.
|
| And what is your second criticism exactly? The author comes
| from the drug discovery industry. What the author said should
| be generalized to: even if we know the perfect experimentally
| confirmed 1A resolution structure of every protein out there
| tomorrow, that won't exactly revolutionize drug discovery.
| That's because protein structure gives maybe 10% of the context
| you need to successfully design a drug. Dynamics, higher
| order interaction specifics, complex interplay in signaling
| pathways in particular cells in particular contexts, and what
| entire cells and organ systems in THAT PARTICULAR ORGANISM do
| when this protein is perturbed are what truly affect drug
| discovery.
|
| If you absolutely want to revolutionize DD, find us a better
| model to test things on that's closer to the human body as a
| whole. Currently mice and rats are used and they're not cutting
| it anymore.
|
| This fundamentally goes back to the downfall of the
| prototypical math or software guy trying to come and say "im
| gonna cure cancer with MATH!" No you're not. You're gonna help,
| and it's appreciated, but if you're gonna truly cure cancer you
| better start stomping on a few thousand mice and maybe also get
| an MD.
|
| 1. https://www.nature.com/articles/nrm3920
| vcdimension wrote:
| This proposed improvement in the way RCTs are conducted could
| have a big impact on the speed of drug discovery:
| https://arxiv.org/pdf/1810.02876.pdf
| bilsbie wrote:
| > Forming these coils, loops, and sheets is what proteins
| generally do, but 'why?' doesn't enter into it.
|
| How do we know the model hasn't figured out some of the 'whys'
| somewhere in there?
| bigdict wrote:
| Because it learns a conditional distribution. It doesn't work
| on figuring out why the distribution is the way it is.
| lrem wrote:
| Your question fundamentally falls into the area of unanswerable
| philosophy akin to "do insects feel pain?"
|
| But there's a reasonable intuition suggesting that the answer
| to your question is "no". What we're looking at is a non-linear
| regression model reproducing the function (which according to
| the article isn't really a function, but that's above both my
| and the model's knowledge) from a gene sequence to a 3d
| structure. It is heavily meta-optimised, so the "why's" would
| only be in the model if reproducing the process of folding the
| protein was the cheapest way to guess the structure.
| Intuitively it introduces at least one extra dimension, so
| should be way more expensive than finding analogues among known
| sub-aspects of the function. Hence, I would expect none of the
| "why's" to be in there.
|
| Sadly, if any _insight_ for the "why's" was there after all,
| we don't have a method to extract it anyway.
|
| Disclaimer: I work in Google, far away from DeepMind, have no
| internal knowledge on this.
| salty_biscuits wrote:
| "Sadly, if any insight for the "why's" was there after all,
| we don't have a method to extract it anyway."
|
| This has been my central frustration with working in ML.
| People always expect a "why" to exist, and by why I mean a
| cogent narrative explanation to complex phenomena. Maybe
| there is no "why" like this for a bunch of physical
| phenomena, maybe it is just a bunch of low level intricate
| stuff interacting in complex ways. There might be an emergent
| model that you can get a useful predictive model for with an
| ML model, then people get mad because the prediction doesn't
| solve the real meta problem that they were expecting to solve
| via the sub problem (e.g. solve folding then get mad because
| folding itself turns out not to be super useful because we
| don't know which protein to target, solve image
| classification then get mad because that doesn't make it easy
| to make a self driving car, etc, etc). "More is different" is
| definitely an idea in physics that needs to propagate into
| other fields to temper our expectations.
| sgt101 wrote:
| The "why" is a bit of an odd question anyway - the structure is
| as the structure is. It's like asking why "red hears a
| galaxy", just words.
| freemint wrote:
| Well, no. A bunch of mechanisms have models of lower
| complexity that have almost exactly the same predictive power
| but a completely different structure. Those higher order
| structures are the "why".
|
| Why does a cube on an inclined plane start to slide? You
| act like the correct answer is "because the subatomic
| particles and space time in the light cone of the experiment
| made it that way" when one should expect "because the sine of
| the incline angle times mass times local gravity became
| bigger than the static friction between the surface and the
| cube, which scales with cos(incline angle) times the cube's
| weight at no incline" which is a lot simpler.
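|
| A rough sketch of the sliding condition described above,
| assuming the textbook static-friction model: the cube starts to
| slide once m*g*sin(angle) exceeds mu_s*m*g*cos(angle), which
| reduces to tan(angle) > mu_s. The function names and the
| coefficient mu_static are illustrative assumptions, not
| anything from the thread.

```python
import math


def starts_sliding(angle_deg: float, mu_static: float) -> bool:
    """Return True if a block on an incline overcomes static friction.

    Assumes the simple Coulomb friction model:
      driving force    = m * g * sin(angle)
      maximum friction = mu_static * m * g * cos(angle)
    Mass and gravity cancel, so only the angle and mu_static matter.
    """
    theta = math.radians(angle_deg)
    return math.sin(theta) > mu_static * math.cos(theta)


def critical_angle_deg(mu_static: float) -> float:
    """Angle (in degrees) above which the block starts to slide."""
    return math.degrees(math.atan(mu_static))


# Hypothetical friction coefficient, chosen only for illustration.
mu = 0.5
print(starts_sliding(20.0, mu))
print(starts_sliding(30.0, mu))
print(round(critical_angle_deg(mu), 2))
```

| The point of the comparison survives in the code: one inequality
| in two parameters predicts the outcome without any reference to
| the particles involved.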
| pelorat wrote:
| It's probably not even possible for a human to understand the
| "why"
| summerlight wrote:
| The modern world is not simple enough to allow a single
| paper or technology to revolutionize anything. I don't understand
| why people are reiterating this obvious fact over and over. Most
| of the technological breakthroughs are usually a culmination of
| decades of research and investments.
| evouga wrote:
| My observation is that breathless hype pieces proclaiming that
| a new technology will imminently revolutionize area X outnumber
| the articles expressing common-sense skepticism about the
| technology, by two to three orders of magnitude.
| xiphias2 wrote:
| While the article is correct that knowing the protein structures
| in itself is not that interesting, it's a prerequisite step to
| predicting interactions between proteins, which is super
| interesting for drug discovery.
|
| What's encouraging is the rate of progress, not what has already
| been done.
| AlbertCory wrote:
| In the early 2000s, I took a bunch of UCSC Extension courses on
| mol bio, bioinformatics, and drug discovery. Back then, abundant
| DNA information was the thing revolutionizing the field.
|
| What the scientists (all from Roche) said was, more or less,
| "yeah, that helps a lot. It doesn't solve the whole problem,
| though."
|
| 20 years later they've gotten yet more help with Alphafold. Once
| again, they can do things faster, but it isn't a Moore's Law-type
| of change. It's still a really hard problem demanding culture,
| animal, and human tests, and those take time and money.
| aabhay wrote:
| The author doesn't answer the question. If not this, then what
| will? Because as far as I can tell, we're nowhere near extracting
| the full value of AI-generated protein structures. Why plant this
| flag and be wrong later if you have no real idea of what should
| be done instead?
| ChrisRackauckas wrote:
| The questions he asks here are exactly the questions that
| quantitative systems pharmacology (QSP) seeks to answer (and as
| a result, it's booming as a field). Just because you can build
| a drug to inactivate said protein doesn't mean you should. 85%
| of clinical trials fail as he states, and one of the main
| reasons why is because the target ends up being incorrect.
| Targeting some protein because a lot of it seems to exist when
| a given disease is occurring might end up targeting the symptom
| instead of the cause. Understanding how the complex systems
| interact, their feedbacks and their nonlinearities, is
| essential to knowing what needs to be targeted. We had already
| been able to quickly create new drug candidates, and with
| protein folding predictions we can now do that even faster.
| Those drugs can be tested in a lab to see if they bind to the
| proteins they're supposed to, and they keep getting quicker at
| hitting exactly the function they expected. But without making
| the billion dollar clinical trial more likely to be solving the
| actual problem, we're still going to be limited by "okay, so
| what in this pool of possible drugs should we risk trying
| next"? We can accurately knock out protein function, but we're
| still fishing in the dark when it comes to how to actually fix
| and regulate bodies.
| echelon wrote:
| Because we have to be honest with ourselves. Don't tell the
| crystallographers they're no longer necessary for structure
| determination. If people flee the field and ML doesn't pan out,
| then we're worse off.
|
| Treat this as it is. An exciting approach that may help some
| now and yield fantastic results in the future. Don't count the
| chickens before they hatch.
|
| Even if the structures were entirely correct - and they're
| definitely not - there's a massive complex metabolome to figure
| out.
|
| Google is certainly milking the PR as much as they can, and
| that can be dangerous to the laymen approving research budgets.
| tigershark wrote:
| Alphafold demonstrated beyond any reasonable doubt that
| crystallography by itself is useless in certain
| circumstances. There are plenty of research groups working on
| crystallography that found the correct solution only by
| combining their data with Alphafold data. In the last
| competition, if I remember correctly, there was one protein
| that escaped crystallography for many years until they used
| an Alphafold predicted structure. I'm not really sure how you
| can simply discount these really groundbreaking results when
| crystallography provided far fewer wins in many more years.
| l33tman wrote:
| You are aware that almost all known protein structures
| come from crystallography?
| pas wrote:
| How much of the AlphaFold training data is from
| crystallography results?
| evouga wrote:
| The crystallographically-determined structures are the
| _ground truth_! [1]
|
| Saying that AlphaFold makes x-ray crystallography useless
| is like saying DALL-E makes photography useless or Copilot
| makes GitHub useless. You've got the dependency chain
| backwards.
|
| [1] (Or at least, they're treated as the ground truth---
| they don't necessarily predict the conformation of proteins
| in solution, but that's a separate topic for another
| thread).
| freemint wrote:
| Near real time (max 100 times slower than real time)
| differentiable, stochastic multi organ simulations with
| chemically accurate time and environment dependent dynamic
| structure changes at all possible binding targets or
| interactions with the body's own components and third party drugs.
|
| Without machine learning, at every-atom-is-dynamic precision, we
| are at 10^-18 L (liters) at 20 microseconds a week with a
| specialised supercomputer
| https://dl.acm.org/doi/abs/10.1145/3458817.3487397 .
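|
| A back-of-the-envelope check of the gap implied by those two
| figures, assuming "20 microseconds a week" means 20 microseconds
| of simulated time per week of wall-clock time, and taking the
| "max 100 times slower than real time" target at face value. The
| variable names are illustrative.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604800 seconds of wall-clock time


def slowdown_factor(simulated_seconds: float, wallclock_seconds: float) -> float:
    """How many times slower than real time a simulation runs."""
    return wallclock_seconds / simulated_seconds


# Figure quoted in the comment: 20 microseconds simulated per week.
current_slowdown = slowdown_factor(20e-6, SECONDS_PER_WEEK)

# Target quoted in the comment: at most 100x slower than real time.
target_slowdown = 100.0

# Speed-up still needed to reach the target, ignoring that the target
# volume (multiple organs) is also vastly larger than 10^-18 L.
required_speedup = current_slowdown / target_slowdown

print(f"current slowdown:  {current_slowdown:.3e}x")
print(f"required speed-up: {required_speedup:.3e}x")
```

| Under those assumptions the simulation runs roughly 3e10 times
| slower than real time, so the stated target needs a speed-up of
| roughly 3e8 before the much larger volume is even considered.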
|
| A solution does not need that precision everywhere. However a
| machine learning proxy of such precision in every relevant
| environment including 2d surface along non mixing fluid etc for
| every likely type of interaction is required so we can be
| certain of the possible outcomes.
|
| That would allow humanity to pre-screen a bunch of edge
| conditions and check for unintended or previously explained
| side effects. The derived surrogates for environment-dependent
| reaction rates could be used in spatially distributed,
| event-based simulations with levels of precision ranging from
| atoms with position and electrons in orbits subject to
| electromagnetic force interaction, molecules as things with
| position and rotation and folding state, concentration
| gradients of those as stochastic 3d PDEs, 2d PDEs, 1d PDEs and
| ODEs of the number of moles with relevant boundary conditions.
| If we had those reaction rates down and knew of all the
| proteins and other structures, I am positive that a proxy model
| of relevant parts of the human body could achieve enough
| accuracy to be practical at pre-screening drugs with today's
| supercomputers.
| dekhn wrote:
| To revolutionize drug discovery, you need to solve a number of
| problems that ML can't really address right now.
|
| We do not have well-formed theories of the molecular details of
| many diseases. There is no immediate computational approach that
| addresses this defect. The community has had fairly simplified
| models for some time, and there's a lot of historical belief that
| by knowing protein structures in detail, we can understand the
| nature of a disease through its molecular etiology, and from
| that, we can make drugs (either small molecules or biomolecules)
| that modulate proteins in rational ways to eliminate the disease
| with a minimum of side effects.
|
| In my mind, much of the problem is similar to modern deep
| learning compared to previous techniques. Several extremely
| challenging problems (high accuracy voice recognition, image
| recognition, object detection) simply were not solvable through
| the statistical techniques and mental models adopted by the
| practitioners. It is now abundantly obvious that stupidly simple
| deep networks can be pretrained on enormous amounts of labelled
| data, or even unlabelled data, but we didn't even have the
| ability to know this confidently until we had the right network
| architectures, enough high quality labelled data, and adequate
| compute power to train them.
|
| I believe that by starting to think about disease modelling from
| the same mindset as deep learning (simple models with many
| parameters, the models don't actually represent the assumed
| mechanism, large amounts of high quality data, lots of CPU, GPU,
| and RAM) and also thinking of the disease treatment process in
| the same way will greatly increase our ability to "understand"
| and "treat" diseases, while knowing far less about their
| underlying mechanism than we thought.
|
| A common example is disease/patient stratification. If you've
| developed drugs that treat disease A, but it turns out later,
| there are really two diseases, A1 and A2 with different
| underlying mechanisms but superficially similar exterior
| symptoms, you'll realize why some percentage of your population
| didn't get better (and often got worse, given the underlying
| toxicity of some medicines). If we could just stratify diseases
| better, and classify patients into the right bins, the
| effectiveness of medicine will go up (and drugs will get through
| clinical trials faster/better).
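|
| A toy illustration of the stratification point above. All
| numbers are hypothetical and chosen only to show the arithmetic:
| if a trial population is a mix of two hidden subtypes, the
| observed response rate is a weighted average that can hide a
| drug that works well on one subtype.

```python
def blended_response_rate(frac_a1: float, rate_a1: float, rate_a2: float) -> float:
    """Response rate observed when subtypes A1 and A2 are pooled.

    frac_a1 is the fraction of patients with subtype A1; the rest are
    assumed to have subtype A2. rate_a1 and rate_a2 are the per-subtype
    response rates.
    """
    if not 0.0 <= frac_a1 <= 1.0:
        raise ValueError("frac_a1 must be between 0 and 1")
    return frac_a1 * rate_a1 + (1.0 - frac_a1) * rate_a2


# Hypothetical trial: 40% of patients have A1 and respond 80% of the
# time; 60% have A2 and respond 10% of the time.
pooled = blended_response_rate(0.4, 0.8, 0.1)
stratified = blended_response_rate(1.0, 0.8, 0.1)  # enrol A1 patients only

print(f"pooled response rate:     {pooled:.2f}")
print(f"stratified response rate: {stratified:.2f}")
```

| With these made-up inputs the pooled trial looks mediocre even
| though the drug works well for one subtype, which is the case
| for classifying patients into the right bins before the trial.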
|
| None of this addresses the later-stage issues, such as
| successfully running all the phases of a clinical trial and the
| other gauntlets you must pass in order to get a drug FDA-
| approved.
|
| I would continue to expect marginal improvements for the
| foreseeable future. But be aware: some companies already have
| managed to do a good enough job developing new medicines that
| they routinely create multi-billion-dollar blockbuster drugs year
| after year after year (my employer, Genentech, is a perfect
| example of that). It maintains an enormous and well-funded R&D
| arm that expends untold neurons attempting to understand
| disease better even before we start to consider something as
| "druggable".
| curious_cat_163 wrote:
| Mapping DNA sequences to 3D protein structure is the problem that
| AlphaFold tries to solve. I don't think it tries to solve for
| "drug discovery".
|
| I suspect that, like any ML problem, this one is a small part of
| the whole solution of drug discovery. There are always system-
| level dynamics at play.
|
| To me, some relevant questions before deciding to take on an ML
| problem tend to be:
|
| [x] Does solving it eliminate manual labor from the process?
| [x] Does it save $ in the progress towards solving the whole
|     problem?
| [x] Is it fun to solve it?
| microSnowball wrote:
| I think alphafold gets hated on too much. It won't revolutionize
| things but I bet people are out there right now looking at
| different structures and motifs only seen on alphafold to get a
| better idea on how existing drugs bind and affect them. And then
| designing analogues and so on. Time will tell, I guess.
|
| It's kind of like anything in research, lots of small steps
| enable revolutionary breakthroughs every so often.
| fabian2k wrote:
| You can assume that any known drug target has experimentally
| determined structures available, once you spend the enormous
| amounts of effort necessary to put a drug through real clinical
| trials the effort to determine the target structure is pretty
| much irrelevant.
|
| Of course there are plenty of drugs where we either don't know
| where they bind or we're probably wrong about where we think
| they bind. Or they bind at multiple places and some desirable
| or non-desirable effects are due to binding at places we don't
| know yet.
|
| There are real uses to having lots of high-quality structure
| predictions for proteins. Drug development is something that
| only gets limited benefits here. If you want to know how drugs
| or drug candidates bind to proteins you first create a protein
| structure with X-ray crystallography. Then you soak your
| crystals with your drugs or drug candidates and determine even
| more structures. The interesting part here is not necessarily
| the overall fold of the protein (which is mostly what AlphaFold
| gives you) but e.g. a single hydrogen bond to the drug in the
| active pocket of the target protein. You need really high-
| quality data if you want to do any kind of rational drug
| design, most of the time we still just semi-randomly vary
| structures until they bind better as far as I understand.
| epistasis wrote:
| I think it gets marketed too much and hated on too much.
|
| Given the utter dominance of Google advertising, I think the
| hating is a necessary counter in order to at least place it in
| its right place.
|
| Whatever skill Google has computationally is more than matched
| by their media dominance and public relations prowess.
| mtlmtlmtlmtl wrote:
| I find this view very strange. If you apply the same logic to
| politics, the outcome is pretty grim. And we've been seeing
| more and more of that.
|
| I don't like hype or hate that's devoid of nuance. But actual
| scientists working in these fields don't generally pay
| attention to these things as much as we might. They read the
| papers, and they have years of training to help them decide
| what is overhyped and what isn't. I'm not sure what happens
| on HN or in advertising channels has such a huge bearing on
| this.
| [deleted]
| lrem wrote:
| I _love_ this article. It nicely answers the question I posed
| (https://news.ycombinator.com/threads?id=lrem#32263287) in the
| discussion of the original announcement: is today's db good
| enough to be a breakthrough for something useful, e.g. pharma or
| agriculture? And the answer, somewhat unsurprisingly, seems to be
| "useful, but not life-changing". And that's a perfectly good
| result in my eyes :)
| frozencell wrote:
| OpenAI's GPT-3 and DALL-E 2 might be life-changing for their
| creative users, writers and illustrators or beginner creators,
| I can't remember any life-changing use case for groups (outside
| of the creators themselves). For ML researchers, transformers
| seem to not be used as AGI at all (despite general or multi-
| modal potential) but mostly used for test and probability tool.
| p1esk wrote:
| _For ML researchers, transformers seem to not be used as AGI
| at all (despite general or multi-modal potential) but mostly
| used for test and probability tool._
|
| What do you mean?
| SilasX wrote:
| That link goes to the top of your comment history. I think you
| want this link to ensure you see the right comment:
|
| https://news.ycombinator.com/item?id=32263287
___________________________________________________________________
(page generated 2022-08-06 23:00 UTC)