[HN Gopher] AlphaFold 3 predicts the structure and interactions ...
___________________________________________________________________
AlphaFold 3 predicts the structure and interactions of life's
molecules
Author : zerojames
Score : 683 points
Date : 2024-05-08 15:07 UTC (7 hours ago)
(HTM) web link (blog.google)
(TXT) w3m dump (blog.google)
| s1artibartfast wrote:
| The article was heavy on the free research aspect, but light on
| the commercial application.
|
| I'm curious about the business strategy. Does Google intend to
| license out tools, partner, or consult for commercial partners?
| ilrwbwrkhv wrote:
| as soon as google tries to think commercially this will shut
| down so the longer it stays pure research the better. google is
| bad with productization.
| s1artibartfast wrote:
| I don't think it was ever pure research. The article talks
| about Isomorphic Labs, which is the commercial branch for
| drug discovery.
|
| I do agree that Google seems bad at commercialization, which
| is why I'm curious on what the strategy is.
|
| It is hard to see them being paid consultants or effective
| partners for pharma companies, let alone developing drugs
| themselves.
| candiodari wrote:
| I wonder what the license for RoseTTAFold is. On github you
| have:
|
| https://github.com/RosettaCommons/RoseTTAFold/blob/main/LICE...
|
| But there's also:
|
| https://files.ipd.uw.edu/pub/RoseTTAFold/Rosetta-DL_LICENSE....
|
| Which is it?
| weregiraffe wrote:
| s/predicts/attempts to predict
| jasonjmcghee wrote:
| The title OP gave accurately reflects the title of Google's
| blog post. Title should not be editorialized.
| jtbayly wrote:
| Unless the title is clickbait, which it appears this is...
| matt-attack wrote:
| Syntax error
| adrianmonk wrote:
| Legal without the trailing slash in vi!
| pbw wrote:
| A prediction is a prediction; it's not necessarily a correct
| prediction.
|
| The weatherman predicts the weather. Even if he's sometimes
| wrong, we don't say "he attempts to predict" the weather.
| dekhn wrote:
| AlphaFold has been widely validated- it's now appreciated that
| its predictions are pretty damn good, with a few important
| exceptions, instances of which are addressed with the newer
| implementation.
| AtlasBarfed wrote:
| "pretty damn good"
|
| So... what percentage of the time? If you made an AI to pilot
| an airplane, how would you verify its edge conditions, you
| know, like plummeting out of the sky because it thought it
| had to nosedive?
|
| Because these AIs are black box neural networks, how do you
| know they are predicting things correctly for things that
| aren't in the training dataset?
|
| AI has so many weasel words.
| dekhn wrote:
| As mentioned elsewhere in this thread and trivially
| determinable by reading, AF2 is constantly being evaluated
| in blind predictions where the known structure is hidden
| until after the prediction. There's no weasel here; the
| process is well-understood and accepted by the larger
| community.
| Metacelsus wrote:
| From: https://www.nature.com/articles/d41586-024-01383-z
|
| >Unlike RoseTTAFold and AlphaFold2, scientists will not be able
| to run their own version of AlphaFold3, nor will the code
| underlying AlphaFold3 or other information obtained after
| training the model be made public. Instead, researchers will have
| access to an 'AlphaFold3 server', on which they can input their
| protein sequence of choice, alongside a selection of accessory
| molecules. [. . .] Scientists are currently restricted to 10
| predictions per day, and it is not possible to obtain structures
| of proteins bound to possible drugs.
|
| This is unfortunate. I wonder how long until David Baker's lab
| upgrades RoseTTAFold to catch up.
| wslh wrote:
| The AI ball is rolling fast; I see similarities with
| cryptography in the 90s.
|
| I have a story to tell for the record: back in the 90s we
| developed a home-banking app for the Palm (with a modem). RSA
| was too slow to be usable, so I contacted the CEO of Certicom,
| which had the only elliptic-curve cryptography implementation
| at the time. Fast forward, and ECC is everywhere.
| l33tman wrote:
| That sucks a bit. I was just wondering why they are touting
| that 3rd-party company, which commercialises research tools,
| in their own blog post. Maybe there are some corporate
| agreements with them that prevent them from opening the
| system...
|
| Imagine the goodwill for humanity from releasing these pure
| research systems for free. I just have a hard time
| understanding how you can justify keeping it closed. Let's
| hope it will be replicated by someone who doesn't have to hide
| behind the "responsible AI" curtain as it seems they are now.
|
| Are they really thinking that someone who needs to predict 11
| structures per day is more likely to be a nefarious evil
| protein guy than someone who predicts 10 structures a day? Was
| AlphaFold-2 (which was open-sourced) used by evil researchers?
| staminade wrote:
| Isomorphic Labs? That's an Alphabet owned startup run by
| Denis Hassabis that they created to commercialise the
| Alphafold work, so it's not really a 3rd party at all.
| SubiculumCode wrote:
| There is at least some difference between a monitored server
| and a privately run one, if negative consequences are
| possible.
| perihelions wrote:
| - _" Imagine the goodwill for humanity for releasing these
| pure research systems for free."_
|
| The entire point[0] is that they want to sell an API to drug-
| developer labs, at exclusive-monopoly pricing. Those labs in
| turn discover life-saving drugs, and recoup their costs from
| e.g. parents of otherwise-terminally-ill children--again,
| priced as an exclusive monopoly.
|
| [0] As signaled by _" it is not possible to obtain structures
| of proteins bound to possible drugs"_
|
| It's a massive windfall for Alphabet, and it'd be a profound
| breach of their fiduciary duties as a public company to do
| anything other than lock-down and hoard this API, and squeeze
| it for every last billion.
|
| This is a deeply, deeply, deeply broken situation.
| karencarits wrote:
| What is the current status of drugs where the major
| contribution is from AI? Are they protectable like other
| drugs? Or are they more copyless like AI art and so on?
| iknowstuff wrote:
| Is it broken if it yields new drugs? Is there a system that
| yields more? The whole point of capitalism is that it
| incentivizes this in a way that no other system does.
| l33tman wrote:
| My point one level up in the comments was not really that
| the system is broken, but rather to ask how you can run
| these companies (Google and that other arm run by the
| DeepMind founder, who I bet already has more money than he
| can ever spend) and still sleep well knowing you're the
| rich capitalist a-hole commercializing life-science work
| that your parent company allocated maybe one part in a
| million of its R&D budget to creating.
|
| It's not like Google is ever going to make billions on
| this anyway; the AlphaFold algorithms are not super
| advanced, and you don't need GPT-4-scale datasets to
| train them, so others will hopefully catch up.. though I'm
| also pretty sure it requires GPU-hours beyond what a
| typical non-profit academic outfit has available,
| unfortunately.. :/
| lupire wrote:
| The parents of those otherwise terminally ill children
| disagree with you in the strongest possible terms.
| goggy_googy wrote:
| What makes this such a "deeply broken situation"?
|
| I agree that late-stage capitalism can create really tough
| situations for poor families trying to afford drugs. At the
| same time, I don't know any other incentive structure that
| would have brought us a breakthrough like AlphaFold this
| soon. For the first time in history, we have ML models that
| are beating out the scientific models by huge margins. The
| very fact that this comes out of the richest, most
| competitive country in the history of the world is not a
| coincidence.
|
| The proximate cause of the suffering for terminally-ill
| children is really the drug company's pricing. If you want
| to regulate this, though, you'll almost certainly have
| fewer breakthroughs like AlphaFold. From a utilitarian
| perspective, by preserving the existing incentive structure
| (the "deeply broken situation" as you call it), you will be
| extending the lifespans of _more people in the future_ (as
| opposed to extending lifespans of more people now by
| lowering drug prices).
| firefoxbrower wrote:
| Late-stage capitalism didn't bring us AlphaFold; scientists
| did. Late-stage capitalism just brought us Alphabet swooping
| in at literally the last minute.
| Socialize the innovation because that requires potential
| losses, privatize the profits, basically. It's
| reminiscent of "Heroes of CRISPR," where Doudna and
| Charpentier are supposedly just some middle-men, because
| stepping in at the last minute with more funding is
| really what fuels innovation.
|
| AlphaFold wasn't some lone-genius breakthrough that came
| out of nowhere; everything but the final steps was
| basically created in academia through public funding. The
| key insights (some combination of realizing that the
| sequence-to-structure-to-function relationship puts
| analyzable constraints on sequence conservation, and which
| ML models could be applied to it) were made in academia
| a long time ago. AlphaFold's training set, the PDB, is
| also a result of decades of publicly funded work. After
| that, the problem was just getting enough funding amidst
| funding cuts and inflation to optimize. David Baker at
| IPD did so relatively successfully, Jinbo Xu is less of a
| fundraiser but was able to keep up basically alone with
| one or two grad students at a time, etc. AlphaFold1 threw
| way more people and money at basically copying what Jinbo Xu
| had already done, and barely beat him at that year's CASP.
| Academics were leading the way until very, very recently,
| it's not like the problem was stalled for decades.
|
| Thankfully, the funding cuts will continue until research
| improves, and after decades of inflation cutting into
| grants, we are being rewarded by funding cuts to almost
| every major funding body this year. I pledge allegiance
| to the flag!
|
| EDIT: Basically, if you know any scientists, you know the
| vast majority of us work for years with little
| consideration for profit because we care about the
| science and its social impact. It's grating for the
| community, after being treated worse every year, to then
| see all the final credit go to people or companies like
| Eric Lander and Google. Then everyone has to start over,
| pick some new niche that everyone thinks is impossible,
| only to worry about losing it when someone begins to get
| it to work.
| iknowstuff wrote:
| Why haven't the academics created a non-profit foundation
| with open-source models like this, then? If Alphabet
| doesn't provide much, they will be supplanted by non-
| profits. I see nothing broken here.
| firefoxbrower wrote:
| Individual labs somehow manage to do that and we're all
| grateful. Martin Steinegger's lab put out ColabFold,
| RELION is the gold standard for cryo-EM despite being
| academic software and despite more recent industry
| competitors like cryoSPARC. Everything out of
| the IPD is free for academic use. Someone has to fight
| like hell to get all those grants, though, and from a
| societal perspective, it's basically needlessly redundant
| work.
|
| My frustrations aren't with a lack of open source models,
| some poor souls make them. My disagreement is with the
| perception that academia has insufficient incentive to
| work on socially important problems. Most such problems
| are ONLY worked on in academia until they near the finish
| line. Look at Omar Yaghi's lab's work on COFs and MOFs
| for carbon/emission sequestration and atmospheric water
| harvesting. Look at all the thankless work numerous labs
| did on CRISPR-Cas9 before the Broad Institute even
| touched it. Look at Jinbo Xu's work, on David Baker's
| lab's and the IPD's work, etc. Look at what labs first
| solved critical amyloid structures, infuriatingly
| recently, considering the massive negative social impacts
| of neurodegenerative diseases.
|
| It's only rational for companies that only care about
| their own profit maximization to socialize R&D costs and
| privatize any possible gains. This can work if companies
| aren't being run by absolute ghouls who delay the release
| of a new generation of drugs to minimize patent-duration
| overlap, or who push things that don't work for short-term
| profit. This can
| also work if we properly fund and credit publicly funded
| academic labs. This is not what's happening, however;
| instead, publicly funded research is increasingly demeaned,
| defunded, and dismantled due to the false impression that
| nothing socially valuable gets done without a profit
| motive. It's okay, though; I guess under this kind of LSC
| worldview everything always corrects itself, so preempting
| problems doesn't matter. We'll finally learn how much
| actual innovation is publicly funded when we get
| the Minions movie, aducanumab, and WeWork over and over
| again for a few decades while strangling the last bit of
| nature we have left.
| j-wags wrote:
| I work at Open Force Field [1] which is the kind of
| nonprofit that I think you're talking about. Our sister
| project, OpenFold [2], is working on open source versions
| of AlphaFold.
|
| We're making good progress, but it's difficult to bridge
| the fundamentally different organizational models of
| academia and industry. I'm hoping that
| this model will become normalized in the future. But it
| takes serious leaps of faith from all involved
| (professors, industry leaders, grant agencies, and - if I
| can flatter myself - early career scientists) to leave
| the "safe route" in their organizations and try something
| like this.
|
| [1] https://openforcefield.org/ [2] https://openfold.io/
| Jerrrry wrote:
| The second amendment prevents the government's overreaching
| perversion to restrict me from having the ability to print
| biological weapons from the comfort of my couch.
|
| Google has no such restriction.
| SubiculumCode wrote:
| /s is strong with this one
| gameman144 wrote:
| I know this is tongue in cheek, but you absolutely can be
| restricted from having a biological weapons factory in your
| basement (similar to not being able to pick "nuclear bombs"
| as your arms to bear).
| timschmidt wrote:
| Seems like the recipe for independence, and agreed-upon
| borders, and thus whatever interpretation of the second
| amendment one wants, involves exactly choosing nuclear
| bombs, and managing to stockpile enough of them before
| being bombed oneself. At least at the nation-state scale.
| Sealand certainly resorted to arms at several points in
| its history.
| gameman144 wrote:
| The second amendment only applies to the United States --
| it's totally normal to have one set of rights for
| citizens and another set for the government itself.
| dekhn wrote:
| Sergey once said "We don't have an army per se" (he was
| referring to the size of Google's physical security group) at
| TGIF.
|
| There was a nervous chuckle from the audience.
| rolph wrote:
| in other words, this has been converted to a novelty, and has
| no use for scientific purposes.
| ebiester wrote:
| No. It just means that scientific purposes will have an
| additional tax paid to google. This will likely reduce use in
| academia but won't deter pharmaceutical companies.
| mhrmsn wrote:
| Also no commercial use, from the paper:
|
| > AlphaFold 3 will be available as a non-commercial usage only
| server at https://www.alphafoldserver.com, with restrictions on
| allowed ligands and covalent modifications. Pseudocode
| describing the algorithms is available in the Supplementary
| Information. Code is not provided.
| moralestapia wrote:
| How easy/hard would it be for the scientific community to
| come up with an "OpenFold" model which is pretty much AF3
| but fully open source and without restrictions?
|
| I can imagine training will be expensive, but I don't think
| it will be at a GPT-4 level of expensive.
| dekhn wrote:
| already did it, https://openfold.io/
| https://github.com/aqlaboratory/openfold
| https://www.biorxiv.org/content/10.1101/2022.11.20.517210v1
| https://lupoglaz.github.io/OpenFold2/
| https://www.biospace.com/article/releases/openfold-
| biotech-a...
|
| I really have to emphasize that transformers have literally
| transformed science in only a few years. Truly
| extraordinary.
| moralestapia wrote:
| Oh, nice! Thanks for sharing.
| p3opl3 wrote:
| Yes, because that's going to stop competitors... it's why
| they didn't release the code, I guess.
|
| This is yet another large part of a biotech related Gutenberg
| moment.
| natechols wrote:
| The DeepMind team was essentially forced to publish and
| release an earlier iteration of AlphaFold after the Rosetta
| team effectively duplicated their work and published a
| paper about it in Science. Meanwhile, the Rosetta team just
| published a similar work about co-folding ligands and
| proteins in Science a few weeks ago. These are hardly the
| only teams working in this space - I would expect progress
| to be very fast in the next few years.
| dekhn wrote:
| How much has changed- I talked with David Baker at CASP
| around 2003 and he said at the time, while Rosetta was
| the best modeller, every time they updated its models
| with newly determined structures, its predictions got
| worse :)
| natechols wrote:
| It's kind of amazing in retrospect that it was possible
| to (occasionally) produce very good predictions 20 years
| ago with at least an order of magnitude smaller training
| set. I'm very curious whether DeepMind has tried trimming
| the inputs back to an earlier cutoff point and re-
| training their models - assuming the same computing
| technologies were available, how well would their methods
| have worked a decade or two ago? Was there an inflection
| point somewhere?
| pantalaimon wrote:
| What's the point in that - I mean, who does non-commercial
| drug research?
| sangnoir wrote:
| Academia
| karencarits wrote:
| Public universities?
| obmelvin wrote:
| If you need to submit to their server, I don't know who would
| use it for commercial reasons anyway. Most biotech startups
| and pharma companies are very careful about entering
| sequences into online tools like this.
| ranger_danger wrote:
| Not just unfortunate, but doesn't this make it completely
| untrustable? How can you be sure the data was not modified in
| any way? How can you verify any results?
| dekhn wrote:
| You determine a crystal structure of a known protein which
| does not previously have a known structure, and compare the
| prediction to the experimentally determined structure.
|
| There is a biennial competition known as CASP
| where some new structures, not yet published, are used for
| testing predictions from a wide range of protein structure
| prediction (so, basically blind predictions which are then
| compared when the competition wraps up). AlphaFold beat all
| the competitors by a very wide margin (much larger than the
| regular rate of improvement in the competition), and within a
| couple years, the leading academic groups adopted the same
| techniques and caught up.
|
| It was one of the most important and satisfying moments in
| structure prediction in the past two+ decades. The community
| was a bit skeptical but as it's been repeatedly tested,
| validated, and reproduced, people are generally of the
| opinion that DeepMind "solved" protein structure prediction
| (with some notable exceptions), and did so without having to
| solve the full "protein folding problem" (which is actually
| great news while also being somewhat depressing).
| ranger_danger wrote:
| By data I meant between the client and server, nothing
| actually related to how the program itself works, but just
| the fact that it's controlled by a proprietary third party.
| tepal wrote:
| Or OpenFold, which is the more literal reproduction of
| AlphaFold 2: https://github.com/aqlaboratory/openfold
| LarsDu88 wrote:
| Time for an OpenFold3? Or would it be an OpenFold2?
| photochemsyn wrote:
| Well, it's because you can design deadly viruses using this
| technology. Viruses gain entry to living cells via cell-surface
| receptor proteins whose normal job is to bind signalling
| molecules, alter their conformation and translate that external
| signal into the cellular interior where it triggers various
| responses from genomic transcription to release of other signal
| molecules. Viruses hijack such mechanisms to gain entry to
| cells.
|
| Thus if you can design a viral coat protein to bind to a human
| cell-surface receptor, such that it gets translocated into the
| cell, then it doesn't matter so much where that virus came
| from. The cell's firewall against viruses is the cell membrane,
| and once inside, the biomolecular replication machinery is very
| similar from species to species, particularly within restricted
| domains, such as all mammals.
|
| Thus viruses from rats, mice, bats... aren't going to have
| major problems replicating in their new host - a host they only
| gained access to because some nation-state actors working in
| collaboration on such gain-of-function research in at least two
| labs on opposite sides of the world with funds and material
| provided by the two largest economic powers for reasons that
| are still rather opaque, though suspiciously banal...
|
| Now while you don't _need_ something like AlphaFold3 to do
| recklessly stupid things (you could use directed evolution,
| making millions of mutated proteins, throwing them at a wall of
| human cell receptors and collecting what stuck), it makes it
| far easier. Thus Google doesn't want to be seen as enabling,
| though given their predilection for classified military-
| industrial contracting to a variety of nation-states,
| particularly with AI, with revenue now far more important than
| silly "don't be evil" statements, they might bear watching.
|
| On the positive side, AlphaFold3 will be great for fields like
| small-molecule biocatalysis, i.e. industrial applications in
| which protein enzymes (or more robust heterogeneous catalysts
| designed based on protein structures) convert N2 to ammonia,
| methane to methanol, or selectively bind CO2 for carbon
| capture, modification of simple sugars and amino acids, etc.
| niemandhier wrote:
| The logical consequence is to put all scientific publications
| under a license that restricts the right to train commercial ai
| models on them.
|
| Science advances because of an open exchange of ideas, the
| original idea of patents was to grant the inventor exclusive
| use in exchange for disclosure of knowledge.
|
| Those who did not patent, had to accept that their inventions
| would be studied and reverse engineered.
|
| The "as a service" model breaks that approach.
| dwroberts wrote:
| This turns it into a tool that deserves to be dethroned by
| another group, frankly. What a strange choice.
| renonce wrote:
| > What is different about the new AlphaFold3 model compared to
| AlphaFold2?
|
| > AlphaFold3 can predict many biomolecules in addition to
| proteins. AlphaFold2 predicts structures of proteins and protein-
| protein complexes. AlphaFold3 can generate predictions containing
| proteins, DNA, RNA, ions, ligands, and chemical modifications. The
| new model also improves the protein complex modelling accuracy.
| Please refer to our paper for more information on performance
| improvements.
|
| AlphaFold 2 generally produces looping "ribbon-like" predictions
| for disordered regions. AlphaFold3 also does this, but will
| occasionally output segments with secondary structure within
| disordered regions instead, mostly spurious alpha helices with
| very low confidence (pLDDT) and inconsistent position across
| predictions.
|
| So the criticism towards AlphaFold 2 will likely still apply? For
| example, it's more accurate for predicting structures similar to
| existing ones, and fails at novel patterns?
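The pLDDT scores mentioned in the comment above are AlphaFold's per-residue confidence estimates on a 0-100 scale; scores below roughly 50 are conventionally treated as "very low" confidence and often coincide with disordered regions. A minimal sketch of how one might mask out such residues when post-processing a prediction (the scores here are invented for illustration, not real model output):

```python
# Hypothetical per-residue pLDDT scores for a short chain.
plddt = [92.1, 88.4, 71.0, 45.2, 38.9, 41.7, 76.3, 90.0]

def confident_residues(scores, threshold=50.0):
    """Indices of residues at or above the pLDDT threshold.
    Residues below ~50 are 'very low' confidence and often
    correspond to disordered regions."""
    return [i for i, s in enumerate(scores) if s >= threshold]

print(confident_residues(plddt))  # → [0, 1, 2, 6, 7]
```

Filtering like this is a common first step before interpreting spurious low-confidence helices such as those described above.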
| dekhn wrote:
| I am not aware of anybody currently criticizing AF2's
| abilities outside of its training set. In fact, in the most
| recent papers (written by crystallographers), they are mostly
| arguing about atomic-level details of side chains at this
| point.
| COGlory wrote:
| >So the criticism towards AlphaFold 2 will likely still apply?
| For example, it's more accurate for predicting structures
| similar to existing ones, and fails at novel patterns?
|
| Yes, and there is simply no way to bridge that gap with this
| technique. We can make it better and better at pattern
| matching, but it is not going to predict novel folds.
| dekhn wrote:
| AlphaFold has been shown to accurately predict some novel
| folds. The technique doesn't entirely depend on whole-domain
| homology.
| rolph wrote:
| The problem is that biomolecules are "chaperoned" to fold
| properly; only specific regions, such as alpha helices or
| beta pleated sheets, will fold de novo.
|
| Chaperone (protein)
|
| https://en.wikipedia.org/wiki/Chaperone_(protein)
| tea-coffee wrote:
| This is a basic question, but how is the accuracy of the
| predicted biomolecular interactions measured? Are the predicted
| interactions compared to known interactions? How would the
| accuracy of predicting unknown interactions be assessed?
| joshuamcginnis wrote:
| Accuracy can be assessed in two main ways: computationally and
| experimentally. Computationally, they would compare the
| predicted structures and interactions with known data from
| databases like the PDB (Protein Data Bank). Experimentally,
| they can use tools like X-ray crystallography and NMR (nuclear
| magnetic resonance) to obtain the actual molecular structure
| and compare it to the predicted result. The outcomes of each
| approach would be fed back into the model for refining future
| predictions.
|
| https://www.rcsb.org/
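One common way to quantify the comparison described above is the RMSD between predicted and experimentally determined atomic coordinates. A minimal sketch, assuming the two structures are already superposed (real pipelines first align them, e.g. with the Kabsch algorithm, and often report lDDT or GDT scores instead); the coordinates below are invented toy data:

```python
import math

def rmsd(pred, exp):
    """Root-mean-square deviation between two pre-aligned
    coordinate sets (lists of (x, y, z) tuples, in angstroms)."""
    assert len(pred) == len(exp) and pred, "need equal, non-empty sets"
    sq = sum((px - ex) ** 2 + (py - ey) ** 2 + (pz - ez) ** 2
             for (px, py, pz), (ex, ey, ez) in zip(pred, exp))
    return math.sqrt(sq / len(pred))

# Toy example: a "prediction" displaced by 1 A along x.
experimental = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
predicted    = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(rmsd(predicted, experimental))  # → 1.0
```

A lower RMSD means the predicted structure sits closer to the experimental one; CASP-style evaluations use more robust, superposition-aware metrics, but the idea is the same.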
| dekhn wrote:
| AlphaFold very explicitly (unless something has changed)
| removes NMR structures as references because they are not
| accurate enough. I have a PhD in NMR biomolecular structure
| and I wouldn't trust the structures for anything.
| JackFr wrote:
| Sorry, I don't mean to be dense - do you mean you don't
| trust AlphaFold's structures or NMR's?
| dekhn wrote:
| I don't trust NMR structures in nearly all cases. The
| reasons are complex enough that I don't think it's
| worthwhile to discuss on Hacker News.
| fikama wrote:
| Hmm, I would say it's always worth sharing knowledge.
| Could you paste some links, or maybe a few keywords, for
| anyone willing to research the topic further on their own?
| dekhn wrote:
| Read this, and recursively (breadth-first) read all its
| transitive references: https://www.sciencedirect.com/scie
| nce/article/pii/S096921262...
| fabian2k wrote:
| Looking at the supplementary material (section 2.5.4) for
| the AlphaFold 3 paper it reads to me like they still use
| NMR structures for training, but not for evaluating
| performance of the model.
| dekhn wrote:
| I think it's implicit in their description of filtering
| the training set, where they say they only include
| structures with resolution of 9A or less. NMR structures
| don't really have a resolution, that's more specific to
| crystallography. However, I can't actually verify that no
| NMR structures were included without directly inspecting
| their list of selected structures.
| fabian2k wrote:
| I think it is very plausible that they don't use NMR
| structures here, but I was looking for a specific
| statement on it in the paper. I think your guess is
| plausible, but I don't think the paper is clear enough
| here to be sure about this interpretation.
| dekhn wrote:
| Yes, thanks for calling that out. In verifying my
| statement I actually was confused because you can see
| they filter NMR out of the eval set (saying so
| explicitly) but don't say that in the test set section
| (IMHO they should be required to publish the actual
| selection script so we can inspect the results).
| fabian2k wrote:
| Hmm, in the earlier AlphaFold 2 paper they state:
|
| > Input mmCIFs are restricted to have resolution less
| than 9 A. This is not a very restrictive filter and only
| removes around 0.2% of structures
|
| NMR structures are more than 0.2%, so that doesn't fit the
| assumption that they implicitly remove NMR structures
| here. But if I filter by resolution on the PDB homepage,
| it does remove essentially all NMR structures. I'm really
| not sure what to think here; the description seems too
| vague to know what they did exactly.
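The filtering question debated above can be made concrete: a resolution cutoff expressed as a numeric comparison silently drops any entry that reports no resolution at all, which is how NMR models (which have no crystallographic resolution) could disappear without being named. A toy sketch with hypothetical records (not the real mmCIF schema):

```python
# Toy PDB-style records; the fields are hypothetical, chosen
# only to illustrate the filtering logic discussed above.
entries = [
    {"id": "1ABC", "method": "X-RAY DIFFRACTION",   "resolution": 2.1},
    {"id": "2DEF", "method": "ELECTRON MICROSCOPY", "resolution": 3.8},
    {"id": "3GHI", "method": "SOLUTION NMR",        "resolution": None},
    {"id": "4JKL", "method": "X-RAY DIFFRACTION",   "resolution": 9.5},
]

def passes_resolution_filter(entry, cutoff=9.0):
    """Keep entries whose reported resolution is under the cutoff.
    Entries without a resolution (e.g. NMR models) fail the test,
    which is how such a filter would drop them implicitly."""
    res = entry["resolution"]
    return res is not None and res < cutoff

kept = [e["id"] for e in entries if passes_resolution_filter(e)]
print(kept)  # → ['1ABC', '2DEF']
```

Whether AlphaFold's actual selection script behaves this way is exactly what the commenters above cannot verify without seeing the code.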
| panabee wrote:
| interesting observation and experience. must have made
| thesis development complex, assuming the realization dawned
| on you during the phd.
|
| what do you trust more than NMR?
|
| AF's dependence on MSAs also seems sub-optimal; curious to
| hear your thoughts?
|
| that said, it's understandable why they used MSAs, even if
| it seems to hint at winning CASP more than developing a
| generalizable model.
|
| arguably, MSA-dependence is the wise choice for early
| prediction models as demonstrated by widespread accolades
| and adoption, i.e., it's an MVP with known limitations as
| they build toward sophisticated approaches.
| dekhn wrote:
| My realizations happened after my PhD. When I was writing
| my PhD I still believed we would solve the protein
| folding and structure prediction problems using classical
| empirical force fields.
|
| It wasn't until I started my postdocs, where I started
| learning about protein evolutionary relationships (and
| competing in CASP), that I changed my mind. I wouldn't
| say it so much as "multiple sequence alignments"; those
| are just tools to express protein relationships in a
| structured way.
|
| If Alphafold now, or in the future, requires no
| evolutionary relationships based on sequence (uniprot)
| and can work entirely by training on just the proteins in
| PDB (many of which _are_ evolutionarily related) and still
| be able to predict novel folds, it will be very
| interesting times. The one thing I have learned is that
| evolutionary knowledge makes many hard problems really
| easy, because you're taking advantage of billions of
| years of nature and an easy readout.
| heyoni wrote:
| Nice to see you on this thread as well! :)
| dopylitty wrote:
| This reminds me of Google's claim that another "AI" discovered
| millions of new materials. The results turned out to be a lot of
| useless noise, but that was only apparent after actual experts
| spent hundreds of hours reviewing the results[0]
|
| 0: https://www.404media.co/google-says-it-discovered-
| millions-o...
| dekhn wrote:
| The alphafold work has been used across the industry
| (successfully, in the sense of blind prediction), and has been
| replicated independently. The work on alphafold will likely net
| Demis and John a Nobel prize in the next few years.
|
| (that said, one should always inspect Google publications with
| a fine-toothed comb and lots of skepticism, as they have a
| tendency to juice the results)
| 11101010001100 wrote:
| Depending on your expected value of quantum computing, the
| Nobel committee shouldn't wait too long.
| dekhn wrote:
| Personally I don't expect QC to be a competitor to ML in
| protein structure prediction for the foreseeable future.
| After spending more money on molecular dynamics than
| probably any other human being, I'm really skeptical that
| physical models of protein structures will compete with ML-
| based approaches (that exploit homology and other protein
| sequence similarities).
| nybsjytm wrote:
| >The alphafold work has been used across the industry
| (successfully, in the sense of blind prediction), and has
| been replicated independently.
|
| This is clearly an overstatement, or at least very
| incomplete. See for instance
| https://www.nature.com/articles/s41592-023-02087-4:
|
| "In many cases, AlphaFold predictions matched experimental
| maps remarkably closely. In other cases, even very high-
| confidence predictions differed from experimental maps on a
| global scale through distortion and domain orientation, and
| on a local scale in backbone and side-chain conformation. We
| suggest considering AlphaFold predictions as exceptionally
| useful hypotheses."
| dekhn wrote:
| Yep, I know Paul Adams (used to work with him at Berkeley
| Lab) and that's exactly the paper he'd publish. If you read
| that paper carefully (as we all have, since it's the
| strongest we've seen from the crystallography community so
| far) they're basically saying the results from AF are
| absolutely excellent, and fit for purpose.
|
| (put another way: if Paul publishes a paper saying your
| structure predictions have issues, and mostly finds tiny
| local issues and some distortion and domain orientation,
| rather than absolutely incorrect fold prediction, it means
| your technique works really well, and people are just
| quibbling about details.)
| nybsjytm wrote:
| I don't know Paul Adams, so it's hard for me to know how
| to interpret your post. Is there anything else I can read
| that discusses the accuracy of AlphaFold?
| dekhn wrote:
| Yes, https://predictioncenter.org/casp15/ https://www.sci
| encedirect.com/science/article/pii/S0959440X2...
| https://dasher.wustl.edu/bio5357/readings/oxford-
| alphafold2....
|
| I can't find the link at the moment but from the
| perspective of the CASP leaders, AF2 was accurate enough
| that it's hard to even compare to the best structures
| determined experimentally, due to noise in the
| data/inadequacy of the metric.
|
| A number of crystallographers have also reported that the
| predictions helped them find errors in their own crystal-
| determined structures.
|
| If you're not really familiar enough with the field to
| understand the papers above, I recommend spending more
| time learning about the protein structure prediction
| problem, and how it relates to the experimental
| determination of structure using crystallography.
| nybsjytm wrote:
| Thanks, those look helpful. Whenever I meet someone with
| relevant PhDs I ask their thoughts on AlphaFold, and I've
| gotten a wide variety of responses, from responses like
| yours to people who acknowledge its usefulness but are
| rather dismissive about its ultimate contribution.
| dekhn wrote:
| The people who are most likely to deprecate AlphaFold are
| the ones whose job viability is directly affected by its
| existence.
|
| Let me be clear: DM only "solved" (and really didn't
| "solve") a subset of a much larger problem: creating a
| highly accurate model of the process by which real
| proteins adopt their folded conformations, or how some
| proteins don't adopt folded conformations without
| assistance, or how some proteins don't adopt a fully
| rigid conformation, or how some proteins can adopt
| different shapes in different conditions, or how enzymes
| achieve their catalyst abilities, or how structural
| proteins produce such rigid structures, or how to predict
| whether a specific drug is going to get FDA approval and
| then make billions of dollars.
|
| In a sense we got really lucky because CASP has been
| running so long and with so many contributors that it
| became recognized that winning at CASP meant "solving
| protein structure prediction to the limits of our ability
| to evaluate predictions", and that Demis and his
| associates had such a huge drive to win competitions that
| they invested tremendous resources and state of the art
| technology, while sharing enough information that the
| community could reproduce the results in their own hands.
| Any problem we want solved, we should gamify, so that
| DeepMind is motivated to win the game.
| panabee wrote:
| this is very astute, not only about deepmind but about
| science and humanity overall.
|
| what CASP did was narrowly scope a hard problem, provide
| clear rules and metrics for evaluating participants, and
| offer a regular forum in which candidates can showcase
| skills -- they created a "game" or competition.
|
| in doing so, they advanced the state of knowledge
| regarding protein structure.
|
| how can we apply this to cancer and deepen our
| understanding?
|
| specifically, what parts of cancer can we narrowly scope
| that are still broadly applicable to a complex,
| heterogeneous disease and evaluate with objective metrics?
|
| [edited to stress the goal of advancing cancer knowledge,
| not to "gamify" cancer science but to create structures
| that invite more ways to increase our understanding of
| cancer.]
| natechols wrote:
| I also worked with the same people (and share most of the
| same biases) and that paper is about as close to a
| ringing endorsement of AlphaFold as you'll get.
| Laaas wrote:
| > We have yet to find any strikingly novel compounds in the
| GNoME and Stable Structure listings, although we anticipate
| that there must be some among the 384,870 compositions. We also
| note that, while many of the new compositions are trivial
| adaptations of known materials, the computational approach
| delivers credible overall compositions, which gives us
| confidence that the underlying approach is sound.
|
| Doesn't seem outright useless.
| _xerces_ wrote:
| A video summary of why this research is important:
| https://youtu.be/Mz7Qp73lj9o?si=29vjdQtTtIOk_0CV
| ProllyInfamous wrote:
| Thanks for this informative video summary. As a layperson
| with a BS in Chemistry, I found it quite helpful for
| understanding the main bullet points of this accomplishment.
| moconnor wrote:
| Stepping back, the high-order bit here is that an ML method is beating
| physically-based methods for _accurately_ predicting the world.
|
| What happens when the best methods for computational fluid
| dynamics, molecular dynamics, nuclear physics are all
| uninterpretable ML models? Does this decouple progress from our
| current understanding of the scientific process - moving to
| better and better models of the world _without_ human-
| interpretable theories and mathematical models / explanations?
| Is that even iteratively sustainable in the way that scientific
| progress has proven to be?
|
| Interesting times ahead.
| cgearhart wrote:
| This is a neat observation. Slightly terrifying, but still
| interesting. Seems like there will also be cases where we
| discover new theories through the uninterpretable models--much
| easier and faster to experiment endlessly with a computer.
| fnikacevic wrote:
| I can only hope the models will be sophisticated enough and
| willing to explain their reasoning to us.
| thomasahle wrote:
| > Stepping back, the high-order bit here is an ML method is
| beating physically-based methods for accurately predicting the
| world.
|
| I mean, it's just faster, no? I don't think anyone is claiming
| it's a more _accurate_ model of the universe.
| Jerrrry wrote:
| Collision libraries and fluid libraries have had baked-in
| memorized look-up tables that were generated with ML methods
| nearly a decade ago.
|
| World is still here, although the Matrix/metaverse is
| becoming more attractive daily.
| xanderlewis wrote:
| It depends whether the value of science is human understanding
| or pure prediction. In some realms (for drug discovery, and
| other situations where we just need _an answer_ and know what
| works and what doesn't), pure prediction is all we really need.
| But if we could build an uninterpretable machine learning model
| that beats any hand-built traditional 'physics' model, would it
| really be physics?
|
| Maybe there'll be an intermediate era for a while where ML
| models outperform traditional analytical science, but then
| eventually we'll still be able to find the (hopefully limited
| in number) principles from which it can all be derived. I don't
| think we'll ever find that Occam's razor is no use to us.
| failTide wrote:
| > But if we could build an uninterpretable machine learning
| model that beats any hand-built traditional 'physics' model,
| would it really be physics?
|
| At that point I wonder if it would be possible to feed that
| uninterpretable model back into another model that makes
| sense of it all and outputs sets of equations that humans
| could understand.
| gmarx wrote:
| The success of these ML models has me wondering if this is
| what Quantum Mechanics is. QM is notoriously difficult to
| interpret yet makes amazing predictions. Maybe wave functions
| are just really good at predicting system behavior but don't
| reflect the underlying way things work.
|
| OTOH, Newtonian mechanics is great at predicting things under
| certain circumstances yet, in the same way, doesn't
| necessarily reflect the underlying mechanism of the system.
|
| So maybe philosophers will eventually tell us the distinction
| we are trying to draw, although intuitive, isn't real
| kolinko wrote:
| That's what thermodynamics is - we initially only had laws
| about energy/heat flow, and only later did we figure out how
| statistical particle movements cause these effects.
| RandomLensman wrote:
| Pure prediction is only all we need if the total end-to-end
| process is predicted correctly - otherwise there could be
| pretty nasty traps (e.g., drug works perfectly for the target
| disease but does something unexpected elsewhere etc.).
| gus_massa wrote:
| > _e.g., drug works perfectly for the target disease but
| does something unexpected elsewhere etc._
|
| That's very common. It's the reason to test the new drug in
| a petri dish, then rats, then dogs, then humans, and if all
| tests pass, send it to the pharmacy.
| ozten wrote:
| Science has always given us better but error-prone tooling to
| see further and make better guesses. There is still a
| scientific test: in a clinical trial, is this new drug safe
| and effective?
| nexuist wrote:
| As a steelman, wouldn't the abundance of infinitely
| generate-able situations make it _easier_ for us to develop strong
| theories and models? The bottleneck has always been data. You
| have to do expensive work in the real world and accurately
| measure it before you can start fitting lines to it. If we were
| to birth an e.g. atomically accurate ML model of quantum
| physics, I bet it wouldn't take long until we have mathematical
| theories that explain why it works. Our current problem is that
| this stuff is super hard to manipulate and measure.
| moconnor wrote:
| Maybe; AI chess engines have improved human understanding of
| the game very rapidly, even though humans cannot beat
| engines.
| alfalfasprout wrote:
| This is an important aspect that's being ignored IMO.
|
| For a lot of problems, you currently don't have an
| analytical solution, and the alternative is a brute-force-ish
| numerical approach. As a result the computational cost of
| simulating things enough times to be able to detect behavior
| that can inform theories/models (potentially yielding a good
| analytical result) is not viable.
|
| In this regard, ML models are promising.
| CapeTheory wrote:
| Many of our existing physical models can be decomposed into
| "high-confidence, well tested bit" plus "hand-wavy empirically
| fitted bit". I'd like to see progress via ML replacing the
| empirical part - the real scientific advancement then becomes
| steadily reducing that contribution to the whole by improving
| the robust physical model incrementally. Computational
| performance is another big influence though. Replacing the
| whole of a simulation with an ML model might still make sense
| if the model training is transferable and we can take
| advantage of the GPU speed-ups, which might not be so easy to
| apply to the foundational physical model solution. Whether your
| model needs to be verified against real physical models depends
| on the seriousness of your use-case; for nuclear weapons and
| aerospace weather forecasts I imagine it will remain essential,
| while for a lot of consumer-facing things the ML will be good
| enough.
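| That "robust physics plus fitted empirical bit" decomposition can be
| sketched in a few lines. The free-fall "physics" and the quadratic
| bias standing in for the hand-wavy part are both invented here purely
| for illustration:

```python
import numpy as np

# "High-confidence, well tested bit": ideal free fall, g known.
def physics(t):
    return 0.5 * 9.81 * t**2

# Synthetic observations with an unmodeled drag-like bias.
t = np.linspace(0.0, 2.0, 50)
observed = physics(t) - 0.3 * t**2

# "Hand-wavy empirically fitted bit": fit only the residual,
# leaving the physical model itself untouched.
residual = observed - physics(t)
correction = np.poly1d(np.polyfit(t, residual, deg=2))

def hybrid(t):
    return physics(t) + correction(t)

# The hybrid tracks the data; the fitted part stays small and separate.
assert np.max(np.abs(hybrid(t) - observed)) < 1e-8
```

| Shrinking the fitted correction toward zero as the physical model
| improves then becomes a measurable notion of progress.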
| jononor wrote:
| Physics-informed machine learning is a whole (nascent)
| subfield that is very much in line with this thinking. Steve
| Brunton has some good stuff about this on YouTube.
| jncfhnb wrote:
| These processes are both beyond human comprehension because
| they contain vast layers of tiny interactions and also not
| practical to simulate. This tech will allow exploration of
| accurate simulations to better understand new ideas if needed.
| tomrod wrote:
| A few things:
|
| 1. Research can then focus on where things go wrong
|
| 2. ML models, despite being "black boxes," can still have
| brute-force assessments performed over the parameter space,
| in areas both covered and uncovered by the input data
|
| 3. We tend to assume parsimony (i.e., Occam's razor) to give
| preference to simpler models when all else is equal. More
| complex black-box models that excel at prediction let us know
| the actual causal pathway may be more complex than simple
| models allow. This is okay too. We'll get it figured out. Not
| everything is closed-form, especially considering quantum
| effects may cause statistical/expected outcomes instead of
| deterministic outcomes.
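| Point 2 can be sketched crudely: even treating the model as opaque,
| a brute-force sweep of the input space against the training data
| shows where it interpolates and where it extrapolates. All numbers
| and the 1-D setup below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training inputs: the model only ever saw x in [0, 1].
train_x = rng.uniform(0.0, 1.0, size=500)

# Brute-force sweep over a wider parameter space than the training data.
grid = np.linspace(-1.0, 2.0, 301)

# Coverage proxy: distance from each grid point to the nearest
# training point; large distances mean the model is extrapolating.
dist = np.min(np.abs(grid[:, None] - train_x[None, :]), axis=1)
uncovered = grid[dist > 0.05]

# Everything well outside [0, 1] gets flagged as uncovered territory.
assert uncovered.min() <= -0.5 and uncovered.max() >= 1.5
```

| Real input spaces are high-dimensional, so the sweep becomes sampling
| rather than a grid, but the idea is the same.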
| kylebenzle wrote:
| That is not a real concern, just a confusion about how
| statistics works :(
| timschmidt wrote:
| There will be an iterative process built around curated
| training datasets - continually improved, top tier models,
| teams reverse engineering the model's understanding and
| reasoning, and applying that to improve datasets and training.
| adw wrote:
| > What happens when the best methods for computational fluid
| dynamics, molecular dynamics, nuclear physics are all
| uninterpretable ML models?
|
| A better analogy is "weather forecasting".
| jeffreyrogers wrote:
| I asked a friend of mine who is chemistry professor at a large
| research university something along these lines a while ago. He
| said that so far these models don't work well in regions where
| either theory or data is scarce, which is where most progress
| happens. So he felt that until they can start making progress
| in those areas it won't change things much.
| mensetmanusman wrote:
| Major breakthroughs happen when clear connections can be made
| and engineered between the many bits of solved but obscured
| solutions.
| bbor wrote:
| This is exactly how the physicists felt at the dawn of quantum
| physics - the loss of meaningful human inquiry to blindly
| effective statistics. Sobering stuff...
|
| Personally, I'm convinced that human reason is less pure than
| we think it to be, and that the move to large mathematical
| models might just be formalizing a lack-of-control that was
| always there. But that's less of a philosophy of science
| discussion and more of a cognitive science one
| krzat wrote:
| We will get better at understanding black boxes. If a model
| can be compressed into a simple math formula, then it's both
| easier to understand and cheaper to compute.
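| A toy sketch of that compression idea (the black-box function here is
| invented for illustration): sample an opaque predictor densely, then
| fit a small closed-form formula to it by least squares.

```python
import numpy as np

# Hypothetical black box: stands in for an opaque ML predictor.
def black_box(x):
    return np.sin(x) + 0.5 * x

# Sample the black box densely over the region of interest...
xs = np.linspace(-2.0, 2.0, 200)
ys = black_box(xs)

# ...and compress it into a cubic formula by least squares.
coeffs = np.polyfit(xs, ys, deg=3)
formula = np.poly1d(coeffs)

# The formula is human-readable and cheap to evaluate, at the
# cost of a small approximation error over the sampled region.
max_err = np.max(np.abs(formula(xs) - ys))
assert max_err < 0.1
```

| Whether a real model admits such a compression is exactly the open
| question; symbolic-regression tools attack the same problem at scale.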
| ldoughty wrote:
| My argument is: weather.
|
| I think it is fine & better for society to have applications
| and models for things we don't fully understand... We can model
| lots of small aspects of weather, and we have a lot of factors
| nailed down, but not necessarily all the interactions... and not
| all of the factors. (Additional example for the same reason:
| Gravity)
|
| Used responsibly, of course. I wouldn't think an AI model
| designing an airplane that no engineer understands
| is a good idea :-)
|
| And presumably all of this is followed by people trying to
| understand the results (expanding potential research areas)
| GaggiX wrote:
| It would be cool to see an airplane made using generative
| design.
| tech_buddha wrote:
| How about spaceship parts ?
| https://www.nasa.gov/technology/goddard-tech/nasa-turns-
| to-a...
| t14n wrote:
| A new-ish field of "mechanistic interpretability" is trying to
| poke at weights and activations and find human-interpretable
| ideas w/in them. Making lots of progress lately, and there are
| some folks trying to apply ideas from the field to Alphafold 2.
| There are hopes of learning the ideas about biology/molecular
| interactions that the model has "discovered".
|
| Perhaps we're in an early stage of Ted Chiang's story "The
| Evolution of Human Science", where AIs have largely taken over
| scientific research and a field of "meta-science" developed
| where humans translate AI research into more human-
| interpretable artifacts.
| philip1209 wrote:
| It makes me think about how Einstein was famous for making
| falsifiable real-world predictions to accompany his theoretical
| work. And, sometimes it took years for proper experiments to be
| run (such as measuring a solar eclipse during the breakout of a
| world war).
|
| Perhaps the opportunity here is to provide a quicker feedback
| loop for theory about predictions in the real world. Almost
| like unit tests.
| HanClinto wrote:
| > Perhaps the opportunity here is to provide a quicker
| feedback loop for theory about predictions in the real world.
| Almost like unit tests.
|
| Or jumping the gap entirely to move towards more self-driven
| reinforcement learning.
|
| Could one structure the training setup to be able to design
| its own experiments, make predictions, collect data, compare
| results, and adjust weights...? If that loop could be closed,
| then it feels like that would be a very powerful jump indeed.
|
| In the area of LLMs, the SPAG paper from last week was very
| interesting on this topic, and I'm very interested in seeing
| how this can be expanded to other areas:
|
| https://github.com/Linear95/SPAG
| goggy_googy wrote:
| Agreed. At the very least, models of this nature let us
| iterate/filter our theories a little bit more quickly.
| jprete wrote:
| The model isn't reality. A theory that disagrees with the
| model but agrees with reality shouldn't be filtered, but in
| this process it will be.
| mnky9800n wrote:
| I believe it simply tells us that our understanding of
| mechanical systems, especially chaotic ones, is not as well
| defined as we thought.
|
| https://journals.aps.org/prresearch/abstract/10.1103/PhysRev...
| thelastparadise wrote:
| The ML models will help us understand that :)
| jes5199 wrote:
| every time the two systems disagree, it's an opportunity to
| learn something. both kinds of models can be improved with new
| information, done through real-world experiments
| ogogmad wrote:
| Some machine learning models might be more interpretable than
| others. I think the recent "KAN" model might be a step forward.
| dekhn wrote:
| If you're a scientist who works in protein folding (or one of
| those other areas) and strongly believe that science's goal is
| to produce falsifiable hypotheses, these new approaches will be
| extremely depressing, especially if you aren't proficient
| enough with ML to reproduce this work in your own hands.
|
| If you're a scientist who accepts that probabilist models beat
| interpretable ones (articulated well here:
| https://norvig.com/chomsky.html), then you'll be quite happy
| because this is yet another validation of the value of
| statistical approaches in moving our ability to predict the
| universe forward.
|
| If you're the sort of person who believes that human brains are
| capable of understanding the "why" of how things work in all
| its true detail, you'll find this an interesting challenge- can
| we actually interpret these models, or are human brains too
| feeble to understand complex systems without sophisticated
| models?
|
| If you're the sort of person who likes simple models with as
| few parameters as possible, you're probably excited because
| developing more comprehensible or interpretable models that
| have equivalent predictive ability is a very attractive
| research subject.
|
| (FWIW, I'm in the camp of "we should simultaneously seek
| simpler, more interpretable models, while also seeking to
| improve native human intelligence using computational
| augmentation")
| narrator wrote:
| What if our understanding of the laws of the natural sciences
| is subtly flawed and AI just corrects perfectly for our
| flawed understanding without telling us what the error in our
| theory was?
|
| Forget trying to understand dark matter. Just use this model
| to correct for how the universe works. What is actually wrong
| with our current model and if dark matter exists or not or
| something else is causing things doesn't matter. "Shut up and
| calculate" becomes "Shut up and do inference."
| dekhn wrote:
| All models are wrong, but some models are useful.
| RandomLensman wrote:
| High accuracy could result from pretty incorrect models.
| When and where that would then go completely off the rails
| is difficult to say.
| visarga wrote:
| ML is accustomed to the idea that all models are bad, and
| there are ways to test how good or bad they are. It's all
| approximations and imperfect representations, but they can
| be good enough for some applications.
|
| If you think carefully, humans operate in the same regime.
| Our concepts are all like that - imperfect, approximate,
| glossing over some details. Our fundamental grounding and
| test is survival, an unforgiving filter, but lax enough to
| allow for anti-vaxxer movements during the pandemic -
| survival test is not testing for truth directly, only for
| ideas that fail to support life.
| mistermann wrote:
| Also lax enough for the _hilarious_ mismanagement of the
| situation by "the experts". At least anti-vaxxers have
| an excuse.
| croniev wrote:
| I'm in the following camp: It is wrong to think about the
| world or the models as "complex systems" that may or may not
| be understood by human intelligence. There is no meaning
| beyond that which is created by humans. There is no 'truth'
| that we can grasp in parts but not entirely. Being unable to
| understand these complex systems means that we have framed
| them in such a way (e.g., millions of matrix operations) that
| does not allow for our symbol-based, causal reasoning mode.
| That is on us, not our capabilities or the universe.
|
| All our theories are built on observation, so these empirical
| models yielding such useful results is a great thing - it
| satisfies the need for observing and acting. Missing
| explainability of the models merely means we have less
| ability to act more precisely - but it does not devalue our
| ability to act coarsely.
| visarga wrote:
| But the human brain has limited working memory and
| experience. Even in software development we are often
| teetering at the edge of the mental power to grasp and
| relate ideas. We have tried so much to manage complexity,
| but real world complexity doesn't care about human
| capabilities. So there might be high dimensional problems
| where we simply can't use our brains directly.
| jvanderbot wrote:
| A human mind is perfectly capable of following the same
| instructions as the computer did. Computers are stupidly
| simple and completely deterministic.
|
| The concern is about "holding it all in your head", and
| depending on your preferred level of abstraction, "all"
| can perfectly reasonably be held in your head. For
| example: "This program generates the most likely outputs"
| makes perfect sense to me, even if I don't understand
| some of the code. I understand the _system_. Programmers
| went through this decades ago. Physicists had to do it
| too. Now, chemists I suppose.
| GenerocUsername wrote:
| This is just wrong.
|
| While individual computer operations are computable by
| humans, the billions of rapid computations involved are
| unachievable by humans. In just a few seconds, a computer
| can perform more basic arithmetic operations than a human
| could in a lifetime.
| jvanderbot wrote:
| I'm not saying it's achievable, I'm saying it's not
| magic. A chemist who wishes to understand what the model
| is doing can get as far as anyone else, and can reach a
| level of "this prediction machine works well and I
| understand how to use and change it". Even if it requires
| another PhD in CS.
|
| That the tools became complex is not a reason to fret in
| science. No more than statistical physics or quantum
| mechanics or CNN for image processing - it's complex and
| opaque and hard to explain but perfectly reproducible.
| "It works better than my intuition" is a level of
| sophistication that most methods are probably doomed to
| achieve.
| ajuc wrote:
| Abstraction isn't the silver bullet. Not everything is
| abstractable.
|
| "This program generates the most likely outputs" isn't a
| scientific explanation, it's teleology.
| jvanderbot wrote:
| "this tool works better than my intuition" absolutely is
| science. "be quiet and calculate" is a well worn mantra
| in physics is it not?
| mistermann wrote:
| What is an example of something that isn't abstractable?
| slibhb wrote:
| > There is no 'truth' that we can grasp in parts but not
| entirely.
|
| If anyone actually thought this way -- no one does -- they
| definitely wouldn't build models like this.
| EventH- wrote:
| "There is no 'truth' that we can grasp in parts but not
| entirely."
|
| The value of pi is a simple counterexample.
| Invictus0 wrote:
| > There is no 'truth' that we can grasp in parts but not
| entirely
|
| It appears that your own comment is disproving this
| statement
| divbzero wrote:
| There have been times in the past when usable technology
| surpassed our scientific understanding, and instead of being
| depressing it provided a map for scientific exploration. For
| example, the steam engine was developed by engineers in the
| 1600s/1700s (Savery, Newcomen, and others) but thermodynamics
| wasn't developed by scientists until the 1800s (Carnot,
| Rankine, and others).
| jprete wrote:
| I think the various contributors to the invention of the
| steam engine had a good idea of what they were trying to do
| and how their idea would physically work. Wikipedia lists
| the prerequisites as the concepts of a vacuum and pressure,
| methods for creating a vacuum and generating steam, and the
| piston and cylinder.
| exe34 wrote:
| That's not too different from the alpha fold people
| knowing that there's a sequence to sequence translation,
| that an enormous amount of cross-talk happens between the
| parts of the molecule, that if you get the potential
| fields just right, it'll fold in the way nature intended.
| They're not just blindly fiddling with a bunch of levers.
| What they don't know is the individual detailed
| interactions going on and how to approximate them with
| analytical equations.
| interroboink wrote:
| > ... and strongly believe that science's goal is to produce
| falsifiable hypotheses, these new approaches will be
| extremely depressing
|
| I don't quite understand this point -- could you elaborate?
|
| My understanding is that the ML model produces a hypothesis,
| which can then be tested via normal scientific method
| (perform experiment, observe results).
|
| If we have a magic oracle that says "try this, it will work",
| and then we try it, and it works, we still got something
| falsifiable out of it.
|
| Or is your point that we won't necessarily have a
| coherent/elegant explanation for _why_ it works?
| dekhn wrote:
| People will be depressed because they spent decades getting
| into professorship positions and publishing papers with
| ostensibly comprehensible interpretations of the generative
| processes that produced their observations, only to be
| "beat" in the game by a system that processed a lot of
| observations and can make predictions in a way that no
| individual human could comprehend. And those professors
| will have a harder time publishing, and therefore getting
| promoted, in the future.
|
| Whether ML models produce hypotheses is something of an
| epistemological argument that I think muddies the waters
| without bringing any light. I would only use the term "ML
| models generate predictions". In a sense, the model itself
| is the hypothesis, not any individual prediction.
| variadix wrote:
| There is an issue scientifically. I think this point was
| expressed by Feynman: the goal of scientific theories isn't
| just to make better predictions, it's to inform us about
| how and why the world works. Many ancient civilizations
| could accurately predict the position of celestial bodies
| with calendars derived from observations of their period,
| but it wasn't until Copernicus proposed the heliocentric
| model and Galileo provided supporting observations that we
| understood the why and how, and that really matters for
| future progress and understanding.
| interroboink wrote:
| I agree the how/why is the main driving goal. That's
| kinda why I feel like this is _not_ depressing news --
| there's a new frontier to discover and attempt to
| explain. Scientists love that stuff (:
|
| Knowing how to predict the motion of planets but without
| having an underlying explanation encourages scientists to
| develop their theories. Now, once more, we know how to
| predict something (protein folding) but without an
| underlying explanation. Hurray, something to investigate!
|
| (Aside: I realize that there are also more human factors
| at play, and upsetting the status quo will always cause
| some grief. I just wanted to provide a counterpoint that
| there is some exciting progress represented here, too).
| variadix wrote:
| I was mainly responding to the claim that these black
| boxes produce a hypothesis that is useful as a basis for
| scientific theories. I don't think it does, because it
| offers no explanation as to the how and why, which is as
| we agree the primary goal. It doesn't provide a
| hypothesis per se, just a prediction, which is useful
| technologically and should indicate that there is more to
| be discovered (see my response to the sibling reply)
| scientifically but offers no motivating explanation.
| Invictus0 wrote:
| But we do know why, it's just not simple. The atoms
| interact with one another because of a variety of
| fundamental forces, but since there can be hundreds of
| thousands of atoms in a single protein, it's plainly
| beyond human comprehension to explain why it folds the
| way it does, one fundamental force interaction at a time.
| variadix wrote:
| Fair. I guess the interesting thing for protein folding
| research then is that there appears to be a way to
| approximate/simplify the calculations required to predict
| folding patterns that doesn't require the precision of
| existing folding models and software. In essence,
| AlphaFold is an existence proof that there should be a
| way to model protein folding more efficiently.
| coffeemug wrote:
| _> If you 're the sort of person who believes that human
| brains are capable of understanding the "why" of how things
| work in all its true detail_
|
| This seems to me an empirical question about the world. It's
| clear our minds are limited, and we understand complex
| phenomena through abstraction. So either we discover we can
| continue converting advanced models to simpler abstractions
| we can understand, or that's impossible. Either way, it's
| something we'll find out and will have to live with in the
| coming decades. If it turns out further abstractions aren't
| possible, well, enlightenment thought had lasted long enough.
| It's exciting to live at a time in humanity's history when we
| enter a totally uncharted new paradigm.
| jprete wrote:
| The goal of science has always been to discover underlying
| principles and not merely to predict the outcome of
| experiments. I don't see any way to classify an opaque ML
| model as a scientific artifact since by definition it can't
| reveal the underlying principles. Maybe one could claim the
| ML model itself is the scientist and everyone else is just
| feeding it data. I doubt human scientists would be
| comfortable with that, but if they aren't trying to explain
| anything, what are they even doing?
| fire_lake wrote:
| What if the underlying principles of the universe are too
| complex for human understanding but we can train a model
| that very closely follows them?
| dekhn wrote:
| Then we should dedicate large fractions of human
| engineering towards finding ethical ways to improve human
| intelligence so that we can appreciate the underlying
| principles better.
| refulgentis wrote:
| I spent about 30 minutes reading this thread and links
| from it: I don't really follow your line of argument. I
| find it fascinating and well-communicated, the lack of
| understanding is on me: my attention flits around like a
| butterfly, in a way that makes it hard for me to follow
| people writing original content.
|
| High level, I see a distinction between theory and
| practice, between an oracle predicting without
| explanation, and a well-thought out theory built on a
| partnership between theory and experiment over centuries,
| ex. gravity.
|
| I have this feeling I can't shake that the knife you're
| using is too sharp, both in the specific example we're
| discussing, and in general.
|
| In the specific example, folding, my understanding is we
| know how proteins fold & the mechanisms at work. It just
| takes an ungodly amount of time to compute and you'd
| still confirm with reality anyway. I might be completely
| wrong on that.
|
| Given that, the proposal to "dedicate...engineer[s]
| towards finding ethical ways to improve...intelligence so
| that we can appreciate the underlying principles better"
| presupposes that we're not already appreciating the
| underlying principles.
|
| It feels like a close cousin of physics
| theory/experimentalist debate pre-LHC, circa 2006: the
| experimentalists wanted more focus on building colliders
| or new experimental methods, and at the extremes, thought
| string theory was a complete waste of time.
|
| Which was working towards appreciating the underlying
| principles?
|
| I don't really know. I'm not sure there's a strong divide
| between the work of recording reality and explaining it.
| I'll peer into a microscope in the afternoon, and take a
| shower in the evening, and all of a sudden, free
| associating gives me a more high-minded explanation for
| what I saw.
|
| I'm not sure a distinction exists for protein folding,
| yes, I'm virtually certain this distinction does not
| exist in reality, only in extremely stilted examples
| (i.e. a very successful oracle at Delphi)
| mistermann wrote:
| There's a much easier route: consciousness is not
| included in the discussion...what a coincidence.
| Wilduck wrote:
| That sounds like useful engineering, but not useful
| science.
| mrbungie wrote:
| I think that a lot of scientific discoveries originate
| from initial observations made during engineering work or
| just out of curiosity without rigour.
|
| Not saying ML methods haven't shown important
| reproducibility challenges, but to just shut them down
| due to not being "useful science" is inflexible.
| dekhn wrote:
| That's the aspirational goal. And I would say that it's a
| bit of an inflexible one- for example, if we had an ML that
| could generate molecules that cure diseases that would pass
| FDA approval, I wouldn't really care if scientists couldn't
| explain the underlying principles. But I'm an ex-scientist
| who is now an engineer, because I care more about tools
| that produce useful predictions than understanding
| underlying principles. I used to think that in principle we
| could identify all the laws of the universe, and in theory,
| simulate them with enough accuracy, and inspect the
| results, and gain enlightenment, but over time, I've
| concluded that's just a way to waste lots of time, money,
| and resources.
| panarky wrote:
| It's not either-or, it's yes-and. We don't have to
| abandon one for the other.
|
| AlphaFold 3 can rapidly reduce a vast search space in a
| way physically-based methods alone cannot. This narrowly
| focused search space allows scientists to apply their
| rigorous, explainable, physical methods, which are slow
| and expensive, to a small set of promising alternatives.
| This accelerates drug discovery and uncovers insights
| that would otherwise be too costly or time-consuming.
|
| The future of science isn't about AI versus traditional
| methods, but about their intelligent integration.
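A minimal sketch of that funnel, with hypothetical `fast_score` and `slow_evaluate` standing in for the ML model and the rigorous physical method (all names and numbers here are invented):

```python
import heapq

def screen_then_verify(candidates, fast_score, slow_evaluate, k=10):
    """Rank all candidates with a cheap predictive model, then spend
    the expensive, rigorous method only on the top k."""
    top_k = heapq.nlargest(k, candidates, key=fast_score)
    return [(c, slow_evaluate(c)) for c in top_k]

# Stand-ins: score 1,000,000 'molecules' cheaply, then verify only
# 5 of them with the 'slow' physical method.
results = screen_then_verify(
    range(1_000_000),
    fast_score=lambda m: -abs(m - 424_242),   # cheap approximation
    slow_evaluate=lambda m: abs(m - 424_242), # rigorous ground truth
    k=5,
)
print(results)
```

The expensive method runs 5 times instead of a million; the cheap model only has to be good enough not to drop true positives out of the top k.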
| nextos wrote:
| Or you can treat AlphaFold as a black box / oracle and
| work at systems biology level, i.e. at pathway and
| cellular level. Protein structures and interactions are
| always going to be hard to predict with interpretable
| models, which I also prefer.
|
| My only worry is that AlphaFold and others, e.g. ESM,
| seem to be a bit fragile for out-of-distribution sequences.
| They are not doing a great job with unusual sequences, at
| least in my experience. But hopefully they will improve
| and provide better uncertainty measures.
| exe34 wrote:
| The ML model can also be an emulator of parts of the system
| that you don't want to personally understand, to help you
| get on with focusing on what you do want to figure out.
| Alternatively, the ML model can pretend to be the real
| world while you do experiments with it to figure out
| aspects of nature in minutes rather than hours-days of
| biological turnaround.
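One toy version of such an emulator: run the slow process once on a coarse grid, then interpolate, and do all further "experiments" on the interpolant. (`slow_experiment` is an invented stand-in for a real assay.)

```python
import bisect
import math

def slow_experiment(x):
    """Stand-in for a wet-lab measurement that takes days."""
    return math.sin(x) * math.exp(-0.1 * x)

# Run the slow process once on a coarse grid...
xs = [i * 0.5 for i in range(21)]          # 0.0 .. 10.0
ys = [slow_experiment(x) for x in xs]

def emulator(x):
    """Instant linear-interpolation surrogate of the slow process."""
    i = min(max(bisect.bisect_left(xs, x), 1), len(xs) - 1)
    x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# ...then probe the emulator freely, in microseconds instead of days.
print(emulator(3.3), slow_experiment(3.3))
```

A neural network plays the same role as the interpolation table here, just over a vastly higher-dimensional input space.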
| strogonoff wrote:
| Can underlying principles be discovered using the framework
| of scientific method? The primary goal of models and
| theories it develops is to support more experiments and
| eventually be disproven. If no model can be correct,
| complete and provable in finite time, then a theory about
| underlying principles that claims completeness would have
| to be unfalsifiable. This is reasonable in context of
| philosophy, but not in natural sciences.
|
| Scientific method can help us rule out what underlying
| principles are definitely _not_. Any such principles are
| not actually up to be "discovered".
|
| If probabilistic ML comes along and does a decent job at
| predicting things, we should keep in mind that those
| predictions are made not in context of absolute truth, but
| in context of theories and models we have previously
| developed. I.e., it's not just that it can predict how
| molecules interact, but that the entire concept of
| molecules is an artifact of just some model we (humans)
| came up with previously--a model which, per above, is
| probably incomplete/incorrect. (We could or should use this
| prediction to improve our model or come up with a better
| one, though.)
|
| Even if a future ML product could be creative enough to
| actually come up with and iterate on models all on its own
| from first principles, it would not be able to give us the
| answer to the question of underlying principles for the
| above-mentioned reasons. It could merely suggest us another
| incomplete/incorrect model; to believe otherwise would be
| to ascribe it qualities more fit for religion than science.
| jltsiren wrote:
| I don't find that argument convincing.
|
| People clearly have been able to discover many underlying
| principles using the scientific method. Then they have
| been able to explain and predict many complex phenomena
| using the discovered principles, and create even more
| complex phenomena based on that. Complex phenomena such
| as the technology we are using for this discussion.
|
| Words don't have any inherent meaning, just the meaning
| they gain from usage. The entire concept of truth is an
| artifact of just some model (language) we came up with
| previously--a model which, per above, is probably
| incomplete/incorrect. The kind of absolute truth you are
| talking about may make sense when discussing philosophy
| or religion. Then there is another idea of truth more
| appropriate for talking about the empirical world. Less
| absolute, less immutable, less certain, but more
| practical.
| ak_111 wrote:
| Discovering underlying principles and predicting outcomes
| are two sides of the same coin in that there is no way to
| confirm you have discovered underlying principles unless
| they have some predictive power.
|
| Some have tried to come up with other criteria to confirm
| you have discovered an underlying principle without
| predictive power, such as aesthetics - but this is seen
| by the majority of scientists as basically a cop out. See
| debate around string theory.
|
| Note that this comment is summarizing a massive debate in
| the philosophy of science.
| thfuran wrote:
| >there is no way to confirm you have discovered
| underlying principles unless they have some predictive
| power.
|
| Yes, but a perfect oracle has no explanatory power, only
| predictive.
| nkingsy wrote:
| increasing the volume of predictions produces patterns
| that often lead to underlying principles.
| mikeyouse wrote:
| And much of the 20th century was characterized by a very
| similar progression - we had no clue what the actual
| mechanism of action was for hundreds of life saving drugs
| until relatively recently, and we still only have best
| guesses for many.
|
| That doesn't diminish the value that patients received in
| any way even though it would be more satisfying to make
| predictions and design something to interact in a way
| that exactly matches your theory.
| chasd00 wrote:
| If all you can do is predict an outcome without being
| able to explain how then what have you really discovered?
| Asking someone to just believe you can predict outcomes
| without any reasoning as to how, even if you're always
| right, sounds like the concept of faith in religion.
| pas wrote:
| it's still an extremely valuable tool. just as we see in
| mathematics, closed forms (and short and elegant proofs)
| are much coveted luxury items.
|
| for many basic/fundamental mathematical objects we don't
| (yet) have simple mechanistic ways to compute them.
|
| so if a probabilistic model spits out something very
| useful, we can slap a nice label on it and call it a day.
| that's how engineering works anyway. and then hopefully
| someday someone will be able to derive that result from
| "first principles" .. maybe it'll be even more
| funky/crazy/interesting ... just like mathematics
| arguably became more exciting by the fact that someone
| noticed that many things are not provable/constructable
| without an explicit Axiom of Choice.
|
| https://en.wikipedia.org/wiki/Nonelementary_integral#Exam
| ple...
| thfuran wrote:
| >closed forms (and short and elegant proofs) are much
| coveted luxury items.
|
| Yes, but we're talking about roughly the opposite of a
| proof.
| jcims wrote:
| Isn't that basically true of most of the fundamental laws
| of physics? There's a lot we don't understand about
| gravity, space, time, energy, etc., and yet we compose
| our observations of how they behave into very useful
| tools.
| dumpsterdiver wrote:
| > what have you really discovered?
|
| You've discovered magic.
|
| When you read about a wizard using magic to lay waste to
| invading armies, how much value would you guess the
| armies place in whether or not the wizard truly
| understands the magic being used against them?
|
| Probably none. Because the fact that the wizard doesn't
| fully understand why magic works does not prevent the
| wizard from using it to hand invaders their asses.
| Science is very much the same - our own wizards used
| medicine that they did not understand to destroy invading
| hordes of bacteria.
| toxik wrote:
| Kepler famously compiled troves of data on the night sky,
| and just fitted some functions to them. He could not
| explain why but he could say what. Was he not a scientist?
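Kepler's third law can still be recovered exactly that way, by pure curve fitting with no mechanics involved. A small sketch using well-known orbital data:

```python
import math

# Semi-major axis (AU) and orbital period (years) for six planets.
planets = {
    "Mercury": (0.387, 0.241), "Venus":   (0.723, 0.615),
    "Earth":   (1.000, 1.000), "Mars":    (1.524, 1.881),
    "Jupiter": (5.203, 11.862), "Saturn": (9.537, 29.457),
}

# Least-squares slope of log(T) against log(a): a straight line in
# log-log space means a power law T = a^slope.
logs = [(math.log(a), math.log(t)) for a, t in planets.values()]
n = len(logs)
mx = sum(x for x, _ in logs) / n
my = sum(y for _, y in logs) / n
slope = (sum((x - mx) * (y - my) for x, y in logs)
         / sum((x - mx) ** 2 for x, _ in logs))
print(round(slope, 3))  # ~1.5, i.e. T^2 proportional to a^3
```

The fit says *what* (exponent 3/2) centuries before Newton supplied the *why*.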
| Invictus0 wrote:
| Maybe the science of the past was studying things of lesser
| complexity than the things we are studying now.
| SJC_Hacker wrote:
| What if it turns out that nature simply doesn't have nice,
| neat models that humans can comprehend for many observable
| phenomena?
| gradus_ad wrote:
| That ship sailed with Quantum physics. Nearly perfect at
| prediction, very poor at giving us a concrete understanding
| of what it all means.
|
| This has happened before. Newtonian mechanics was
| incomprehensible spooky action at a distance, but Einstein
| clarified gravity as the bending of spacetime.
| pishpash wrote:
| So the work to simplify ML models, reduce dimensions, etc.
| becomes the numeric way to seek simple actual scientific
| models. Scientific computing and science become one.
| ThomPete wrote:
| The goal of science should always be to seek good
| explanations hard to vary.
| RajT88 wrote:
| > can we actually interpret these models, or are human brains
| too feeble to understand complex systems without
| sophisticated models?
|
| I think we will have to develop a methodology and supporting
| toolset to be able to derive the underlying patterns driving
| such ML models. It's just too much for a human to comb
| through by themselves and make sense of.
| tobrien6 wrote:
| I suspect that ML will be state-of-the-art at generating human-
| interpretable theories as well. Just a matter of time.
| sdwr wrote:
| > Does this decouple progress from our current understanding of
| the scientific process?
|
| Thank God! As a person who uses my brain, I think I can say,
| pretty definitively, that people are bad at understanding
| things.
|
| If this actually pans out, it means we will have harnessed
| knowledge/truth as a fundamental force, like fire or
| electricity. The "black box" as a building block.
| tantalor wrote:
| This type of thing is called an "oracle".
|
| We've had stuff like this for a long time.
|
| Notable examples:
|
| - Temple priestesses
|
| - Tea-leaf reading
|
| - Water scrying
|
| - Palmistry
|
| - Clairvoyance
|
| - Feng shui
|
| - Astrology
|
| The only difference is, the ML model is really quite good at
| it.
| unsupp0rted wrote:
| > The only difference is, the ML model is really quite good
| at it.
|
| That's the crux of it: we've had theories of physics and
| chemistry since before writing was invented.
|
| None of that mattered until we came upon the ones that
| actually work.
| insane_dreamer wrote:
| For me the big question is how do we confidently validate the
| output of this/these model(s).
| topaz0 wrote:
| It's the right question to ask, and the answer is that we
| will still have to confirm them by experimental structure
| determination.
| tambourine_man wrote:
| Our metaphors and intuitions were crumbling already and
| stagnating. See quantum physics: sometimes a particle,
| sometimes a wave, and what constitutes a measurement anyway?
|
| I'll take prediction over understanding if that's the best our
| brains can do. We've evolved to deal with a few orders of
| magnitude around a meter and a second. Maybe dealing with
| light-years and femtometer/seconds is too much to ask.
| dyauspitr wrote:
| Whatever it is, if we needed to, we could follow each
| instruction through the black box. It's never going to be as
| opaque as something organic.
| wslh wrote:
| This is the topic of epistemology of the sciences in books such
| as "New Direction in the Philosophy of Mathematics" [1] and
| happened before with problems such as the four color theorem
| [2] where AI was not involved.
|
| Going back to the uninterpretable ML models in the context of
| AlphaFold 3, I think one method for trying to explain the
| findings is similar to the experimental methods of physics with
| reality: you perform experiments with the reality (in this case
| AlphaFold 3) to come up with sound conclusions. AI/ML is an
| interesting black-box system.
|
| There are other open discussions on this topic. For example,
| can our human brain absorb that knowledge, or is it limited
| somehow by the scientific language that we have now?
|
| [1]
| https://www.google.com.ar/books/edition/New_Directions_in_th...
|
| [2] https://en.wikipedia.org/wiki/Four_color_theorem
| torrefatto wrote:
| You are conflating the whole scientific endeavor with a very
| specific problem to which this specific approach is effective
| at producing results that fit with the observable world. This
| has nothing to do with science as a whole.
| scotty79 wrote:
| We should be thankful that we live in a universe that obeys
| math simple enough to comprehend that we were able to reach
| that level.
|
| Imagine if optics were complex enough that it would require
| an ML model to predict anything.
|
| We'd be in permanent stone age without a way out.
| lupire wrote:
| What would a universe look like that lacked simple things,
| and somehow only complex things existed?
|
| It makes me think of how some quadratic integer rings, such
| as Z[sqrt(-5)] (though not the Gaussian integers, which
| factor uniquely), have irreducibles that are not prime:
| some large things cannot be uniquely expressed as a
| combination of smaller things.
| mberning wrote:
| I would assume that given enough hints from AI and if it is
| deemed important enough humans will come in to figure out the
| "first principles" required to arrive at the conclusion.
| RobCat27 wrote:
| I believe this is the case also. With a well enough
| performing AI/ML/probabilistic model where you can change the
| model's input parameters and get a highly accurate prediction
| basically instantly, we can test theories approximately and
| extremely fast rather than running completely new
| experiments, which will always come with their own set of
| errors and problems.
| danielmarkbruce wrote:
| "better and better models of the world" does not always mean
| "more accurate" and never has.
|
| We already know how to model the vast majority of things, just
| not at a speed and cost which makes it worthwhile. There are
| dimensions of value - one is accuracy, another speed, another
| cost, and in different domains additional dimensions. There are
| all kinds of models used in different disciplines which are
| empirical and not completely understood. Reducing things to the
| lowest level of physics and building up models from there has
| never been the only approach. Biology, geology, weather,
| materials all have models which have hacks in them, known
| simplifications, statistical approximations, so the result can
| be calculated. It's just about choosing the best hacks to get
| the best trade off of time/money/accuracy.
| Gimpei wrote:
| Might be easier to come up with new models with analytic
| solutions if you have a probabilistic model at hand. A lot
| easier to evaluate against data and iterate. Also, I wouldn't
| be surprised if we develop better tools for introspecting these
| models over time.
| UniverseHacker wrote:
| It means we now have an accurate surrogate model or "digital
| twin" that can be experimented on almost instantaneously. So we
| can massively accelerate the traditional process of developing
| mechanistic understanding through experiment, while _also_
| immediately be able to benefit from the ability to make
| accurate predictions, even without needing understanding.
|
| In reality, science has already pretty much gone this way long
| ago, even if people don't like to admit it. Simple,
| reductionist explanations for complex phenomena in living
| systems don't really exist. Virtually all of medicine nowadays
| is empirical: try something, and if you can prove it's safe and
| effective, you keep doing it. We almost never have a meaningful
| explanation for how it really works, and when we think we do,
| it gets proven wrong repeatedly, while the treatment keeps
| working as always.
| mathgradthrow wrote:
| instead of "in mice", we'll be able to say "in the cloud"
| unsupp0rted wrote:
| In vivo in humans in the cloud
| dekhn wrote:
| one of the companies I worked for, "insitro", is
| specifically named that to mean the combination of "in
| vivo, in vitro, in silico".
| topaz0 wrote:
| "In nimbo" (though what people actually say is "in
| silico").
| d_silin wrote:
| "in silico"
| imchillyb wrote:
| Medicine can be explained fairly simply, and the why of how
| it works as it does is also explained by this:
|
| Imagine a very large room that has every surface covered by
| on-off switches.
|
| We cannot see inside of this room. We cannot see the
| switches. We cannot fit inside of this room, but a toddler
| fits through the tiny opening leading into the room. The
| toddler cannot reach the switches, so we equip the toddler
| with a pole that can flip the switches. We train the toddler,
| as much as possible, to flip a switch using the pole.
|
| Then, we send the toddler into the room and ask the toddler
| to flip the switch or switches we desire to be flipped, and
| then do tests on the wires coming out of the room to see if
| the switches were flipped correctly. We also devise some
| tests for other wires to see if that naughty toddler flipped
| other switches on or off.
|
| We cannot see inside the room. We cannot monitor the toddler.
| We can't know what _exactly_ the toddler did inside the room.
|
| That room is the human body. The toddler with a pole is a
| medication.
|
| We can't see or know enough to determine what was activated
| or deactivated. We can invent tests to narrow the scope of
| what was done, but the tests can never be 100% accurate
| because we can't test for every effect possible.
|
| We introduce chemicals then we hope-&-pray that the chemicals
| only turned on or off the things we wanted turned on or off.
| Craft some qualifications testing for proofs, and do a 'long-
| term' study to determine if there were other things turned on
| or off, or a short circuit occurred, or we broke something.
|
| I sincerely hope that even without human understanding, our
| AI models can determine what switches are present, which ones
| are on and off, and how best to go about selecting for the
| correct result.
|
| Right now, modern medicine is almost a complete crap-shoot.
| Hopefully modern AI utilities can remedy the gambling aspect
| of medicine discovery and use.
| tnias23 wrote:
| I wonder if ML can someday be employed in deciphering such
| black box problems; a second model that can look under the hood
| at all the number crunching performed by the predictive model,
| identify the pattern that resulted in a prediction, and present
| it in a way we can understand.
|
| That said, I don't even know if ML is good at finding patterns
| in data.
| lupire wrote:
| > That said, I don't even know if ML is good at finding
| patterns in data.
|
| That's the only thing ML does.
| burny_tech wrote:
| We need to advance mechanistic interpretability (the field
| of reverse-engineering neural networks):
| https://www.youtube.com/watch?v=P7sjVMtb5Sg
| https://www.youtube.com/watch?v=7t9umZ1tFso
| https://www.youtube.com/watch?v=2Rdp9GvcYOE
| goggy_googy wrote:
| I think at some point, we will be able to produce models that
| are able to pass data into a target model and observe its
| activations and outputs and put together some interpretable
| pattern or loose set of rules that govern the input-output
| relationship in the target model. Using this on a model like
| AlphaFold might enable us to translate inferred chemical laws
| into natural language.
| pen2l wrote:
| The most moneyed and well-coordinated organizations have honed
| a large hammer, and they are going to use it for everything,
| and so almost certainly future big findings in the areas you
| mention, probabilistically inclined models coming from ML will
| be the new gold standard.
|
| But yet the only thing that can save us from ML will be ML
| itself because it is ML that has the best chance to be able to
| extrapolate patterns from these blackbox models to develop
| human interpretable models. I hope we do dedicate explicit
| effort to this endeavor, and so continue the human advances and
| expanse of human knowledge in tandem with human ingenuity with
| computers at our assistance.
| optimalsolver wrote:
| Spoiler: "Interpretable ML" will optimize for output that
| either looks plausible to humans, reinforces our
| preconceptions, or appeals to our aesthetic instincts. It
| will not converge with reality.
| kolinko wrote:
| That is not considered interpretable then, and I think most
| people working in the field are aware of this gotcha.
|
| IIRC when the EU required banks to have interpretable rules
| for loans, a plain explanation was not considered enough.
| What was required was a clear process used from the
| beginning - i.e. you can use an AI to develop an algorithm
| to make a decision, but you can't use AI to make a decision
| and explain the reasons afterwards.
| DoctorOetker wrote:
| Spoiler: basic / hard sciences describe nature
| mathematically.
|
| Open a random physics book, and you will find lots and lots
| of derivations (using more or less acceptable assumptions
| depending on circumstance under consideration).
|
| Derivations and assumptions can be formally verified, see
| for example https://us.metamath.org
|
| Ever more intelligent machine learning algorithms and data
| structures replacing human heuristic labor will simply
| shift the expected minimum deliverable from associations to
| ever more rigorous proofs in terms of fewer and fewer
| assumptions.
|
| Machine learning will ultimately be used as automated
| theorem provers, and their output will eventually be
| explainable by definition.
|
| When do we classify an explanation as explanatory? When it
| succeeds in deriving a conclusion from acceptable
| assumptions without hand waving. Any hand waving would
| result in the "proof" not having passed formal
| verification.
| thegrim33 wrote:
| Reminds me of the novel Blindsight - in it there's special
| individuals who work as synthesists, whose job it is to observe
| and understand and then somehow translate back to "lay person"
| the seemingly undecipherable actions/decisions of advanced
| computers and augmented humans.
| 6gvONxR4sf7o wrote:
| "Best methods" is doing a lot of heavy lifting here. "Best" is
| a very multidimensional thing, with different priorities
| leading to different "bests." Someone will inevitably
| prioritize reliability/accuracy/fidelity/interpretability, and
| that's probably going to be a significant segment of the
| sciences. Maybe it's like how engineers just need an
| approximation that's predictive enough to build with, but
| scientists still want to understand the underlying phenomena.
| There will be an analogy to how some people just want an opaque
| model that works on a restricted domain for their purposes, but
| others will be interested in clearer models or
| unrestricted/less restricted domain models.
|
| It could lead to a very interesting ecosystem of roles.
|
| Even if you just limit the discussion to using the best model
| of X to design a better Y, limited to the model's domain of
| validity, that might translate the usage problem to finding
| argmax_X of valueFunction of modelPrediction of design of X. In
| some sense a good predictive model is enough to solve this with
| brute force, but this still leaves room for tons of fascinating
| foundational work. Maybe you start to find that the (wow so
| small) errors in modelPrediction are correlated with
| valueFunction, so the most accurate predictions don't make it
| the best for argmax (aka optimization might exploit model
| errors rather than optimizing the real thing). Or maybe brute
| force just isn't computationally feasible, so you need to
| understand something deeper about the problem to simplify the
| optimization to make it cheap.
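The "optimization exploits model errors" point can be demonstrated in a few lines. Everything here is synthetic: a made-up `true_value`, plus a surrogate that is accurate to within about 0.001 everywhere, yet whose argmax drifts off the true optimum because the objective is flat near it:

```python
import random

random.seed(0)

def true_value(x):
    """Ground-truth quality of design x (unknown in practice)."""
    return -(x - 0.7) ** 2

def model_prediction(x):
    """'Very accurate' surrogate: tiny random error at each design."""
    return true_value(x) + random.gauss(0, 0.001)

# Brute-force argmax over the design space using only the model.
designs = [i / 1000 for i in range(1001)]
best = max(designs, key=model_prediction)
print(best, true_value(best))
```

The selected design is near 0.7 but generally not 0.7 itself: where the true objective is flat, the noise decides the winner, which is exactly the failure mode described above.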
| RandomLensman wrote:
| We could be entering a new age of epicycles - high accuracy but
| very flawed understanding.
| advisedwang wrote:
| In physics, we already deal with the fact that many of the core
| equations cannot be analytically solved for more than the most
| basic scenarios. We've had to adapt to using approximation
| methods and numerical methods. This will have to be another
| place where we adapt to a practical way of getting results.
| topaz0 wrote:
| In case it's not clear, this does not "beat" experimental
| structure determination. The matches to experiment are pretty
| close, but they will be closer in some cases than others and
| may or may not be close enough to answer a given question about
| the biochemistry. It certainly doesn't give much information
| about the dynamics or chemical perturbations that might be
| relevant in biological context. That's not to pooh-pooh
| alphafold's utility, just that it's a long way from making
| experimental structure determination unnecessary, and much much
| further away from replacing a carefully chosen scientific
| question and careful experimental design.
| bluerooibos wrote:
| > What happens when...
|
| I can only assume that existing methods would still be used for
| verification. At least we understand the logic used behind
| these methods. The ML models might become more accurate on
| average but they could still throw out results that are way off
| occasionally, so their error rate would have to become equal to
| the existing methods.
| GistNoesis wrote:
| The frontier in model space is kind of fluid. It's all about
| solving differential equations.
|
| In theoretical physics, you know the equations, you solve
| equations analytically, but you can only do that when the model
| is simple.
|
| In numerical physics, you know the equations, you discretize
| the problem on a grid, and you solve the constraint defined by
| the equations with various numerical integration schemes like
| RK4, but you can only do that when the model is small and you
| know the equations, and you find a single solution.
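For reference, a minimal fixed-step RK4 integrator of the kind mentioned above (a textbook sketch, not production code):

```python
def rk4_step(f, t, y, h):
    """One classical Runge-Kutta 4 step for dy/dt = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

# Integrate dy/dt = y from t=0, y=1 up to t=1; exact answer is e.
y, t, h = 1.0, 0.0, 0.01
for _ in range(100):
    y = rk4_step(lambda t, y: y, t, y, h)
    t += h
print(y)  # ~2.7182818...
```

The global error shrinks as h^4, which is why RK4 is the workhorse when the equations are known and the system is small.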
|
| Then you want the result faster, so you use mesh-free methods
| and adaptive grids. It works on bigger models but you have to
| know the equations, finding a single solution to the
| differential equations.
|
| Then you compress this adaptive grid with a neural network,
| while still knowing the governing equations, and you have
| things like Physics Informed Neural Networks (
| https://arxiv.org/pdf/1711.10561 and following papers) where
| you can bound the approximation error. This method allows
| solving for all solutions to the differential equations
| simultaneously, sharing the computations.
|
| Then, when knowing your governing equations explicitly is
| too complex, you assume that there are some implicit
| governing stochastic equations, and you learn the end
| result of their dynamics with a diffusion model; that's
| what AlphaFold is doing.
|
| ML is kind of a memoization technique, analogous to hashlife
| in the Game of Life, that allows you to reuse your past
| computational efforts. You are free to choose on this ladder
| which memory-compute trade-off you want to use to model the
| world.
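The memoization analogy in miniature, using Python's cache decorator with invented inputs:

```python
from functools import lru_cache

call_count = 0

@lru_cache(maxsize=None)
def expensive_simulation(state):
    """Stand-in for a costly computation; the cache plays the role
    hashlife's hash table plays for repeated Life sub-patterns."""
    global call_count
    call_count += 1
    return sum(ord(c) for c in state) % 97  # placeholder result

# The same sub-problems recur many times but are computed once each.
for _ in range(10_000):
    expensive_simulation("alpha-helix")
    expensive_simulation("beta-sheet")
print(call_count)  # 2
```

A trained model is a far lossier, far more general cache than this, but the trade is the same: spend memory (parameters) to avoid recomputing.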
| visarga wrote:
| No, science doesn't work that way. You can't just calculate
| your way to scientific discoveries; you have to test them in
| the real world. Learning, both in humans and AI, is based on
| the signals
| provided by the environment. There are plenty of things not
| written anywhere, so the models can't simply train on human
| text to discover new things. They must learn directly from the
| environment to do that, like AlphaZero did when it beat humans
| at Go.
| slibhb wrote:
| It's interesting to compare this situation to earlier eras in
| science. Newton, for example, gave us equations that were very
| accurate but left us with no understanding at all of _why_ they
| were accurate.
|
| It seems like we're repeating that here, albeit with wildly
| different methods. We're getting better models but by giving up
| on the possibility of actually understanding things from first
| principles.
| slashdave wrote:
| Not comparable. Our current knowledge of the physics involved
| in these systems is complete. It is just impossibly difficult
| to calculate from first principles.
| ChuckMcM wrote:
| Interesting times indeed. I think the early history of
| medicines takes away from your observation though. In the 19th
| and early 20th century people didn't know _why_ medicines
| worked, they just did. The whole "try a bunch of things on
| mice, pick the best ones and try them on pigs, and then the
| best of those and try a few on people" kind of thing. In many
| ways the mice were a stand in for these models, at the time
| scientists didn't understand nearly as much about how mice
| worked (early mice models were pretty crude by today's
| standards) but they knew they were a close enough analog to the
| "real thing" that the information provided by mouse studies was
| usefully translated into things that might help/harm humans.
|
| So when your tools can produce outputs that you find useful,
| you can then use those tools to develop your understanding and
| insights. As a tool, this is quite good.
| aaroninsf wrote:
| The top HN response to this should be,
|
| what happens is that an opportunity has entered the chat.
|
| There is a wave coming--I won't try to predict if it's the next
| one--where the hot thing in AI/ML is going to be profoundly
| powerful tools for analyzing other such tools and rendering
| them intelligible to us,
|
| which, I imagine, will mean providing something like a zoomable
| explainer. At every level there are footnotes; if you want to
| understand why the simplified model is a simplification, you
| look at the fine print. Which has fine print. Which has...
|
| Which doesn't mean there is not a stable level at which some
| formal notion of "accurate" can be said to exist, which is
| the minimum viable level of simplification.
|
| Etc.
|
| This sort of thing will of course be the input to many other
| things.
| signal_space wrote:
| Is alphafold doing model generation or is it just reducing a
| massive state space?
|
| The current computational and systems biochemistry approaches
| struggle to model large biomolecules and their interactions due
| to the large degrees of freedom of the models.
|
| I think it is reasonable to rely on statistical methods to lead
| researchers down paths that have a high likelihood of being
| correct versus brute forcing the chemical kinetics.
|
| After all, chemistry is inherently stochastic...
| jononor wrote:
| I think it likely that instead of replacing existing methods,
| we will see a fusion. Or rather, many different kinds of
| fusions - depending on the exact needs of the problems at hand
| (or in science, the current boundary of knowledge). If nothing
| else then to provide appropriate/desirable level of
| explainability, correctness etc. Hypothetically the combination
| will also have better predictive performance and be more data
| efficient - but it remains to be seen how well this plays out
| in practice. The field of "physics informed machine learning"
| is all about this.
| Grieverheart wrote:
| Perhaps for understanding the structure itself, but having the
| structure available allows us to focus on a coarser level. We
| also don't want to use quantum mechanics to understand the
| everyday world, and that's why we have classical mechanics,
| etc.
| nico wrote:
| Even if we don't understand the models themselves, you can
| still use them as a basis for understanding
|
| For example, I have no idea how a computer works in every
| minute detail (ie, exactly the physics and chemistry of every
| process that happens in real time), but I have enough of an
| understanding of what to do with it, that I can use it as an
| incredibly useful tool for many things
|
| Definitely interesting times!
| phn wrote:
| I'm not a scientist by any means, but I imagine even accurate
| opaque models can be useful in moving the knowledge forward.
| For example, they can allow you to accurately simulate reality,
| making experiments faster and cheaper to execute.
| GuB-42 wrote:
| We already have the absolute best method for accurately
| predicting the world, and it is by experimentation. In the
| protein folding case, it works by actually making the protein
| and analyzing it. For designing airplanes, computer models are
| no match for building the thing, or even using physical models
| and wind tunnels.
|
| And despite our having this "best method", it didn't prevent
| progress in theoretical physics; theory and experimentation
| complement each other.
|
| ML models are just another kind of model that can help both
| engineering and fundamental research. They work much like the
| old guy in the shop who knows intuitively what good design is,
| because he has seen it all. The fact that old guys in shops
| are sometimes better than modeling with physics equations
| helps scientific progress, as scientists can work together
| with the old guy, combining the strengths of intuition and
| experience with those of scientific reasoning.
| jpadkins wrote:
| Hook the protein model up to an LLM model, have the LLM
| interpret the results. Problem solved :-) Then we just have to
| trust the LLM is giving us correct interpretations.
| flawsofar wrote:
| How do they compare on accuracy per watt?
| theGnuMe wrote:
| The models are learning an encoding based on evolutionarily
| related and known structures. We should be able to derive
| fundamental properties from those encodings eventually. Or at
| least our biophysical programmed models should map into that
| encoding. That might be a reasonable approach to look at the
| folding energy landscape.
| MobiusHorizons wrote:
| Is it capable of predictions, though? I.e., can it accurately
| predict the folding of new molecules? Otherwise, how do you
| distinguish accuracy from overfitting?
| slashdave wrote:
| In terms of docking, you can call the conventional approaches
| "physically-based", however, they are rather poor physical
| models. Namely, they lack proper electrostatics, and, most
| importantly, basically ignore entropic contributions. There is
| no reason for concern.
| trueismywork wrote:
| To paraphrase Kahan, it's not interesting to me whether a
| method is accurate enough or not, but whether you can predict
| how accurate you can be. So, if ML methods can predict that
| they're right 98% of the time, then we can build this into our
| systems, even if we don't understand how they work.
|
| Deterministic methods can predict results with a single run;
| ML methods will need an ensemble of results to show the same
| confidence. It is possible that, at the end of the day, the
| difference in cost might not be that high over time.
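That ensemble idea can be sketched in a few lines (the "model" here is a stand-in with invented numbers; a real ensemble would rerun the ML model with different seeds or inputs): run many perturbed predictions and report the spread as a self-predicted error bar.

```python
import random
import statistics

random.seed(0)

# Stand-in for an ML model run 200 times with different seeds:
# each "run" predicts the true value plus some noise. The true
# value and noise scale are arbitrary illustrative choices.
true_value = 1.0
ensemble = [true_value + random.gauss(0.0, 0.1) for _ in range(200)]

estimate = statistics.mean(ensemble)      # the point prediction
uncertainty = statistics.stdev(ensemble)  # the self-reported error bar

# A deterministic method gives one number in one run; the ensemble
# buys a predicted accuracy at the cost of many runs.
```

If the spread is calibrated, the `uncertainty` is exactly the kind of "predict how accurate you can be" number Kahan asks for, even with the model itself left opaque.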
| abledon wrote:
| Next decade we will focus on building out debugging and
| visualization tools for deep learning, to glance inside the
| current black box.
| hyperthesis wrote:
| Engineering often precedes Science. It's just more data.
| salty_biscuits wrote:
| I'd say it's not new. Take fluid dynamics as an example: the
| Navier-Stokes equations predict the motion of fluids very well,
| but you need to approximately solve them on a computer in order
| to get useful predictions for most setups. I guess the
| difference is the equation is compact and the derivation from
| continuum mechanics is easy enough to follow. People still rely
| on heuristics to answer "how does a wing produce lift?". These
| heuristic models are completely useless at "how much lift will
| this particular wing produce under these conditions?". Seems
| like the same kind of situation. Maybe progress forward will
| look like producing compact models or tooling to reason about
| why a particular thing happened.
| Brian_K_White wrote:
| Perhaps an AI can be made to produce the work as well as a
| final answer, even if it has to reconstruct or invent the work
| backwards rather than explain its own internal inscrutable
| process.
|
| "produce a process that arrives at this result" should be just
| another answer it can spit out. We don't necessarily care if
| the answer it produces is actually the same as what originally
| happened inside itself. All we need is that the answer checks
| out when we try it.
| JacobThreeThree wrote:
| As a tool people will use it as any other tool, by
| experimenting, testing, tweaking and iterating.
|
| As a scientific theory for fundamentally explaining the nature
| of the universe, maybe it won't be as useful.
| qwertox wrote:
| > Thrilled to announce AlphaFold 3 which can predict the
| structures and interactions of nearly all of life's molecules
| with state-of-the-art accuracy including proteins, DNA and RNA.
| [1]
|
| There's a slight mismatch between the blog's title and Demis
| Hassabis' tweet, where he uses "nearly all".
|
| The blog's title suggests that it's a 100% solved problem.
|
| [1] https://twitter.com/demishassabis/status/1788229162563420560
| bmau5 wrote:
| Marketing vs. Reality :)
| TaupeRanger wrote:
| First time reading a Deep Mind PR? This is literally their
| modus operandi.
| nybsjytm wrote:
| Important caveat: it's only about 70% accurate. Why doesn't the
| press release say this explicitly? It seems intentionally
| misleading to only report accuracy relative to existing methods,
| which apparently are just not so good (30%, 50% in various
| settings). https://www.fastcompany.com/91120456/deepmind-
| alphafold-3-dn...
| bluerooibos wrote:
| That's pretty good. Based on the previous performance
| improvements of Alpha-- models, it'll be nearing 100% in the
| next couple of years.
| nybsjytm wrote:
| Just "Alpha-- models" in general?? That's not a remotely
| reasonable way to reason about it. Even if it were, why
| should it stop DeepMind from clearly communicating accuracy?
| dekhn wrote:
| The way I think about this (specifically, deepmind not
| publishing their code or sharing their exact experimental
| results): advanced science is a game played by the most
| sophisticated actors in the world. Demis is one of those
| actors, and he plays the games those actors play better
| than anybody else I've ever seen. Those actors don't care
| much about the details of any specific system's accuracy:
| they care to know that it's possible to do this, and some
| general numbers about how well it works, and some hints
| what approaches they should take. And Nature, like other
| top journals, is more than willing to publish articles like
| this because they know it stimulates the most competitive
| players to bring their best games.
|
| (I'm not defending this approach, just making an
| observation)
| nybsjytm wrote:
| I think it's important to qualify that the relevant
| "game" is not advanced science per se; the game is
| business whose product is science. The aim isn't to do
| novel science; it's to do something which can be
| advertised as novel science. That isn't to cast
| aspersions on the personal motivations of Hassabis or any
| other individual researcher working there (which itself
| isn't to remove their responsibilities to public
| understanding); it's to cast aspersions on the structure
| that they're part of. And it's not to say that they can't
| produce novel or important science as part of their work
| there. And it's also not to say that the same tension
| isn't often present in the science world - but I think
| it's present to an extreme degree at DeepMind.
|
| (Sometimes the distinction between novel science and
| advertisably novel science is very important, as seems to
| be the case in the "new materials" research dopylitty
| linked to in these comments: here
| https://www.404media.co/google-says-it-discovered-
| millions-o...)
| 7734128 wrote:
| I'm quite hyped for the upcoming BetaFold, or even
| ReleaseCandidateFold models. They just have to be great.
| akira2501 wrote:
| > it'll be nearing 100% in the next couple of years.
|
| What are you basing this on? There is no established "moores
| law" for computational models.
| Aunche wrote:
| IIRC the next best models all have all been using AlphaFold 2's
| methodology, so that's still a massive improvement.
|
| Edit: I see now that you're probably objecting to the headline
| that got edited on HN.
| nybsjytm wrote:
| Not just the headline, the whole press release. And not
| questioning that it's a big improvement.
| j7ake wrote:
| So it's okay now to publish a computational paper with no code? I
| guess Nature's reporting standards don't apply to everyone.
|
| > A condition of publication in a Nature Portfolio journal is
| that authors are required to make materials, data, code, and
| associated protocols promptly available to readers without undue
| qualifications.
|
| > Authors must make available upon request, to editors and
| reviewers, any previously unreported custom computer code or
| algorithm used to generate results that are reported in the paper
| and central to its main claims.
|
| https://www.nature.com/nature-portfolio/editorial-policies/r...
| boxed wrote:
| Are you an editor or reviewer?
| HanClinto wrote:
| Good question.
|
| Also makes me wonder -- where's the line? Is it reasonable to
| have "layperson" reviewers? Is it reasonable to think that
| regular citizens could review such content?
| Kalium wrote:
| I think you will find that for the vast, vast majority of
| scientific papers there is significant negative expected
| value to even attempting to have layperson reviewers. Bear
| in mind that we're talking about papers written by experts
| in a specific field aimed at highly technical communication
| with other people who are experts in the same field. As a
| result, the only people who can usefully review the
| materials are drawn from those who are also experts in the
| same field.
|
| For an instructive example, look up the seminal paper on
| the structure of DNA:
| https://www.mskcc.org/teaser/1953-nature-papers-watson-
| crick... Ask yourself how useful comments from someone who
| did not know what an X-ray is, never mind anything about
| organic chemistry, would be in improving the quality of
| research or quality of communication between experts in
| both fields.
| _just7_ wrote:
| No, infact most journals have peer reviews cordoned off,
| not viewable to the general public.
| lupire wrote:
| That's pre-publication review, not scientific peer
| review. Special interests try to conflate the two, to
| bypass peer review and transform science into a religion.
|
| Peer review properly refers to the general process of
| science advancing by scientists reviewing each other's
| published work.
|
| Publishing a work is the middle, not the end of the
| research.
| j7ake wrote:
| If you read the standards it applies broadly beyond reviewers
| or editors.
|
| > A condition of publication in a Nature Portfolio journal is
| that authors are required to make materials, data, code, and
| associated protocols promptly available to readers without
| undue qualifications.
| dekhn wrote:
| Nature has long been willing to break its own rules to be at
| the forefront of publishing new science.
| ein0p wrote:
| I'm inclined to ignore such PR fluff until they actually
| demonstrate a _practical_ result, e.g. cure some form of
| cancer or some autoimmune disease. All this "prediction of
| structure" has been in the news for years, and it seems to
| have resulted in nothing practically usable IRL as far as I
| can tell. I could be wrong, of course; I do not work in this
| field.
| dekhn wrote:
| the R&D of all major pharma is currently using AlphaFold
| predictions when they don't have experimentally determined
| structures. I cannot share further details but the results
| suggest that we will see future pharmaceuticals based on AF
| predictions.
|
| The important thing to recognize is that protein structures are
| primarily hypothesis-generation machines and tools to stimulate
| ideas, rather than direct targets of computational docking.
| Currently structures rarely capture the salient details
| required to identify a molecule that has precisely the
| biological outcome desired, because the biological outcome is
| an extremely complex function that incorporates a wide array of
| other details, such as other proteins, metabolism, and more.
| ein0p wrote:
| Sure. If/when we see anything practical, that'll be the right
| moment to pay attention. This is much like "quantum
| computing" where everyone who doesn't know what it is is
| excited for some reason, and those that do know can't even
| articulate any practical applications
| dekhn wrote:
| Feynman already articulated the one practical application
| for quantum computing: using it to simulate complex systems
| (https://www.optica-
| opn.org/home/articles/on/volume_11/issue_... and
| https://calteches.library.caltech.edu/1976/ and
| https://s2.smu.edu/~mitch/class/5395/papers/feynman-
| quantum-...
|
| These approaches are now being explored but I haven't seen
| any smoking guns showing a QC-based simulation exceeding
| the accuracy of a classical computer for a reasonable
| investment.
|
| Folks have suggested other areas, such as logistics, where
| finding small improvements to the best approximations might
| give a company a small edge, and crypto-breaking, but there
| has not been much progress in this area, and the approximate
| methods have been improving rapidly.
| arolihas wrote:
| There are a few AI-designed drugs in various phases of clinical
| trials, these things take time.
| mchinen wrote:
| I am trying to understand how accurate the docking predictions
| are.
|
| Looking at the PoseBusters paper [1] they mention, they say they
| are 50% more accurate than traditional methods.
|
| DiffDock, the best of the DL-based systems, gets 30-70%
| depending on the dataset, and traditional methods get 50-70%.
| The paper
| highlighted some issues with the DL-based methods and given that
| DeepMind would have had time to incorporate this into their work
| and develop with the PoseBusters paper in mind, I'd hope it's
| significantly better than 50-70%. They say 50% better than
| traditional so I expected something like 70-85% across all
| datasets.
|
| I hope a paper will appear soon to illuminate these and other
| details.
|
| [1]
| https://pubs.rsc.org/en/content/articlehtml/2024/sc/d3sc0418...
| dsign wrote:
| For a couple of years I've been expecting that ML models would be
| able to 'accelerate' bio-molecular simulations, using physics-
| based simulations as ground truth. But this seems to be a step
| beyond that.
| dekhn wrote:
| When I competed in CASP 20 years ago (and lost terribly) I
| predicted that the next step to improve predictions would be to
| develop empirically fitted force fields to make MD produce
| accurate structure predictions (MD already uses empirically
| fitted force fields, but they are not great). This area was
| explored, there are now better force fields, but that didn't
| really push protein structure prediction forward.
|
| Another approach is fully differentiable force fields- the idea
| that the force field function itself is a trainable structure
| (rather than just the parameters/weights/constants) that can be
| optimized directly towards a goal. Also explored, produced some
| interesting results, but nothing that would be considered
| transformative.
|
| The field still generally believes that if you had a perfect
| force field and infinite computing time, you could directly
| recapitulate the trajectories of proteins folding (from fully
| unfolded to final state along with all the intermediates), but
| that doesn't address any practical problems, and is massively
| wasteful of resources compared to using ML models that exploit
| evolutionary information encoded in sequence and structures.
|
| In retrospect I'm pretty relieved I was wrong, as the new
| methods are more effective with far fewer resources.
| xnx wrote:
| Very cool that anyone can login to
| https://golgi.sandbox.google.com/ and check it out
| bschmidt1 wrote:
| Google's Game of Life 3D: Spiral edition
| uptownfunk wrote:
| Very sad to see they did not make it open source. When you have a
| technology that has the potential to be a gateway for drug
| development, to the cures of new diseases, and instead you choose
| to make it closed, it is a very huge disservice to the community
| at large. Sure, release your own product alongside it, but making
| it closed source does not help the scientific community upon
| which all these innovations were built. Especially if you have
| lost a loved one to a disease which this technology will one day
| be able to create cures for, it is very disappointing.
| falcor84 wrote:
| The closer it gets to enabling full drug discovery, the closer
| it also gets to enabling bioterrorism. Taking it to the
| extreme, if they had the theory of everything, I don't think
| I'd want it to be made available to the whole world as it is
| today.
|
| On a related note, I highly recommend The Talos Principle 2,
| which really made me think about these questions.
| pythonguython wrote:
| Any organization/country that has the ability to use a tool
| like this to create a bio weapon is already sophisticated
| enough to do bioterrorism today.
| ramon156 wrote:
| Alright, but now picture this: it's now open to the masses,
| meaning an individual could probably even do it.
| uptownfunk wrote:
| [delayed]
| LouisSayers wrote:
| Why do you need AI for bioterrorism? There are plenty of well
| known biological organisms that can kill us today...
| nojvek wrote:
| So much hyperbole from recent Google releases.
|
| I wish they didn't hype AI so much, but I guess that's what
| people want to hear, so they say that.
| sangnoir wrote:
| I don't blame them for hyping their products - if only to fight
| the sentiment that Google is far behind OpenAI because they
| were not first to release a LLM.
| LarsDu88 wrote:
| As a software engineer, I kind of feel uncomfortable about this
| new model. It outperforms Alphafold 2 at ligand binding, but
| Alphafold 2 also had some more hardcoded and interpretable
| structural reasoning baked into the model architecture.
|
| There's so many things you can incorporate into a protein folding
| model such as structural constraints, rotational equivariance,
| etc, etc
|
| This new model simply does away with some of that, achieving
| greater results. And the authors use distillation from data
| output by AlphaFold 2 and AlphaFold 2-Multimer to get better
| results in those cases where you would otherwise wind up with
| implausible results.
|
| To achieve a real end-to-end training from scratch for this
| new model, you have to run all those previous models and
| output their predictions for the distillation! Makes me feel a
| bit uncomfortable.
| amitport wrote:
| Consider that humans also learn from other humans, and
| sometimes surpass their teachers.
|
| A bit more comfortable?
| Balgair wrote:
| Ahh, but the new young master is able to explain their work
| and processes to the satisfaction of the old masters. In the
| 'Science' of our modern times it's a requirement to show your
| work (yes, yes, I know about the replication crisis and all
| that terrible jazz).
|
| Not being able to ascertain how and why the ML/AI is
| achieving results is not quite the same and more akin to the
| alchemists and sorcerers with their cyphers and hidden
| laboratories.
| falcor84 wrote:
| > the new young master is able to explain their work and
| processes to the satisfaction of the old masters
|
| Yes, but it's one level deep - in general they wouldn't be
| able to explain their work to their master's master (note
| "science advances one funeral at a time").
| sangnoir wrote:
| > Makes me feel a bit uncomfortable.
|
| Why? Do compilers which can't bootstrap themselves also make
| you uncomfortable due to dependencies on pre-built artifacts?
| I'm not saying you're unjustified to feel that way, but
| sometimes more abstracted systems are quicker to build and may
| have better performance than those built from the ground up.
| Selecting which one is better depends on your constraints and
| taste
| roody15 wrote:
| I wonder if, in the not too distant future, these AI
| predictions could be explained back in "humanized" terms. Much
| like ChatGPT can simplify complex topics ... could the model
| in the future provide feedback to researchers on why it is
| making a given prediction?
| reliablereason wrote:
| Would be very useful if they used it to predict the structure
| and interactions of the known variants too.
|
| Would be very helpful when predicting whether a mutation in a
| protein would lead to loss of function for the protein.
| mfld wrote:
| The improvement on predicting protein/RNA/ligand interactions
| might facilitate many commercially relevant use cases. I assume
| pharma and biotech will eagerly get in line to use this.
| tonyabracadabra wrote:
| Very cool, and what's cooler is this rap about alphafold3
| https://heymusic.ai/blog/news/alphafold-3
| wuj wrote:
| This tool reminds me that the human body functions much like a
| black box. While physics can be modeled with equations and
| constraints, biology is inherently probabilistic and
| unpredictable. We verify the efficacy of a medicine by observing
| its outcomes: the medicine is the input, and the changes in
| symptoms are the output. However, we cannot model what happens in
| between, as we cannot definitively prove that the medicine
| affects only its intended targets. In many ways, much of what we
| understand about medicine is based on observing these black-box
| processes, and this tool helps to model that complexity.
| lysozyme wrote:
| Probably worth mentioning that David Baker's lab released a
| similar model (predicts protein structure along with bound DNA
| and ligands), just a couple of months ago, and it is open source
| [1].
|
| It's also worth remembering that it was David Baker who
| originally came up with the idea of extending AlphaFold from
| predicting just proteins to predicting ligands as well [2].
|
| 1. https://github.com/baker-laboratory/RoseTTAFold-All-Atom
|
| 2. https://alexcarlin.bearblog.dev/generalized/
|
| Unlike AlphaFold 3, which predicts only a small, preselected
| subset of ligands, RosettaFold All Atom predicts a much wider
| range of small molecules. While I am certain that neither network
| is up to the task of designing an enzyme, these are exciting
| steps.
|
| One of the more exciting aspects of the RosettaFold paper is that
| they train the model for predicting structures, but then also use
| the structure predicting model as the denoising model in a
| diffusion process, enabling them to actually design new
| functional proteins. Presumably, DeepMind is working on this
| problem as well.
| theGnuMe wrote:
| And that tech just got $1b in funding.
| refulgentis wrote:
| I appreciated this, but it's probably worth mentioning: when
| you say AlphaFold 3, you're talking about AlphaFold 2.
|
| TFA announces AlphaFold 3.
|
| Post: "Unlike AlphaFold 3, which predicts only a small,
| preselected subset of ligands, RosettaFold All Atom predicts a
| much wider range of small molecules"
|
| TFA: "AlphaFold 3...*models large biomolecules such as
| proteins, DNA and RNA*, as well as small molecules, also known
| as ligands"
|
| Post: "they also use the structure predicting model as the
| denoising model in a diffusion process...Presumably, DeepMind
| is working on this problem as well."
|
| TFA: "AlphaFold 3 assembles its predictions using a diffusion
| network, akin to those found in AI image generators."
| bbstats wrote:
| Zero-shot nearly beating trained catboost is pretty amazing.
| thenerdhead wrote:
| A lot of accelerated article previews recently. Seems like
| humanity is making a lot of breakthroughs.
|
| This is nothing short of amazing for all those suffering from
| disease.
| ak_111 wrote:
| If you work in this space, I'd be interested to know what
| _material_ impact AlphaFold has had on your workflow since its
| release 4 years ago.
| lumb63 wrote:
| Would anyone more familiar with the field be able to provide some
| cursory resources on the protein folding problem? I have a
| background in computer science and half a background in biology
| (took two semesters of OChem, biology, anatomy; didn't go much
| further).
| MPSimmons wrote:
| Not sure why the first thing they point it at wouldn't be prions.
| itissid wrote:
| Noob here. Can one make the following deduction:
|
| In transformer-based architectures, where one typically uses a
| variation of the attention mechanism to model interactions,
| even if one does not assume autoregression over the domain's
| "nodes" (amino acids, words, image patches), if the final
| states those nodes eventually take can be permuted in only
| finitely many ways (i.e., they have sparse interactions
| between them), then these architectures are an efficient way
| of modeling such domains.
|
| In plain english the final state of words in a sentence and amino
| acids in a protein have only so many ways they can be arranged
| and transformers do a good job of modeling it.
|
| Also, can one assume this won't do well for domains where
| there is, say, sensitivity to initial conditions, like chaotic
| systems such as weather, where the number of final states just
| explodes?
| ricksunny wrote:
| I'm interested in how they measure accuracy of binding site
| identification and binding pose prediction. This was missing for
| the hitherto widely-used binding pose prediction tool Autodock
| Vina (and in silico binding pose tools in general). Despite the
| time I invested in learning & exercising that tool, I avoided
| using it for published research because I could not credibly cite
| its general-use accuracy. Is / will AlphaFold 3 be citeable in
| the sense of "I have run AlphaFold on this particular target
| of interest and this array of ligands, and have found these
| poses of X kJ/mol binding energy, and this is known to an
| accuracy of Y% because of AlphaFold 3's training set results
| cited below"?
| l33tman wrote:
| I've never trusted those predicted binding energies. If you
| have predicted a ligand/protein complex and have high
| confidence in it and want to study the binding energy I really
| think you should do a full MD simulation, you can pull the
| ligand-protein complex apart and measure the change in free
| energy explicitly.
|
| Also, and this is an unfounded guess only, the problem of
| protein / ligand docking is quite a bit more complex than
| protein folding - there seems to be a finite set of overall
| folds used in nature, while docking a small ligand to a big
| protein with flexible sidechains and even flexible large-scale
| structures can have induced fits that are really important to
| know and estimate. I'm just very sceptical that it will ever
| be possible, in a general fashion, to predict these accurately
| with an AI model given the limited training data.
|
| Though you just need some hints, then you can run MD sims on
| them to see what happens for real.
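The "pull the complex apart and measure the change in free energy" idea can be caricatured in one dimension (a deliberately toy sketch: a Lennard-Jones well stands in for the binding interaction, the parameters and function names are invented, and integrating a static potential ignores the entropic contributions a real MD free-energy calculation would capture, so this recovers a potential-energy difference, not a true free energy):

```python
# Toy 1D stand-in for pulling a bound pair apart: integrate the
# restoring force along the separation coordinate to recover the
# well depth. EPS and SIGMA are arbitrary illustrative values.
EPS, SIGMA = 1.0, 1.0  # well depth and length scale (reduced units)


def lj_force(r: float) -> float:
    """Force = -dU/dr for the Lennard-Jones potential
    U(r) = 4*EPS*((SIGMA/r)**12 - (SIGMA/r)**6)."""
    return 24 * EPS * (2 * (SIGMA / r) ** 12 - (SIGMA / r) ** 6) / r


def work_to_separate(r_min: float, r_max: float, n: int = 100_000) -> float:
    """Work done against the force while pulling from r_min out to
    r_max (trapezoidal rule). For a conservative force this equals
    U(r_max) - U(r_min)."""
    dr = (r_max - r_min) / n
    total = 0.0
    for i in range(n):
        r0, r1 = r_min + i * dr, r_min + (i + 1) * dr
        total += 0.5 * (-lj_force(r0) - lj_force(r1)) * dr
    return total


# Pulling from the minimum (r = 2**(1/6) * SIGMA, where U = -EPS)
# out to large separation (U ~ 0) should cost roughly EPS of work.
work = work_to_separate(2 ** (1 / 6) * SIGMA, 20.0 * SIGMA)
```

A real pulling simulation does the analogous integration over sampled forces in full atomistic detail, which is exactly why it is so much more expensive than a single docking score.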
| TaupeRanger wrote:
| So after 6 years of this "revolutionary technology", what we have
| to show for all the hype and breathless press releases is:
| ....another press release saying how "revolutionary" it is.
| Fantastic. Thanks DeepMind.
| dev1ycan wrote:
| Excited, but it's been a fair while now and I have yet to see
| something truly remarkable come out of this.
| zmmmmm wrote:
| So much of the talk about their "free server" seems to be trying
| to distract from the fact that they are not releasing the model.
|
| I feel like it's an important threshold moment if this gets
| accepted into scientific use without the model being available -
| reproducibility of results becomes dependent on the good graces
| of a single commercial entity. I kind of hope that like OpenAI it
| just spurs creation of equivalent open models that then actually
| get used.
___________________________________________________________________
(page generated 2024-05-08 23:00 UTC)