[HN Gopher] AlphaProteo generates novel proteins for biology and...
       ___________________________________________________________________
        
       AlphaProteo generates novel proteins for biology and health
       research
        
       Author : meetpateltech
       Score  : 202 points
       Date   : 2024-09-05 15:05 UTC (7 hours ago)
        
 (HTM) web link (deepmind.google)
 (TXT) w3m dump (deepmind.google)
        
       | bluehat974 wrote:
       | Two minute papers video on the subject:
       | https://www.youtube.com/watch?v=lI3EoCjWC2E
        
         | deisteve wrote:
         | I studied molecular biology and I couldn't contain my
         | excitement when it was able to bind to another protein. I don't
         | think HN realizes how huge this is. With this level of
         | accuracy, not only can we understand the full mysteries of
         | ourselves but literally any biological entity.
         | 
         | With that level of understanding, it's easy to fabricate special
         | medicines that target specific biochem pathways, but more
         | exciting is that we can literally "code in the 3D world". We'll be
         | able to print and grow organs en masse. We'll be able to design
         | structures that will bind to target proteins responsible for
         | certain traits. The potential boon to human medicine will be
         | enormous.
         | 
         | I got like goosebumps after watching that video because I
         | understood the implications of being able to predict folds and
         | now generate proteins that will bind to any protein we
         | choose!!!!
         | 
         | We just might have discovered a panacea of sorts and Demis and
         | his team should receive the Nobel Prize.
         | 
         | I'm just ecstatic that we'll see so much drastic improvement in
         | human medicine and importantly how accessible they will be with
         | this new discovery.
        
           | kridsdale3 wrote:
           | How do you feel about the potential bioterrorism alternate
           | angle of this capability?
        
             | deisteve wrote:
             | it would be like committing terrorism using silicon wafers
             | 
             | you would have to infiltrate an extremely guarded facility
             | 
             | you would somehow have to bypass QA
             | 
             | it's not like somebody on the assembly line for a new
             | protein drug sprinkles a dose of PCP
             | 
             | one potential dual use could be somebody modifying a
             | fruit popular with birds and then dropping seeds at the
             | local organic farm fair
             | 
             | and then when those seeds are consumed by birds they
             | produce poop dangerous for other animals to consume
             | 
             | you could absolutely screw around with the ecosystem, like
             | whoever has access to this programmable "bio-wafer" will be
             | able to play god totally undetected.
             | 
             | the problem is that the "bio-wafer" manufacturing process
             | will be very tough and regulated, like the CNC machines used
             | to manufacture jet engines, which keeps certain countries
             | from being able to churn out their own jet engines
        
               | mjcohen wrote:
               | It's not regulated if a government does it.
        
           | stanford_labrat wrote:
           | > We'll be able to print and grow organs en masse.
           | 
           | I'm a PhD candidate doing my thesis work on stem cell models
           | and tissue engineering for organ transplant...I think this
           | technology is certainly a large leap forward but I think you
           | are a little overzealous with this claim.
        
           | holoduke wrote:
           | In 2035 everybody can live forever.
        
           | aquafox wrote:
           | > With that level of understanding, it's easy to fabricate
           | special medicines that target specific biochem pathways
           | 
           | The problem is to find the right target or pathway in the
           | first place. Just go to opentargets.org. There are lots of
           | potential targets by different metrics but for many diseases
           | we haven't identified that single target that lets us
           | improve the life of, say, 20% of patients for disease X.
        
       | westurner wrote:
       | > _Trained on vast amounts of protein data from the Protein Data
       | Bank (PDB) and more than 100 million predicted structures from
       | AlphaFold, AlphaProteo has learned the myriad ways molecules bind
       | to each other. Given the structure of a target molecule and a set
       | of preferred binding locations on that molecule, AlphaProteo
       | generates a candidate protein that binds to the target at those
       | locations._
        
       | sdenton4 wrote:
       | (not to be confused with AlphaProto, which helps with Google's
       | core business of turning protocol buffers into differenter
       | protocol buffers.)
        
       | purpleblue wrote:
       | I wonder how many prions will be accidentally created by this, or
       | if it can even predict if a particular protein will have prion-
       | like effects
        
         | DalasNoin wrote:
         | The context here is that prions are misfolded proteins that
         | replicate by causing other proteins to change their
         | configuration into the misfolded form of the prion. Diseases
         | caused by prions include Mad Cow disease, Creutzfeldt-Jakob
         | disease, and Chronic Wasting disease. All prion diseases are
         | incurable and 100% fatal.
        
         | nabla9 wrote:
         | If the protein is novel it does not matter, because it has no
         | normal variants in nature.
        
           | connorgutman wrote:
           | Couldn't the underlying tech be applied to non-novel proteins
           | by a bad actor?
        
             | DalasNoin wrote:
             | Someone could fine-tune a model on pairs of existing
             | proteins and their misfolded prions and then ask the system
             | to come up with new prions for other proteins. ChatGPT
             | found these 4 companies that will produce proteins for you
             | just based on digital DNA that you send them:
             | 
             | - Genewiz (Azenta Life Sciences)
             | 
             | - Thermo Fisher Scientific (GeneArt)
             | 
             | - Tierra Biosciences
             | 
             | - NovoPro Labs
        
               | connorgutman wrote:
               | Whelp, time to move to a small island in the middle of
               | the Pacific.
        
               | ben_w wrote:
               | One of the few cases where Mars actually is a decent
               | planet B.
        
         | UniverseHacker wrote:
         | It would be essentially impossible to create a new prion
         | disease by accident- generating random-ish new things with
         | methods like this would pale in comparison to the massive
         | number of weird random-ish things natural biology is already
         | creating in the wild.
         | 
         | However, this category of technologies could potentially be
         | used to develop new prion diseases on purpose. As well as to
         | develop cures for prion diseases that disrupt the misfolding.
        
           | Enginerrrd wrote:
           | >As well as to develop cures for prion diseases that disrupt
           | the misfolding.
           | 
           | That seems quite plausible actually. You'd need something
           | that can target misfolded PrP and bind it up so it can't do
           | anything and then hopefully your targeting protein leaves
           | normal PrP alone. A bit like an antibody.
        
         | brcmthrowaway wrote:
         | Oh this would be 100x worse than the covid lab leak
        
       | eig wrote:
       | Maybe this is in the supplement of the whitepaper [0], but I
       | would have loved to see more analysis of how novel the designed
       | proteins really are.
       | 
       | In the whitepaper they mention that they are novel compared to
       | other in silico design techniques, but to my knowledge other
       | binders to VEGF and Covid spike protein exist and would already
       | be found in the PDB database that DeepMind trained the model on.
       | 
       | This is not to minimize the results- if the history of ML is
       | anything to go by, even if AlphaProteo does not currently beat
       | the best affinity found by in vitro screens, I do not doubt that
       | it soon will!
       | 
       | [0] - https://storage.googleapis.com/deepmind-
       | media/DeepMind.com/B...
        
         | vessenes wrote:
         | They must be somewhat novel in that the wet lab work verified
         | up to 10x stronger binding as predicted. I agree it would be
         | interesting to see how they compare to known binding proteins
        
           | dekhn wrote:
           | we've been able to design tight binders for quite some time
           | now- the issue with synthetic designs is that they tend to
           | bind a little too tightly. You want to have a reasonable off-
           | rate and the ligand protein should do more than just bind, it
           | needs to effect some sort of response from the bound protein.
           | 
           | When you look at these synthetics they often maximize for
           | interactions of hydrophobic areas on the surface.
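           | 
           | A rough sketch of why the off-rate matters, assuming simple
           | 1:1 binding: the dissociation constant is K_d = k_off / k_on,
           | and the bound complex has a half-life of ln(2) / k_off. The
           | function names and rate constants below are hypothetical,
           | picked only for illustration; they are not measurements from
           | the AlphaProteo whitepaper.

```python
import math


def dissociation_constant(k_on: float, k_off: float) -> float:
    """K_d in molar, given k_on in 1/(M*s) and k_off in 1/s."""
    return k_off / k_on


def complex_half_life(k_off: float) -> float:
    """Half-life of the bound complex in seconds: ln(2) / k_off."""
    return math.log(2) / k_off


# Two hypothetical binders with the same on-rate but different off-rates.
K_ON = 1e6  # 1/(M*s), assumed value
for name, k_off in [("tight binder", 1e-5), ("moderate binder", 1e-2)]:
    kd = dissociation_constant(K_ON, k_off)
    half_life = complex_half_life(k_off)
    print(f"{name}: K_d = {kd:.1e} M, half-life = {half_life:.0f} s")
```

           | With these made-up numbers the tighter binder stays bound for
           | roughly 19 hours, versus about a minute for the moderate one,
           | which is the kind of trade-off the off-rate remark points at.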
        
         | gilleain wrote:
         | Might depend on what your measure of 'novelty' is in protein
         | structure. A single residue change (for example) would not
         | normally be considered a novel structure - it's just a
         | mutation.
         | 
         | However, a new fold - that is, the shape that the backbone
         | folds into - would be novel. Potentially also novel would be
         | 'chimeric' structures with parts from other structures, as with
         | chimeric domain swaps.
         | 
         | There was a structure designed by the Baker lab called 'Top7' -
         | https://pubmed.ncbi.nlm.nih.gov/14631033/ that I remember as
         | ground breaking at the time :) (in the ancient days of 2003 it
         | seems ...)
        
           | eig wrote:
           | Exactly. If the proteins suggested in this paper are very
           | similar to known good binders in PDB then I am much less
           | impressed by the results. You could argue they are generating
           | a structure from the training set.
           | 
           | I want more info about how novel these proteins are.
        
       | i_love_limes wrote:
       | I have a question that hopefully a molecular biologist can
       | answer. Can tools like this potentially create protein structures
       | that specifically bind in certain cells? Or is this more about a
       | way of being able to create proteins for genes / structures we
       | haven't been able to before?
       | 
       | In my research at the moment I'm very interested in pleiotropy,
       | namely mapping pleiotropic effects in as many *omics/QTL
       | measurements and complex traits as possible. This is really
       | helpful for determining which genes / proteins to focus on for
       | drug development.
       | 
       | The problem with drugs is in fact pleiotropy! A single protein
       | can do quite a lot of things in your body, either through a
       | causal downstream mechanism (vertical pleiotropy), or seemingly
       | independent processes (horizontal). This limits a lot of possible
       | drug targets, as the side effects / detrimental effects may be too
       | large.
       | 
       | So, if these tools can create ultra specific protein structures
       | that somehow _only_ bind in the areas of interest, then that
       | would be a truly massive breakthrough.
        
         | UniverseHacker wrote:
         | Yes, in principle, but there are huge limitations and challenges
         | to using a protein as a drug in living organisms. It has to be
         | injected to avoid digestion, and a protein can't just pass into
         | a cell, it needs to get in somehow. Current peptide drugs like
         | insulin are identical to, or closely mimic natural small
         | peptide hormones that bind to receptors on the outside of a
         | cell. However, there is a possibility of using gene therapy to
         | directly express a novel protein drug inside of the cell. A
         | novel protein is also likely to trigger an immune response- so
         | that type of gene therapy is mostly useful when that is
         | actually desired, e.g. as a vaccine.
        
         | Pulcinella wrote:
         | For anyone who would like to know more about designing proteins
         | with a certain function, target, or structure in mind, the term
         | to search for is "rational design."
         | 
         | https://en.m.wikipedia.org/wiki/Rational_design
        
           | ampdepolymerase wrote:
           | Also "off target effects".
        
           | loopdoend wrote:
           | Thank you for this, terms of art are the silent
           | gatekeepers...
        
             | elmomle wrote:
             | As an aside, learning the precise terms for concepts in
             | fields in which I'm a layperson (or simply have some
             | cobwebs to shake loose)--and then exploring those terms
             | more--is something that I've found LLMs extraordinarily
             | useful for.
        
         | highfrequency wrote:
         | Not an expert, but you could imagine a protein with two
         | receptors that are required for activation. One of them binds
         | to a protein that is only present in the cells of interest, and
         | the other one binds to the actual target.
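         | 
         | One way to picture this AND-gate idea, assuming each site
         | follows simple equilibrium binding, occupancy = [L] / ([L] + K_d),
         | and that the two sites bind independently, so activation scales
         | with the product of the two occupancies. All names,
         | concentrations, and K_d values below are hypothetical.

```python
def occupancy(conc: float, kd: float) -> float:
    """Fraction of sites bound at equilibrium: [L] / ([L] + K_d)."""
    return conc / (conc + kd)


def and_gate_activation(marker_conc: float, target_conc: float,
                        kd_marker: float, kd_target: float) -> float:
    """Probability that both sites are occupied, assuming independent binding."""
    return occupancy(marker_conc, kd_marker) * occupancy(target_conc, kd_target)


# Hypothetical concentrations in nM; a K_d of 10 nM is assumed for both sites.
cell_with_marker = and_gate_activation(100.0, 100.0, 10.0, 10.0)
cell_without_marker = and_gate_activation(0.0, 100.0, 10.0, 10.0)
print(f"with marker: {cell_with_marker:.2f}, without marker: {cell_without_marker:.2f}")
```

         | In this toy model a cell lacking the marker protein gets zero
         | activation no matter how much of the actual target is present.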
        
         | deisteve wrote:
         | While they can generate proteins that bind to specific structures
         | with high accuracy, achieving true cell-specificity and
         | avoiding unwanted pleiotropic effects involves many more
         | variables beyond just protein-protein interactions. These tools
         | are more about expanding our ability to target previously
         | "undruggable" proteins rather than solving the cell-specificity
         | problem outright. However, they could be valuable components in
         | developing more targeted therapies when combined with
         | comprehensive research on pleiotropic effects across multiple
         | omics levels. The real breakthrough will come from integrating
         | these protein design capabilities with a deeper understanding
         | of complex biological systems and developing strategies for
         | precise delivery and regulation of these novel proteins in
         | vivo.
        
         | ak217 wrote:
         | This research is focused on modeling individual protein binding
         | sites. Pleiotropic effects and off-target side effects are
         | caused by interactions beyond the individual binding sites. So
         | I don't think this tool by itself will be able to design a
         | protein that acts in the way you describe (and that's putting
         | aside the delivery concerns - how do you get the protein to the
         | right compartment inside the cell?).
         | 
         | But novel binding domain design could be combined with other
         | tools to achieve this effect. You could imagine engineering a
         | lipid nanoparticle coated in antibodies specific to cell types
         | that express particular surface proteins. So you might use this
         | tool to design both the antibody binding domain on the vector
         | and also the protein encoded by the payload mRNA. Not all cell
         | types can be reached and addressed this way, but many can.
        
       | gman83 wrote:
       | What is Google actually doing with these systems? Are they using
       | it to develop new drugs themselves? Or licensing it to the
       | pharmaceutical industry?
        
         | skadamat wrote:
         | Cynical take - mostly to appear like a diverse tech company
         | with lots of different products and services so they don't get
         | regulated for their strength in the search advertising market.
        
           | aurareturn wrote:
           | I think Occam's razor might suggest they're trying to
           | diversify their revenue so if search declines, they have
           | fallbacks.
        
             | DiscourseFan wrote:
             | could be both
        
           | kridsdale3 wrote:
           | As a Google engineer, I think it's two things:
           | 
           | - Great for recruitment: You're the most talented $SKILL in
           | the world? Come to the team that is pushing humanity forward
           | in all the ways that matter.
           | 
           | - Larry and Sergey actually care about humanity, and being a
           | billionaire is kind of a side-effect of what they would have
           | done anyway.
        
             | bawolff wrote:
             | It's not like Google is the first company to have a will R&D
             | department. Xerox invented the mouse. AT&T invented Unix &
             | C. Etc.
        
         | VyseofArcadia wrote:
         | I know that you can actually use AlphaFold at least. My wife, a
         | microbiologist, told me she's used it a couple of times at
         | work. I don't know what their monetization model is, if her lab
         | had to get a license or anything. But I know scientists are
         | using it.
         | 
         | I play Go recreationally. I don't think I can use AlphaGo (or
         | its successors) directly, but the published research on AlphaGo
         | has inspired other strong Go AIs. Online Go platforms integrate
         | them to offer AI matches as well as analyze games between
         | humans. I also know that professionally ranked players are
         | adopting things learned from AI into their own play, and a lot
         | of traditional joseki (analogous to chess openings) are being
         | rethought based on insights from AI play.
        
           | eitally wrote:
           | It's licensed through Google Cloud as one hosted option, but
           | also open sourced.
        
         | dekhn wrote:
         | IIUC most of the commercialization is done through Isomorphic
         | (https://www.isomorphiclabs.com/). My guess is that Google
         | Research/DM itself wants to stay at the front of the field
         | rather than develop drugs (of which protein design is really
         | just a tiny contribution).
         | 
         | When I worked at Google I made a case for doing protein
         | design/preliminary drug discovery using Google infrastructure
         | and it was well received by the leadership. The leadership at
         | Google is mostly computer scientists who know about, but can't
         | actually do, leading-edge life sciences research, and they want
         | to contribute some amount of Google's resources to advancing
         | the state of the art. That's the only reason Exacycle was
         | permitted- because Urs thought we could maybe help save the
         | world with protein design (and it wasn't a good approach
         | because it wasted enormous amounts of power on unbiased
         | sampling of large proteins).
         | 
         | Honestly I don't think Google proper is really a good place for
         | this work to be applied, though. Their attention is easily
         | diverted, they repeatedly fail to commercialize, and most
         | importantly, potential partners are scared Google will steal
         | their data, and replace their business.
        
           | rty32 wrote:
           | I wonder why things seem to work well with Waymo? Google was
           | never in the auto industry, but they were able to create a
           | subsidiary that has become a leader in automated driving
           | systems.
        
             | dekhn wrote:
             | Yeah, Waymo is a bit of an outlier and I think it's got to
             | be a directive from Larry to spend some amount of
             | money/engineering effort to move it forward, with the
             | expectation that it will transform the world into a better
             | place (rather than generate a lot of revenue for Google).
        
         | TrainedMonkey wrote:
         | Alphabet has a medical division -
         | https://en.wikipedia.org/wiki/Verily . My somewhat cynical take
         | is that most extremely wealthy individuals would like to live
         | longer.
         | 
         | But more immediately this is an interesting and relevant
         | problem to solve, so it serves as a tool to benchmark and
         | improve AI... and the current theory is that at some point AI
         | will {generate unlimited amount of wealth | lead humanity into
         | the post-scarcity society | solve all human problems by
         | eliminating humans}.
        
       | idunnoman1222 wrote:
       | It generates novel candidates, it doesn't actually generate
       | proteins, and none of these proteins have actually been generated
       | to validate whether these candidates are shit or not
        
         | pertymcpert wrote:
         | Did you read it?
        
       | idunnoman1222 wrote:
       | This is the equivalent of ChatGPT generates novel code, but we didn't
       | run it. It probably works though.
        
         | vessenes wrote:
         | Terrible take. The article details independent lab verification
         | with researchers listed by name and a quote from them.
        
         | pertymcpert wrote:
         | Making a comment without reading the article? Who would do such
         | a thing?
        
       | photochemsyn wrote:
       | Interesting work, but there's a huge sector they're missing -
       | industrial enzyme and catalysis design. Most of this field is
       | concerned with small molecule binding - methane, carbon dioxide,
       | ammonia, methanol, acetic acid, etc. Binding is often just the
       | first step, as you're typically trying to do highly specific
       | chemistry, e.g. attaching a single oxygen to methane or a single
       | hydrogen to carbon dioxide, etc.
       | 
       | Working in this area might also be a good test of their
       | technological approach, as small-molecule binding can be somewhat
       | challenging, and even evolved biological systems can struggle to
       | achieve high specificity.
        
         | dekhn wrote:
         | I want to mention an interesting industrial enzyme project. If
         | you ever saw the laundry detergent commercial "Protein gets out
         | protein", this is referring to an industrial enzyme in laundry
         | detergent. Many years ago, Genentech had built up a significant
         | capability in proteases, which are proteins that cut other
         | proteins into pieces. In the course of optimizing proteases,
         | they made a thermostable, thermoactive protease. Although it
         | wasn't super useful for Genentech in a drug discovery context,
         | it was recognized that you could put an inactive enzyme into
         | laundry detergent that would be activated when the hot laundry
         | water hit the detergent, and the resulting protease would be
         | good at cleaning stains (many stains are composed of protein-
         | blood, food, etc).
         | 
         | Genentech set up a subsidiary with Corning (the glass company)
         | that owns the IP for this protease and then licensed it to
         | laundry detergent manufacturers; many billions of dollars in
         | revenue. I think this is one of the original patents:
         | https://patentimages.storage.googleapis.com/d9/ca/6f/2fb89ff...
        
         | ray__ wrote:
         | My guess is that this area is much harder to break into: enzymes
         | facilitate challenging chemical transformations by stabilizing
         | high-energy transition states in chemical reactions. These
         | states are usually highly transient and therefore much harder
         | to capture using the structural biology techniques that
         | generate the structural data that AlphaFold and similar methods
         | are trained on. Even though there are many structures of
         | enzymes in the absence of their substrate, I would imagine that
         | the small number of structures for states that represent actual
         | catalytic intermediates would make it difficult for a model to
         | internalize the features that distinguish a good
         | enzyme/catalyst from a bad one.
         | 
         | Another consideration is that most protein structure prediction
         | methods only generate the backbone, and the sidechains are
         | modeled in afterwards. Enzyme efficiency requires sub-Å level
         | structural precision in the sidechains that are actually doing
         | the chemistry involved in catalysis, so it could also be the
         | case that the current backbone-centric methods aren't good
         | enough to predict these fine-tuned interactions.
        
         | dataking wrote:
         | Interested observer here, not an expert: My understanding is
         | that they are using another model called FermiNet for chemistry
         | research https://deepmind.google/discover/blog/ferminet-
         | quantum-physi...
        
       | pokot0 wrote:
       | Safety is the new gatekeeping.
        
         | cultofmetatron wrote:
         | new?
        
       | VyseofArcadia wrote:
       | It's extremely refreshing that DeepMind is still working on using
       | AI to solve hard problems instead of attempting to put creatives out
       | of work.
        
         | gman83 wrote:
         | I wonder if the backlash they received from inventing
         | transformers and then allowing OpenAI to eat their lunch has
         | changed their attitude towards how they'll commercialize future
         | inventions.
        
       | Improvement wrote:
       | I am sorry for my naivety, but what are the practical benefits of
       | this?
        
         | space_fountain wrote:
         | It varies, but as the article says it can be used for things
         | like drug discovery. Imagine there's a new virus running
         | rampant. It works by using a very specific protein to latch
         | onto a cell so it can pull itself in. You would like to
         | develop a drug to stop it doing that and one way to do that is
         | to find a protein that wants to strongly latch on to an
         | important part of the virus. If it's holding onto the virus the
         | virus probably won't be able to penetrate cells because your
         | engineered protein will get in the way. This is part of how
         | antibodies work to stop viral infections naturally
        
       | letitgo12345 wrote:
       | One question is how specific the binding is -- what's the level
       | of off-target effects, etc.
        
       | flobosg wrote:
       | In my humble opinion, this work is not that innovative: _de novo_
       | protein binders have been done to death, either by AI approaches
       | or otherwise. Check out the work by David Baker's group, for
       | instance. They have a myriad of examples already.
       | 
       | That being said, as others have commented, my hopes are that all
       | these advancements lead _finally_ to reliable design methods for
       | novel biocatalysts, an area that has been stalling for decades,
       | compared to protein folds and binders.
        
       | boywitharupee wrote:
       | what kind of model architecture was used for this? is it safe to
       | assume they used a transformer model or a variant of it?
        
       | animanoir wrote:
       | yeah yeah whatever another protein discovered oh wow... When are
       | we going to see actual results? Hurry up Deepmind!
        
       | muaytimbo wrote:
       | This is interesting work but I think something has been
       | intentionally overlooked. Creating proteins is difficult and it's
       | also unclear how many of these sequences folded into the
       | predicted 3d structure. Small molecule synthesis is still easier,
       | cheaper, and more scalable than protein therapeutics. I think
       | this would've been more impactful had they focused on improving
       | on the SOTA small molecule - protein interaction models.
        
         | flobosg wrote:
         | > it's also unclear how many of these sequences folded into the
         | predicted 3d structure
         | 
         | The whitepaper depicts some successful cases, determined by
         | X-ray crystallography or cryo-EM.
        
       | parhamn wrote:
       | Question for bio folks here, and not to steal from the joy of
       | this article, but I've been recently curious how far we are from
       | engineering something like a virus that targets a subset of the
       | population (e.g. via specific genetic markers). This sort of tech
       | being commoditized feels much, much scarier than the LLM safety talk
       | - by a mile.
        
         | bongodongobob wrote:
         | Making proteins is nothing like designing life or viruses. It's
         | barely even related.
        
           | parhamn wrote:
           | Yeah I figured as much. Hence the "[t]his sort of tech" -- I
           | imagine progress would be made soon on those fronts as well?
           | Or am I mistaken?
        
             | bongodongobob wrote:
             | This sort of tech is like inventing a new type of
             | screwdriver and asking how it will affect car production.
        
       ___________________________________________________________________
       (page generated 2024-09-05 23:00 UTC)