[HN Gopher] More Protein Folding Progress - What's It Mean?
       ___________________________________________________________________
        
       More Protein Folding Progress - What's It Mean?
        
       Author : pvsukale3
       Score  : 74 points
       Date   : 2021-07-26 17:26 UTC (5 hours ago)
        
 (HTM) web link (blogs.sciencemag.org)
 (TXT) w3m dump (blogs.sciencemag.org)
        
       | joshtam wrote:
       | AF2 certainly moves us forward, especially for proteins where no
       | structure was previously available.
        
       | COGlory wrote:
       | I am a structural biologist studying Archaeal viruses and
       | CRISPR/Cas proteins. From my point of view, AlphaFold has
       | basically just gotten better at multiple sequence alignments.
       | It's not a bad thing, but it's unfortunately useless to me because
       | sequence divergence happens so quickly in the organisms I study
       | that even the best results are still basically made up. It's nice
       | that AlphaFold got better at generating sequence alignments, but
       | it's not a magic bullet (a la folding figured out from first
       | principles.)
       | 
       | Interestingly enough, if I get experimental data of an archaeal
       | virus protein, it almost always uses a conserved fold. There's
       | just no evidence at the amino acid level.
        
         | the__alchemist wrote:
         | I agree. AlphaFold's approach isn't what I was hoping for.
         | Something ab initio would be ground-breaking. Especially if you
         | could apply it to chemistry more broadly than protein folding.
         | AlphaFold's approach seems like a recipe for over-fitting.
        
           | Filligree wrote:
           | I would be honestly surprised if a true simulation is
           | possible that can also run on a classical (non-quantum)
           | computer, but I've been surprised before.
           | 
           | It would indeed be ground-breaking.
        
             | dnautics wrote:
             | One crazy idea I have is to run some very crude ab-initio
             | QM or DFT stuff starting with folded proteins, and
             | gradually running the temperature higher until it unfolds.
             | Then amass a dataset of protein structures + positional
             | delta vectors. Then time-reverse the dataset (flip the sign
             | on those vectors). Then train a 3d convolutional NN on the
             | reverse-melting curve to obtain heuristic rules for folding
             | in whatever universe the shitty physics engine represents.
             | 
             | Then it doesn't matter if the QM simulation is very crude
             | and deeply flawed, so long as it gets to the right answer
             | at the end.
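The dataset construction described in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `reversed_training_pairs` and the frame layout (a list of per-atom `(x, y, z)` tuples) are hypothetical, and the unfolding simulation itself is not shown.

```python
def reversed_training_pairs(trajectory):
    """Build time-reversed (structure, delta) training pairs.

    `trajectory` is assumed to be a list of frames ordered from folded to
    unfolded; each frame is a list of (x, y, z) atom positions.
    """
    pairs = []
    for earlier, later in zip(trajectory, trajectory[1:]):
        # Forward (unfolding) displacement of every atom between two frames.
        delta = [tuple(b - a for a, b in zip(p0, p1))
                 for p0, p1 in zip(earlier, later)]
        # Flip the sign so the vector points back toward the folded state.
        reversed_delta = [tuple(-c for c in d) for d in delta]
        # Pair the later (more unfolded) frame with the step that undoes it.
        pairs.append((later, reversed_delta))
    return pairs
```

Each pair holds a more-unfolded structure together with the displacement that would return it to the previous frame, which is the kind of target the proposed network would be trained to predict.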
        
               | dekhn wrote:
               | IIUC the folding and unfolding pathways of a protein are
               | not time reversed wrt each other.
               | 
               | But you would still enjoy reading this:
               | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC17732/ The
               | work in this paper led to Folding@home because Vijay
               | couldn't get enough computer time to run his simulations.
               | 
               | If you really had a lot of computer time to waste, you
               | could imagine doing simulations where you titrated in or
               | out some guanidinium chloride and inspected how disrupting
               | h-bonds (versus hydrophobic collapse) contributes.
               | Chaotropes are better than temperature for probing
               | unfolding.
        
             | dekhn wrote:
             | Check out DESMOND and ANTON, that's basically what DE Shaw
             | Research is doing. It does seem like, to get static
             | structure predictions, it's going to be hard for anybody to
             | find anything that does marginally better than AF2 at this
             | point, but since static structure predictions are just
             | mostly useful for brainstorming, I think ANTON may end up
             | being more useful, in terms of applied science.
        
             | eutectic wrote:
             | I wonder if a transformer with 1 recurrent layer (or a
             | transformer Deep Equilibrium Model) could work well.
             | Transformers are almost like a physics simulation in that
             | they sum vector-valued interactions which depend on
             | distance in some space, and then add the result to the
             | state of each particle / element.
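The analogy in the comment above can be made concrete with a minimal single-head attention update in plain Python. This is a sketch, not any particular model's implementation: the function name `attention_step` is hypothetical, the query/key/value vectors are assumed to be precomputed, and "distance" here is the usual scaled dot-product similarity rather than a true metric.

```python
import math


def attention_step(states, queries, keys, values):
    """One residual self-attention update written as an interaction sum.

    All arguments are lists of equal length holding one vector per element;
    value vectors must have the same dimension as state vectors.
    """
    n = len(states)
    d = len(keys[0])
    new_states = []
    for i in range(n):
        # Similarity between element i and every element j.
        scores = [sum(q * k for q, k in zip(queries[i], keys[j])) / math.sqrt(d)
                  for j in range(n)]
        # Softmax turns similarities into interaction weights summing to 1.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Sum the weighted vector-valued contributions from all elements...
        update = [sum(weights[j] * values[j][c] for j in range(n))
                  for c in range(len(values[0]))]
        # ...and add the result to the element's current state.
        new_states.append([s + u for s, u in zip(states[i], update)])
    return new_states
```

Applying the same function repeatedly with shared weights would correspond to the single recurrent layer the comment suggests.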
        
               | dekhn wrote:
               | Reasonably speaking, I would expect that within 2 years
               | 10 groups will be as proficient as AF2 at predicting
               | static structures. I don't think anybody who is trying to
               | emulate physics simulations will be in that group, just
               | folks who have learned enough tricks to quickly
               | incorporate all the evidence during training and choose
               | how to apply it during prediction.
               | 
               | I expect that a "multiple feature embedding heads on
               | top of 2 fully connected layers" architecture (as used in
               | modern ads training) will end up being the simplest one
               | capable of folding proteins well.
        
           | isoprophlex wrote:
           | One can dream about being able to calculate energies with the
           | accuracy of DFT calculations... and do dynamic simulations on
           | the time scale of ball-and-stick molecular modeling sims.
           | 
           | Would be amazing for homogeneous catalysis design.
        
         | ramraj07 wrote:
         | Can you explain further what your second paragraph means?
        
           | G3rn0ti wrote:
           | I think the parent meant that amino acid conservation is weak,
           | yet the overall structure remains the same. So sequence
           | similarity is not everything.
           | 
           | The reason might be that the overall protein fold is also
           | guided by something other than detailed side chain contacts.
           | 
           | BTW: Hydrogen bonding and salt contacts do not drive protein
           | folding, at least not thermodynamically, because it does not
           | matter whether polar/charged residues interact with others or
           | with water. Rather, proteins fold for the same reason that oil
           | and water do not mix: hydrophobic amino acids avoid water.
           | This is an entropy-driven process where electrostatic
           | interactions do not matter. See also the "molten globule"
           | model. Basically it means a precursor of the protein folds
           | early on due to a collapse of the hydrophobic core. Tertiary
           | structure is then refined through residue/residue
           | interactions. In the end, it's the distribution of
           | hydrophobic amino acids in the sequence that's most important
           | for the conservation of a structure. Surface residues can
           | vary quite a lot.
        
             | dekhn wrote:
             | Your "BTW" is still a huge area of argument in protein
             | folding; the claim you are making is just one perspective
             | and is not well-established.
        
         | strbean wrote:
         | I never considered that there would be viruses that infect
         | Archaea. That sounds incredibly cool!
         | 
         | Any fun tidbits about them you'd like to share?
        
           | kleton wrote:
           | There's a theory that the eukaryotic nucleus originated as an
           | archaeal virus
           | https://en.wikipedia.org/wiki/Viral_eukaryogenesis
        
             | strbean wrote:
             | That is super cool!
        
         | jostmey wrote:
         | How do you know the results are _basically made up_? Have you
         | compared AlphaFold's predictions to your experimental data?
         | 
         | I think you are right that most of the predictive power derives
         | from super-enhanced multiple sequence alignments, but I think
         | you underestimate AlphaFold's ability to generalize to novel
         | cases
        
         | mrfusion wrote:
         | Could people like you help to improve AlphaFold? Did they
         | already train it on the proteins you work with?
        
       ___________________________________________________________________
       (page generated 2021-07-26 23:01 UTC)