[HN Gopher] The Illustrated AlphaFold
___________________________________________________________________
The Illustrated AlphaFold
Author : dil8
Score : 165 points
Date : 2024-07-13 15:00 UTC (7 hours ago)
(HTM) web link (elanapearl.github.io)
(TXT) w3m dump (elanapearl.github.io)
| inciampati wrote:
| It's so, so complex! I confess I had a sense of this but had no
| idea. We don't even hear which MSA algorithm is used to align the
| protein sequences.
| flobosg wrote:
| Input MSAs are generated with jackhmmer and HHblits and further
| processed, if I recall the AlphaFold paper correctly.
| elanapearl wrote:
| Hi, I was one of the authors of this! I think we briefly
| mentioned this in a footnote somewhere (a lot of things got cut
| or moved to footnotes since it is already so long and we wanted
| to focus on the ML parts that aren't described elsewhere).
|
| But yes, as @flobosg mentioned, for protein chains they use
| jackhmmer to search 4 of the databases (the exception is
| Uniclust30 + BFD, which is searched with HHblits instead), and
| for RNA chains they use nhmmer to search, then hmmalign to
| re-align the hits to the query chain.
|
| Hope that helps!
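
A minimal sketch of the tool routing described in the comment above, for readers who prefer it as code. The function name, the chain-type strings, and the lowercase tool labels are hypothetical and chosen only for illustration; the sketch encodes nothing beyond what the comment states and does not run any of the tools.

```python
# Hypothetical helper: encodes only the routing described in the comment above.
# Database names other than Uniclust30 and BFD are not given in the thread,
# so any other database string simply falls through to jackhmmer.
HHBLITS_DATABASES = {"Uniclust30", "BFD"}


def search_tool(chain_type: str, database: str) -> str:
    """Return the search tool the comment attributes to this chain type and database."""
    if chain_type == "protein":
        # HHblits for the Uniclust30 + BFD search, jackhmmer for the rest.
        return "hhblits" if database in HHBLITS_DATABASES else "jackhmmer"
    if chain_type == "rna":
        # nhmmer finds the hits; hmmalign then re-aligns them to the query chain.
        return "nhmmer"
    raise ValueError(f"no MSA search described for chain type {chain_type!r}")
```

For example, `search_tool("protein", "BFD")` returns `"hhblits"`, while any RNA chain is routed to `"nhmmer"`.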
| joelS wrote:
| This is an amazing writeup, thank you. Looking forward to going
| through it in more detail.
| tomohelix wrote:
| I consider this a glimpse into how neural networks and "AI"-like
| tech will be implemented in the future: lots of engineering and
| lots of clever manipulations of known techniques, woven together
| with a powerful, well-trained model at the center.
|
| Right now I think stuff like ChatGPT is only at the first step:
| making a foundational model that can generalize and process
| data. There isn't a lot of work going into processing the inputs
| into something the model can best understand (not at the
| tokenizer level, but even before that). We have a basic field
| for this, i.e. prompt engineering, but nothing as sophisticated
| as AlphaFold exists for natural language or images yet.
|
| People are stacking LLMs together and adding system prompts to
| assist with this input processing. Maybe once we have some more
| complex systems in place, we will see something resembling a
| real AGI.
| great_tankard wrote:
| This is an awesome writeup that really helped me understand
| what's going on under the hood. I didn't know, for example, that
| for the limited number of PTMs AF3 can handle, it has to treat
| every single atom, including those of the main and side chains,
| as an individual token (presumably because PTMs are very
| underrepresented in the PDB?).
|
| Thank you for translating the paper into something this
| structural biologist can grasp.
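
A toy sketch of the tokenization point made in the comment above: an unmodified residue becomes one token, while a residue carrying a PTM becomes one token per atom. The `Residue` class, the `tokenize` function, and the `(residue_index, atom_name)` token layout are all hypothetical, invented here for illustration, and are not taken from AF3 itself.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Residue:
    """Hypothetical container for one residue; not an AF3 data structure."""
    name: str
    atom_names: List[str]
    is_modified: bool = False  # True if the residue carries a PTM

# A token is (residue index, atom name); atom name is None for whole-residue tokens.
Token = Tuple[int, Optional[str]]


def tokenize(residues: List[Residue]) -> List[Token]:
    tokens: List[Token] = []
    for index, residue in enumerate(residues):
        if residue.is_modified:
            # Modified residue: every atom, main chain and side chain, is its own token.
            tokens.extend((index, atom) for atom in residue.atom_names)
        else:
            # Standard residue: a single token for the whole residue.
            tokens.append((index, None))
    return tokens
```

Under these assumptions, a chain of one standard residue followed by one modified residue with ten atoms yields eleven tokens, which shows why modified residues are comparatively expensive in token count.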
___________________________________________________________________
(page generated 2024-07-13 23:00 UTC)