[HN Gopher] The Illustrated AlphaFold
___________________________________________________________________
The Illustrated AlphaFold
Author : dil8
Score : 282 points
Date : 2024-07-13 15:00 UTC (1 days ago)
(HTM) web link (elanapearl.github.io)
(TXT) w3m dump (elanapearl.github.io)
| inciampati wrote:
| It's so, so complex! I confess I had a sense of this but had no
| idea. We don't even hear which MSA algorithm is used to align the
| protein sequences.
| flobosg wrote:
| Input MSAs are generated with jackhmmer and HHblits and further
| processed, if I recall AlphaFold's paper correctly.
| elanapearl wrote:
| Hi, I was one of the authors of this! I think we briefly
| mentioned this in a footnote somewhere (a lot of things got cut
| or moved to footnotes since it is already so long and we wanted
| to focus on the ML parts that aren't described elsewhere).
|
| But yes, as @flobosg mentioned, for protein chains they use
| jackhmmer to search 4 of the databases (except for Uniclust30 +
| BFD, where HHblits is used instead), and for RNA chains they
| use nhmmer to search and then hmmalign to re-align these to
| the query chain.
|
| Hope that helps!
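
The routing described in the comment above can be sketched as a small lookup. This is only an illustration of the rule as stated in the thread, not AlphaFold's actual pipeline code: the function name `pick_search_tools`, the chain-type strings, and the placeholder database name in the usage line are all assumptions made for the example.

```python
def pick_search_tools(chain_type: str, database: str) -> list[str]:
    """Return the tool names, in order, for one chain type and database.

    Hypothetical helper: it encodes only the rule given in the comment,
    not any real AlphaFold API.
    """
    kind = chain_type.lower()
    if kind == "protein":
        # Per the comment: HHblits for Uniclust30 + BFD, jackhmmer otherwise.
        if database.lower() in {"uniclust30", "bfd"}:
            return ["hhblits"]
        return ["jackhmmer"]
    if kind == "rna":
        # Per the comment: search with nhmmer, then re-align with hmmalign.
        return ["nhmmer", "hmmalign"]
    raise ValueError(f"unsupported chain type: {chain_type!r}")


# "some_other_db" is a placeholder, not a database named in the thread.
for chain, db in [("protein", "BFD"), ("protein", "some_other_db"), ("rna", "some_other_db")]:
    print(chain, db, pick_search_tools(chain, db))
```

The function returns tool names only; how each tool is invoked (flags, database paths) is not covered in the thread and is left out here.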
| joelS wrote:
| This is an amazing writeup, thank you. Looking forward to going
| through it in more detail.
| tomohelix wrote:
| I consider this a glimpse into how neural networks and "AI"-like
| technologies will be implemented in the future. Lots of
| engineering, lots of clever manipulations of known techniques
| woven together with a powerful, well-trained model at the center.
|
| Right now I think stuff like ChatGPT is only at the first step of
| making that foundational model that can generalize and process
| data. There isn't a lot of work going into processing the inputs
| into something the model can best understand (not at the
| tokenizer level, even before that). We have a basic field for
| this, i.e. prompt engineering, but nothing as sophisticated as
| AlphaFold exists for natural language or images yet.
|
| People are stacking LLMs together and putting system prompts in
| to assist this input processing. Maybe when we have some more
| complex systems in place, we can see something resembling a real
| AGI.
| astroalex wrote:
| Some[1] think that things are trending in the opposite
| direction: away from clever manipulations and hard-coded domain
| knowledge, and towards large-scale general models.
|
| [1]: http://www.incompleteideas.net/IncIdeas/BitterLesson.html
| PoignardAzur wrote:
| Yeah, I was surprised to see the architecture diagram is so
| complex. It's been a while since I saw a design that wasn't
| just "stack more transformer layers".
| sangnoir wrote:
| This made me think of the differences between FPGAs and
| microprocessors - with "more layers" being equivalent to
| "more gates"
| great_tankard wrote:
| This is an awesome writeup that really helped me understand
| what's going on under the hood. I didn't know, for example, that
| for the limited number of PTMs AF3 can handle, it has to treat
| every single atom, including those of the main and side chain, as
| an individual token (presumably because PTMs are very
| underrepresented in the PDB?)
|
| Thank you for translating the paper into something this
| structural biologist can grasp.
| mk_stjames wrote:
| I have no prior knowledge of protein folding but nevertheless I
| enjoyed (attempting to) read through this. It's interesting to
| see the complexity in techniques used in comparison to a lot of
| other ML projects today.
___________________________________________________________________
(page generated 2024-07-14 23:01 UTC)