[HN Gopher] PyTorch Internals: Ezyang's Blog
___________________________________________________________________
PyTorch Internals: Ezyang's Blog
Author : Anon84
Score : 400 points
Date : 2025-03-22 14:39 UTC (1 day ago)
(HTM) web link (blog.ezyang.com)
(TXT) w3m dump (blog.ezyang.com)
| nitrogen99 wrote:
| 2019. How much of this is still relevant?
| kadushka wrote:
| I'm guessing about 80%
| sidkshatriya wrote:
| To understand a complex system, it sometimes helps to
| understand a (simpler) model system. Sometimes an older
| version of the same system is that good model system. This
| is not always true, but it's a good rule of thumb.
| mlazos wrote:
| I used this to onboard to the PyTorch team a few years ago.
| It's useful for understanding the key concepts of the
| framework. Torch.compile isn't covered but the rest of it is
| still pretty relevant.
| chuckledog wrote:
| Great article, thanks for posting. Here's a nice summary of
| automatic differentiation, mentioned in the article and core to
| how NNs are implemented: https://medium.com/@rhome/automatic-
| differentiation-26d5a993...
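|
| A minimal scalar reverse-mode autodiff sketch of that idea
| (illustrative only; hypothetical helper names, not PyTorch's
| actual autograd, which is a C++ graph of Function nodes over
| tensors):
|
|     #include <cstdio>
|     #include <functional>
|     #include <memory>
|
|     // One node per scalar in the expression graph.
|     struct Node {
|       double value = 0.0;
|       double grad = 0.0;
|       // Propagates an upstream gradient g to this node's inputs.
|       std::function<void(double)> backward = [](double) {};
|     };
|     using Var = std::shared_ptr<Node>;
|
|     Var leaf(double v) {
|       auto n = std::make_shared<Node>();
|       n->value = v;
|       return n;
|     }
|
|     Var add(Var a, Var b) {
|       auto out = leaf(a->value + b->value);
|       out->backward = [a, b](double g) {
|         // d(a+b)/da = d(a+b)/db = 1
|         a->grad += g; a->backward(g);
|         b->grad += g; b->backward(g);
|       };
|       return out;
|     }
|
|     Var mul(Var a, Var b) {
|       auto out = leaf(a->value * b->value);
|       out->backward = [a, b](double g) {
|         // d(ab)/da = b, d(ab)/db = a
|         a->grad += b->value * g; a->backward(b->value * g);
|         b->grad += a->value * g; b->backward(a->value * g);
|       };
|       return out;
|     }
|
|     int main() {
|       auto x = leaf(2.0), w = leaf(3.0), b = leaf(1.0);
|       auto y = add(mul(x, w), b);  // y = x*w + b
|       y->backward(1.0);            // seed dy/dy = 1
|       std::printf("dy/dx=%g dy/dw=%g dy/db=%g\n",
|                   x->grad, w->grad, b->grad);
|       // prints: dy/dx=3 dy/dw=2 dy/db=1
|     }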
| vimgrinder wrote:
| In case it helps someone: if you are having trouble reading
| long articles, try text-to-audio with line highlighting. It
| helps a lot. It has cured my lack of attention.
| PeterStuer wrote:
| No trouble reading the article. Those slides though. Make my
| eyes hurt :(
| vimgrinder wrote:
| they were constantly referred to in the text :/ impossible
| to skip
| brutus1979 wrote:
| Is there a video version of this? It seems to be from a talk.
| hargun2010 wrote:
| I guess it's a longer version of the slides, but it's not new;
| I saw comments on it from as far back as 2023. Nonetheless,
| good content (worth resharing).
|
| https://web.mit.edu/~ezyang/Public/pytorch-internals.pdf
| smokel wrote:
| Also interesting in this context is the PyTorch Developer Podcast
| [1] by the same author. Very comforting to learn about PyTorch
| internals while doing the dishes.
|
| [1] https://pytorch-dev-podcast.simplecast.com/
| swyx wrote:
| I think the problem with the podcast format (ironic for me
| to say) is that it assumes a much higher familiarity with
| the APIs than any visual medium, including blogs, requires.
| smokel wrote:
| Agreed, but I'm still very happy that some people try. I'm
| really not that interested in the weather or in idle
| chit-chat, and for some reason most podcasts seem to focus
| on that.
| bilal2vec wrote:
| See also dev forum roadmaps [1] and design docs (e.g. [2],
| [3],[4])
|
| [1]: https://dev-discuss.pytorch.org/t/meta-pytorch-
| team-2025-h1-...
|
| [2]: https://dev-discuss.pytorch.org/t/pytorch-symmetricmemory-
| ha...
|
| [3]: https://dev-discuss.pytorch.org/t/where-do-
| the-2000-pytorch-...
|
| [4]: https://dev-discuss.pytorch.org/t/rethinking-pytorch-
| fully-s...
| alexrigler wrote:
| This is a fun blast from the near past. I helped organize the
| PyTorch NYC meetup where Ed presented this and still think it's
| one of the best technical presentations I've seen. Hand drawn
| slides for the W. Wish I had recorded it :\
| aduffy wrote:
| Edward taught a Programming Languages class I took nearly a
| decade ago, and clicking through here I immediately recognized
| the illustrated slides; it brought a smile to my face.
| lyeager wrote:
| Me too, he was great. Tried his darndest to help me understand
| Haskell monads.
| zcbenz wrote:
| For learning internals of ML frameworks I recommend reading the
| source code of MLX: https://github.com/ml-explore/mlx .
|
| It is a modern and clean codebase without legacy baggage, and
| I could understand most things without needing external
| articles.
| ForceBru wrote:
| Why is MLX Apple silicon only? Is there something fundamental
| that prevents it from working on x86? Are some core features
| only possible on Apple silicon? Or do the devs specifically
| refuse to port to x86? (Which is understandable, I guess)
|
| I'm asking because it seems to have nice autodiff
| functionality. It even supports differentiating array mutation
| (https://ml-
| explore.github.io/mlx/build/html/usage/indexing.h...), which is
| something JAX and Zygote.jl can't do. Instead, both have ugly
| tricks like `array.at[index].set` and the `Buffer` struct.
|
| So it would be cool to have this functionality on a "regular"
| CPU.
| zcbenz wrote:
| Most features are already supported on x86 CPUs: you can pip
| install mlx on Linux, and you can even use it on Windows (no
| official binary release yet, but it builds and the tests
| pass).
| saagarjha wrote:
| I think it relies heavily on unified memory.
| quotemstr wrote:
| Huh. I'd have written TORCH_CHECK like this:
|
|     TORCH_CHECK(self.dim() == 1)
|         << "Expected dim to be a 1-D tensor "
|         << "but was " << self.dim() << "-D tensor";
|
| Turns out it's possible to write TORCH_CHECK() so that it
| evaluates the streaming operators only if the check fails. (Check
| out how glog works.)
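|
| A minimal sketch of that glog-style trick (hypothetical names
| MY_CHECK, CheckFailure, Voidify; not PyTorch's actual
| TORCH_CHECK): the ternary short-circuits, so the streamed
| operands are evaluated only when the condition is false.
|
|     #include <cstdlib>
|     #include <iostream>
|     #include <sstream>
|
|     // Collects the message; its destructor runs only if the
|     // check failed, printing the message and aborting.
|     class CheckFailure {
|      public:
|       CheckFailure(const char* file, int line, const char* expr) {
|         stream_ << file << ":" << line
|                 << " Check failed: " << expr << " ";
|       }
|       ~CheckFailure() {
|         std::cerr << stream_.str() << std::endl;
|         std::abort();
|       }
|       std::ostream& stream() { return stream_; }
|      private:
|       std::ostringstream stream_;
|     };
|
|     // operator& binds looser than <<, so it swallows the whole
|     // << chain and makes the expression's type void, matching
|     // the other branch of the ternary.
|     struct Voidify {
|       void operator&(std::ostream&) {}
|     };
|
|     #define MY_CHECK(cond) \
|       (cond) ? (void)0     \
|              : Voidify() & \
|                CheckFailure(__FILE__, __LINE__, #cond).stream()
|
|     int dim() { return 3; }  // stand-in for self.dim()
|
|     int main() {
|       // The << operands run only because the check fails here.
|       MY_CHECK(dim() == 1)
|           << "Expected a 1-D tensor but was " << dim() << "-D";
|     }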
| pizza wrote:
| Btw, would anyone have any good resources on using PyTorch as
| a general-purpose graph library? I.e., stuff beyond the
| assumption that nets are forward-only (acyclic) digraphs.
___________________________________________________________________
(page generated 2025-03-23 23:00 UTC)