[HN Gopher] PyTorch Internals: Ezyang's Blog
       ___________________________________________________________________
        
       PyTorch Internals: Ezyang's Blog
        
       Author : Anon84
       Score  : 400 points
       Date   : 2025-03-22 14:39 UTC (1 day ago)
        
 (HTM) web link (blog.ezyang.com)
 (TXT) w3m dump (blog.ezyang.com)
        
       | nitrogen99 wrote:
       | 2019. How much of this is still relevant?
        
         | kadushka wrote:
         | I'm guessing about 80%
        
           | sidkshatriya wrote:
           | To understand a complex system, sometimes it is better to
           | understand a (simpler) model system. Sometimes an older
           | version of the same system is a good model system. This is
           | not always true, but it is a good rule of thumb.
        
         | mlazos wrote:
         | I used this to onboard to the PyTorch team a few years ago.
         | It's useful for understanding the key concepts of the
         | framework. Torch.compile isn't covered but the rest of it is
         | still pretty relevant.
        
       | chuckledog wrote:
       | Great article, thanks for posting. Here's a nice summary of
       | automatic differentiation, mentioned in the article and core to
       | how NNs are implemented: https://medium.com/@rhome/automatic-
       | differentiation-26d5a993...
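
To make the idea of automatic differentiation concrete, here is a minimal sketch in C++. It uses forward-mode differentiation with dual numbers, which is simpler than the reverse-mode autograd PyTorch uses, so treat it as an illustration of the principle rather than of PyTorch's implementation. The `sketch` namespace, `Dual`, `f`, and `derivative_of_f` are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cmath>

namespace sketch {

// A dual number carries a value together with its derivative with respect to
// one chosen input. All names here are hypothetical, for illustration only.
struct Dual {
  double value;
  double deriv;
};

// Sum rule: (a + b)' = a' + b'
inline Dual operator+(Dual a, Dual b) {
  return {a.value + b.value, a.deriv + b.deriv};
}

// Product rule: (a * b)' = a' * b + a * b'
inline Dual operator*(Dual a, Dual b) {
  return {a.value * b.value, a.deriv * b.value + a.value * b.deriv};
}

// Chain rule for sine: sin(a)' = cos(a) * a'
inline Dual sin(Dual a) {
  return {std::sin(a.value), std::cos(a.value) * a.deriv};
}

// Example function f(x) = x * x + sin(x), written in terms of Dual.
inline Dual f(Dual x) { return x * x + sin(x); }

// Seed the input derivative with 1 to obtain df/dx at x.
inline double derivative_of_f(double x) { return f(Dual{x, 1.0}).deriv; }

}  // namespace sketch
```

Calling `sketch::derivative_of_f(x)` evaluates `2 * x + cos(x)` without anyone writing that formula by hand: each overloaded operator applies its own differentiation rule as the value is computed.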
        
       | vimgrinder wrote:
       | This might help someone: if you have trouble reading long
       | articles, try text-to-audio with line highlighting. It helps a
       | lot. It has cured my lack of attention.
        
         | PeterStuer wrote:
         | No trouble reading the article. Those slides though. Make my
         | eyes hurt :(
        
           | vimgrinder wrote:
           | they were constantly referred to in the text :/ impossible
           | to skip
        
       | brutus1979 wrote:
       | Is there a video version of this? It seems it is from a talk?
        
       | hargun2010 wrote:
       | I guess it's a longer version of the slides, but it's not new:
       | I saw comments from as far back as 2023. Nonetheless, good
       | content (reshareable).
       | 
       | https://web.mit.edu/~ezyang/Public/pytorch-internals.pdf
        
       | smokel wrote:
       | Also interesting in this context is the PyTorch Developer Podcast
       | [1] by the same author. Very comforting to learn about PyTorch
       | internals while doing the dishes.
       | 
       | [1] https://pytorch-dev-podcast.simplecast.com/
        
         | swyx wrote:
         | i think the problem w the podcast format (ironic for me to say)
         | is that it assumes a lot higher familiarity with the apis than
         | is afforded by any visual medium including blogs
        
           | smokel wrote:
           | Agreed, but I'm still very happy that some people try. I'm
           | really not that much interested in the weather or listening
           | to idle chit-chat, and for some reason most podcasts seem to
           | focus on that.
        
       | bilal2vec wrote:
       | See also dev forum roadmaps [1] and design docs (e.g. [2],
       | [3], [4])
       | 
       | [1]: https://dev-discuss.pytorch.org/t/meta-pytorch-
       | team-2025-h1-...
       | 
       | [2]: https://dev-discuss.pytorch.org/t/pytorch-symmetricmemory-
       | ha...
       | 
       | [3]: https://dev-discuss.pytorch.org/t/where-do-
       | the-2000-pytorch-...
       | 
       | [4]: https://dev-discuss.pytorch.org/t/rethinking-pytorch-
       | fully-s...
        
       | alexrigler wrote:
       | This is a fun blast from the near past. I helped organize the
       | PyTorch NYC meetup where Ed presented this and still think it's
       | one of the best technical presentations I've seen. Hand drawn
       | slides for the W. Wish I had recorded it :\
        
       | aduffy wrote:
       | Edward taught a Programming Languages class I took nearly a
       | decade ago, and clicking through here I immediately recognized
       | the illustrated slides. Brought a smile to my face.
        
         | lyeager wrote:
         | Me too, he was great. Tried his darndest to help me understand
         | Haskell monads.
        
       | zcbenz wrote:
       | For learning internals of ML frameworks I recommend reading the
       | source code of MLX: https://github.com/ml-explore/mlx .
       | 
       | It is a modern, clean codebase without legacy baggage, and I
       | could understand most things without seeking external articles.
        
         | ForceBru wrote:
         | Why is MLX Apple silicon only? Is there something fundamental
         | that prevents it from working on x86? Are some core features
         | only possible on Apple silicon? Or do the devs specifically
         | refuse to port to x86? (Which is understandable, I guess)
         | 
         | I'm asking because it seems to have nice autodiff
         | functionality. It even supports differentiating array mutation
         | (https://ml-
         | explore.github.io/mlx/build/html/usage/indexing.h...), which is
         | something JAX and Zygote.jl can't do. Instead, both have ugly
         | tricks like `array.at[index].set` and the `Buffer` struct.
         | 
         | So it would be cool to have this functionality on a "regular"
         | CPU.
        
           | zcbenz wrote:
           | Most features are already supported on x86 CPUs: you can pip
           | install mlx on Linux, and you can even use it on Windows (no
           | official binary release yet, but it builds and the tests
           | pass).
        
           | saagarjha wrote:
           | I think it relies heavily on unified memory.
        
       | quotemstr wrote:
       | Huh. I'd have written TORCH_CHECK like this:
       | 
       |     TORCH_CHECK(self.dim() == 1)
       |         << "Expected dim to be a 1-D tensor "
       |         << "but was " << self.dim() << "-D tensor";
       | 
       | Turns out it's possible to write TORCH_CHECK() so that it
       | evaluates the streaming operators only if the check fails. (Check
       | out how glog works.)
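
The lazy-streaming trick mentioned above can be sketched as follows. This is a minimal illustration of the glog-style pattern, not PyTorch's actual TORCH_CHECK and not glog's real classes: `LAZY_CHECK`, `CheckFailure`, `Voidify`, `describe`, and `require_1d` are hypothetical names, and the sketch throws `std::runtime_error` where glog's CHECK would abort the process. The key is that the streamed message sits in the false branch of a conditional expression, so it is only evaluated when the check fails.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

namespace sketch {

// Hypothetical helper: collects the streamed message and throws once the
// full expression has been evaluated (when the temporary is destroyed).
class CheckFailure {
 public:
  CheckFailure(const char* expr, const char* file, int line) {
    stream_ << file << ":" << line << ": check failed: " << expr << ". ";
  }
  // A throwing destructor must opt out of the implicit noexcept.
  ~CheckFailure() noexcept(false) { throw std::runtime_error(stream_.str()); }
  std::ostream& stream() { return stream_; }

 private:
  std::ostringstream stream_;
};

// operator& binds looser than << but tighter than ?:, so it swallows the
// whole streamed expression and gives both branches of ?: the type void.
struct Voidify {
  void operator&(std::ostream&) const {}
};

// If cond holds, nothing to the right of ':' is evaluated at all.
#define LAZY_CHECK(cond)  \
  (cond) ? (void)0        \
         : ::sketch::Voidify() & \
               ::sketch::CheckFailure(#cond, __FILE__, __LINE__).stream()

// Counts how often the (potentially expensive) message argument is computed.
inline int& describe_calls() {
  static int n = 0;
  return n;
}

inline std::string describe(int dim) {
  ++describe_calls();
  return std::to_string(dim) + "-D tensor";
}

// Hypothetical stand-in for an operator that requires a 1-D input.
inline void require_1d(int dim) {
  LAZY_CHECK(dim == 1) << "Expected a 1-D tensor but was " << describe(dim);
}

}  // namespace sketch
```

With this shape, `sketch::require_1d(1)` never calls `describe`, while `sketch::require_1d(3)` builds the message and throws. A real implementation would also need to handle the case where building the message itself throws, which this sketch does not attempt.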
        
       | pizza wrote:
       | Btw, would anyone have any good resources on using PyTorch as a
       | general-purpose graph library? Like stuff beyond the assumption
       | of nets = forward-only (acyclic) digraph
        
       ___________________________________________________________________
       (page generated 2025-03-23 23:00 UTC)