[HN Gopher] PyTorch is dead. Long live Jax
___________________________________________________________________
PyTorch is dead. Long live Jax
Author : lairv
Score : 73 points
Date : 2024-08-16 20:24 UTC (2 hours ago)
(HTM) web link (neel04.github.io)
(TXT) w3m dump (neel04.github.io)
| casualscience wrote:
| > Multi-backend is doomed
|
| Finally! Someone says it! This is why the C programming language
| will never have wide adoption. /s
| munchler wrote:
| This is such a hyperbolic headline that it's hard to be
| interested in reading the actual article.
| crancher wrote:
| All blanket statements are false.
| semiinfinitely wrote:
| PyTorch is the JavaScript of ML. Sadly, "worse is better"
| software has better survival characteristics, even when there is
| consensus that technology X is theoretically better.
| bob020202 wrote:
| Nothing is even theoretically better than JavaScript for its
| intended use cases, web frontends and backends. Mainly because
| it went all-in on event-loop parallelism early on, which isn't
| just about usability but also performance. And it didn't go
| all-in on OOP, unlike Java, and it has easy imports/packages,
| unlike Python. It has some quirks like the "0 trinity," but that
| doesn't really matter. No matter how good you are with something
| else, it still takes more dev time ($$$) than JS.
|
| Now, it's been forever since I used PyTorch or TF, but I only
| remember TF 1.x being more like "why TF isn't this working." At
| some point I stopped blaming myself and blamed the tooling, a
| complaint TF2 itself later conceded. It seemed like no matter
| how skilled I got with TF1, it'd always take much longer than
| developing with PyTorch, so I switched early.
| josephg wrote:
| You don't think it's possible to (even theoretically) improve
| on JavaScript for its intended use case? What a terrific lack
| of imagination.
|
| TypeScript and Elm would like a word.
| bob020202 wrote:
| No, I said there's nothing that exists right now that's
| theoretically better. TypeScript isn't. It'd be great if TS's
| type inference were smart enough that it took basically no
| additional dev input vs. JS, but until then, it's expensive to
| use. It's also bolted on awkwardly, but that's changing soon. I
| could also imagine JS getting some nice Python features like
| list comprehensions.
|
| Also, generally when people complain that JS won the web, it's
| not because they prefer TS; it's because they wanted to use
| something else and can't.
|
| I've never used Elm, but... no variables, kind of like Erlang,
| which I have used. That has its appeal, but you're not going to
| find a consensus that it's better for the web.
| dwattttt wrote:
| I see this a lot, and I want to find the right words for
| it: I don't want the types automatically determined for
| what I write, because I write mistakes.
|
| I want to write the type, and for that to reveal the
| mistake.
| bob020202 wrote:
| If you explicitly mark types at lower levels, or just use
| typed libs, it's not very easy to pass the wrong type
| somewhere and have it still accidentally work. The most
| automatic type inference I can think of today is in Rust,
| which is all about safety, or maybe C++ templates count
| too.
| sva_ wrote:
| I don't think the comparison is fair. IMO, PyTorch has the
| cleanest abstractions, which is the reason it is so popular.
| People can do quick prototyping without having to spend too
| much time figuring out the engineering details of making it
| run on their hardware.
| ein0p wrote:
| Jax is dead, long live PyTorch. PyTorch has _twenty times_ as
| many users as Jax. Any rumors of its death are highly
| exaggerated.
| melling wrote:
| They used to say the same thing about Perl and Python.
|
| Downvoted. Hmmm. I'm a little tired, so I don't want to go into
| detail. However, I was a Perl programmer when Python was
| rising. So, needless to say, having a big lead doesn't matter.
|
| Please learn from history. A big lead means nothing.
| ein0p wrote:
| It's been years, and Jax is just where it was: no growth
| whatsoever. And that's with all of Google internally forced to
| use only Jax. Look, I like the technical side of Jax for the
| most part, but it's years too late to the party and it's harder
| to use than PyTorch. It just isn't ever going to take off at
| this point.
| deisteve wrote:
| A lot of contrarian takes are popular but rarely implemented in
| reality.
| tripplyons wrote:
| It's definitely exaggerated, but I personally prefer JAX and
| have found it easier to use than PyTorch for almost everything.
| If you haven't already, I would give JAX a good try.
| srush wrote:
| PyTorch is a generationally important project. I've never seen a
| tool that is so in line with how researchers learn and
| internalize a subject. Teaching machine learning before and
| after its adoption has been a completely different experience.
| It can never be said enough how cool it is that Meta fosters and
| supports it.
|
| Viva PyTorch! (Jax rocks too)
| hprotagonist wrote:
| > I believe that all infrastructure built on Torch is just a huge
| pile of technical debt, that will haunt the field for a long,
| long time.
|
| ... from the company that pioneered the approach with
| TensorFlow. I've worked with worse ML frameworks, but they're by
| now pretty obscure; I cannot remember (and I am very happy about
| it) the last time I saw MXNet in the wild, for example. You'll
| still find Caffe on some embedded systems, but you can mostly
| sidestep it.
| deisteve wrote:
| I like PyTorch because all of academia releases their code with
| it.
|
| I've never even heard of Jax, nor will I have the skills to use
| it.
|
| I literally just want to know two things: 1) how much VRAM, and
| 2) how to run it in PyTorch.
| sva_ wrote:
| Jax is a competing computational framework that does something
| similar to PyTorch, so neither of your questions really makes
| sense.
| etiam wrote:
| Maybe deisteve will answer for himself, but I don't think
| that's meant as how to run Jax on PyTorch, but rather as the
| two questions he's interested in for any published model.
| 0cf8612b2e1e wrote:
| As we all know, the technically best solution always wins.
| logicchains wrote:
| PyTorch beat TensorFlow because it was much easier to use for
| research. Jax is much harder to use for exploratory research
| than PyTorch because it requires a fixed-shape computation
| graph, which makes implementing many custom model architectures
| very difficult.
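|
| A minimal sketch of that constraint (the shapes are made up):
| under jax.jit, each new input shape triggers a fresh trace and
| compile.
|
|     import jax
|     import jax.numpy as jnp
|
|     @jax.jit
|     def loss(x):
|         return (x * x).sum()
|
|     loss(jnp.ones((8, 128)))  # traced and compiled for (8, 128)
|     loss(jnp.ones((8, 256)))  # new shape: traced/compiled again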
|
| Jax's advantages shine when it comes to parallelizing a new
| architecture across multiple GPUs/TPUs, which it makes much
| easier than PyTorch (no need for custom CUDA/networking code).
| Needing to scale up a new architecture across many GPUs is,
| however, not a common use case, and most teams that have the
| resources for large-scale multi-GPU training also have the
| resources for specialised engineers to do it in PyTorch.
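|
| For contrast, a minimal data-parallel sketch in Jax (assuming
| multiple local devices; jax.pmap maps over the leading axis):
|
|     import jax
|     import jax.numpy as jnp
|
|     @jax.pmap
|     def step(x):
|         return 2.0 * x  # runs on every device in parallel
|
|     n = jax.local_device_count()
|     xs = jnp.arange(n * 4.0).reshape(n, 4)  # one slice per device
|     print(step(xs))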
| funks_ wrote:
| I wish dex-lang [1] had gotten more traction. It's JAX without
| the limitations that come from being a Python DSL. But ML
| researchers apparently don't want to touch anything that doesn't
| look exactly like Python.
|
| [1]: https://github.com/google-research/dex-lang
| hatmatrix wrote:
| It seems like an experimental research language.
|
| Julia also competes in this domain from a more practical
| standpoint and has fewer limitations than JAX as I understand
| it, but it is less mature and still working on gaining wider
| traction.
| funks_ wrote:
| The Julia AD ecosystem is very interesting in that the
| community is trying to make the entire language
| differentiable, which is much broader in scope than what
| Torch and JAX are doing. But unlike Dex, Julia is not a
| language built from the ground up for automatic
| differentiation.
|
| Shameless plug for one of my talks at JuliaCon 2024:
| https://www.youtube.com/live/ZKt0tiG5ajw?t=19747s. The
| comparison between Python and Julia starts at 5:31:44.
| mccoyb wrote:
| Dex is also missing user-authored composable program
| transformations, which is one of JAX's hidden superpowers.
|
| So not quite "JAX without limitations" -- but certainly without
| some of the limitations.
| funks_ wrote:
| Are you talking about custom VJPs/JVPs?
| mccoyb wrote:
| No, I'm talking about custom `Jaxpr` interpreters, which can
| walk and transform whole programs.
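|
| For anyone unfamiliar: a jaxpr is the intermediate
| representation those interpreters walk. A minimal way to look
| at one (custom interpreters then traverse its equations):
|
|     import jax
|     import jax.numpy as jnp
|
|     def f(x):
|         return jnp.sin(x) * 2.0
|
|     print(jax.make_jaxpr(f)(1.0))  # prints the traced jaxpr of f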
| sva_ wrote:
| > I've personally known researchers who set the seeds in the
| wrong file at the wrong place and they weren't even used by torch
| at all - instead, were just silently ignored, thus invalidating
| all their experiments. (That researcher was me)
|
| Some _assert_-ing won't hurt you. Seriously. It might even help
| you keep your sanity.
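|
| For example, a cheap reproducibility check along those lines (a
| sketch, not the setup from the article):
|
|     import torch
|
|     def seeded_sample(seed: int) -> torch.Tensor:
|         torch.manual_seed(seed)
|         return torch.randn(3)
|
|     # if the seed were silently ignored, these draws would differ
|     assert torch.equal(seeded_sample(0), seeded_sample(0))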
| marcinzm wrote:
| My main reason to avoid Jax is Google. Google doesn't provide
| good support even for things you pay them for. They do things
| because they want to, to get their internal promotions,
| irrespective of their customers or the impact on them.
| smhx wrote:
| The author got a couple of things wrong that are worth pointing
| out:
|
| 1. PyTorch is going all-in on torch.compile -- Dynamo is the
| frontend, Inductor is the backend -- with a strong default
| Inductor codegen powered by OpenAI Triton (which now has CPU,
| NVIDIA GPU, and AMD GPU backends). The author's view that
| PyTorch is building towards a multi-backend future isn't really
| where things are going. PyTorch supports extensibility of
| backends (including XLA), but disproportionate effort goes into
| the default path. torch.compile is 2 years old; XLA is 7 years
| old. Compilers take a few years to mature. torch.compile will
| get there (and we have reasonable metrics showing the compiler
| is on track to maturity). See the sketch after this list for
| what the default path looks like.
|
| 2. PyTorch/XLA exists mainly to drive a TPU backend for PyTorch,
| as Google gives no other real way to access the TPU. It's not
| great to try to shoehorn XLA into PyTorch as a backend -- XLA
| fundamentally doesn't have the flexibility that PyTorch supports
| by default (especially dynamic shapes). PyTorch on TPUs is
| unlikely to ever match the experience of JAX on TPUs, almost by
| definition.
|
| 3. JAX was developed at Google, not at DeepMind.
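|
| On point 1, the default path described above looks roughly like
| this (a minimal sketch; on GPU, Inductor emits Triton kernels,
| and on CPU it emits C++):
|
|     import torch
|
|     def f(x):
|         return torch.sin(x) ** 2 + torch.cos(x) ** 2
|
|     # Dynamo captures the graph; Inductor generates the kernels
|     compiled = torch.compile(f)
|     print(compiled(torch.randn(1000)))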
| cs702 wrote:
| A more accurate title for the OP would be "I _hope and wish_
| PyTorch is dead. Long live Jax." Leaving aside the fact that
| PyTorch's ecosystem is 10x to 100x larger, depending on how you
| measure it, PyTorch's biggest advantage, in my experience, is
| that it is picked up quickly by developers who are new to it.
| Jax, despite its superiority, or maybe because of it, is not.
| Equinox does a great job of making Jax accessible, but its
| functional approach remains more difficult to learn and master
| than PyTorch's object-oriented one.
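|
| For a taste of the difference, here is a minimal Equinox module
| (a sketch; the layer sizes are arbitrary). Models are immutable
| pytrees rather than stateful objects:
|
|     import equinox as eqx
|     import jax
|
|     class Model(eqx.Module):
|         linear: eqx.nn.Linear
|
|         def __init__(self, key):
|             self.linear = eqx.nn.Linear(2, 1, key=key)
|
|         def __call__(self, x):
|             return self.linear(x)
|
|     model = Model(jax.random.PRNGKey(0))
|     print(model(jax.numpy.ones(2)))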
___________________________________________________________________
(page generated 2024-08-16 23:00 UTC)