[HN Gopher] PyTorch 2.0 Release
___________________________________________________________________
PyTorch 2.0 Release
Author : DreamFlasher
Score : 130 points
Date : 2023-03-15 20:57 UTC (2 hours ago)
(HTM) web link (pytorch.org)
(TXT) w3m dump (pytorch.org)
| mdaniel wrote:
| discussion from (presumably) the PyTorch Conference announcement:
| https://news.ycombinator.com/item?id=33832511
| brucethemoose2 wrote:
| I'm hoping torch.compile is a gateway to "easy" non-Nvidia
| accelerator support in PyTorch.
|
| Also, I have been using torch.compile for the Stable Diffusion
| unet/vae since February, to good effect. I'm guessing similar
| optimizations will pop up for LLaMA.
| datadeft wrote:
  | Could you give a bit more detail about this? Do you have a
  | link?
| brucethemoose2 wrote:
| See the above reply ^
| voz_ wrote:
| Is there somewhere I can see your Stable Diffusion +
  | torch.compile code? I am interested in how you integrated it.
| brucethemoose2 wrote:
  | In `diffusers` implementations (like InvokeAI) it's pretty
| easy: https://github.com/huggingface/diffusers/blob/42beaf1d2
| 3b5cc...
|
  | But I also compile the VAE and some other modules; I will
  | reply again later when I can look at my local code. Some
  | modules (like face restoration or the scheduler) still don't
  | like torch.compile.
|
| For the Automatic1111 repo (and presumably other original
| Stability AI implementations), I just add `m.model =
| torch.compile(m.model)` here:
| https://github.com/AUTOMATIC1111/stable-diffusion-
| webui/blob...
|
  | I tried changing the options in the config dict one by one,
  | but TBH nothing seems to make a significant difference beyond
  | the default settings in benchmarks.
|
  | I haven't messed with compiling LoRA training yet, as I don't
  | train much and it is sufficiently fast, but I'm sure it could
  | be done.
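  |
  | For illustration (not the exact code), a rough sketch of this
  | kind of `diffusers` integration; the model id and attributes
  | below are just placeholders:
  |
  |     import torch
  |     from diffusers import StableDiffusionPipeline
  |
  |     pipe = StableDiffusionPipeline.from_pretrained(
  |         "runwayml/stable-diffusion-v1-5",
  |         torch_dtype=torch.float16,
  |     ).to("cuda")
  |
  |     # compiling the unet is the main win; the first call is
  |     # slow because it triggers compilation
  |     pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead")
  |     # the VAE decode path can be wrapped the same way
  |     pipe.vae.decode = torch.compile(pipe.vae.decode)
  |
  |     image = pipe("an astronaut riding a horse").images[0]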
| brucethemoose2 wrote:
  | Here is the InvokeAI code, minus the codeformer/gfpgan
  | changes that don't work yet:
|
| https://gist.github.com/brucethemoose/ea64f498b0aa51adcc88f5.
| ..
|
| I intend to start some issues for this on the repo soon(TM).
| fpgaminer wrote:
| The thing I'm looking forward to most is having Flash Attention
| built-in. Right now you have to use xformers or similar, but that
  | dependency has been a nightmare to use, from breaking, to
  | requiring specific concoctions of dependencies or else conda
  | will barf, to being impossible to pin because I have to use
  | -dev releases, which they constantly drop from the repositories.
|
| PyTorch 2.0 comes with a few different efficient transformer
| implementations built-in. And unlike 1.13, they work during
| training and don't require specific configurations. Seemed to
| work just fine during my pre-release testing. Also, having it
| built into PyTorch might mean more pressure to keep it optimized.
  | As-is, xformers targets the A100 primarily, with other archs as
  | an afterthought.
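  |
  | For reference, the built-in entry point is
  | `torch.nn.functional.scaled_dot_product_attention`, which picks
  | a fused (Flash or memory-efficient) kernel when the inputs allow
  | it. A minimal sketch with made-up shapes, assuming fp16 CUDA
  | tensors:
  |
  |     import torch
  |     import torch.nn.functional as F
  |
  |     # (batch, heads, seq_len, head_dim)
  |     q, k, v = (torch.randn(8, 12, 1024, 64, device="cuda",
  |                            dtype=torch.float16) for _ in range(3))
  |
  |     # kernel selection is automatic; this context manager forces
  |     # the Flash Attention path and errors out if it can't be used
  |     with torch.backends.cuda.sdp_kernel(enable_flash=True,
  |                                         enable_math=False,
  |                                         enable_mem_efficient=False):
  |         out = F.scaled_dot_product_attention(q, k, v,
  |                                              is_causal=True)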
|
| And, as promised, `torch.compile` worked out of the box,
| providing IIRC a nice ~20% speed up on a ViT without any other
| tuning.
|
| I did have to do some dependency fiddling on the pre-release
| version. Been looking forward to the "stable" release before
| using it more extensively.
|
| Anyone else seeing nice boosts from `torch.compile`?
| mardifoufs wrote:
| >Python 3.11 support on Anaconda Platform
|
  | >Due to lack of Python 3.11 support for packages that PyTorch
  | depends on, including NumPy, SciPy, SymPy, Pillow and others on
  | the Anaconda platform, we will not be releasing Conda binaries
  | compiled with Python 3.11 for PyTorch Release 2.0. The Pip
  | packages with Python 3.11 support will be released, hence if you
  | intend to use PyTorch 2.0 with Python 3.11 please use our Pip
  | packages.
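  |
  | (For reference, the pip route they point to is along the lines
  | of `pip install torch torchvision --index-url
  | https://download.pytorch.org/whl/cu118`, with the index URL
  | chosen to match your CUDA version.)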
|
| It really sucks that anaconda always lags behind. I know the
| reasoning*, and I know it makes sense for what a lot of teams use
| it for... but on our side we are now looking more and more into
| dropping it since we are more of an R&D team. We already use
| containers for most of our pipelines, so just using pip might be
| viable.
|
  | *Though I guess Anaconda bit off more than it can chew w.r.t.
  | managing an entire Python universe and keeping it up to date.
  | Conda-forge is already almost a requirement, but using the
  | official package (with pip, in this case) has its own benefits
  | for very complex packages like PyTorch.
| DreamFlasher wrote:
| Afaik NumPy, SciPy, SymPy and Pillow are not managed/owned by
| Anaconda? At least here: https://numpy.org/about/ Anaconda
| isn't mentioned.
| DreamFlasher wrote:
  | Ah, yeah, they do have Python 3.11 releases, just not on
  | Anaconda. Okay, yeah, for a couple of years now there hasn't
  | been a good reason to use Anaconda anyway.
| mardifoufs wrote:
  | Yes, that's the issue! Most of the software is already
  | ready, usable, and just works... unless you use Anaconda.
  | Now that I think about it, is there some technical reason
  | for that? I always thought it was mostly about stability,
  | but I can't imagine Python 3.11 being so unstable as to
  | warrant waiting a whole year before even porting.
| brucethemoose2 wrote:
| The Arch Linux PyTorch 2.0 packages are great if you are
| looking for "cutting edge," as they are compiled against CUDA
| 12.1 now, instead of 11.8 like the official nightly releases.
  | You can also get AVX2-patched Python and optimized CPython
  | packages through CachyOS or ALHP.
|
  | But even Arch is still stuck on Python 3.10.
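  |
  | A quick way to check which CUDA toolkit a given build was
  | compiled against (a sketch; works the same for official and
  | Arch packages):
  |
  |     import torch
  |     print(torch.__version__)         # e.g. 2.0.0
  |     print(torch.version.cuda)        # CUDA it was built against
  |     print(torch.cuda.is_available())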
| simonw wrote:
| "the MPS backend" - that's the thing that lets Torch run
| accelerated on M1/M2 Macs!
| datadeft wrote:
  | Yes, though I am not sure to what extent MPS is a viable
  | alternative to CUDA. You seem to write a lot about ML models. Do
  | you have a detailed write-up on this subject?
| sebzim4500 wrote:
  | Based on George Hotz's testing it is very broken. It's possible
  | it has improved since then, I guess, but he streamed this only a
  | few weeks ago.
| dagmx wrote:
| It supports a subset of the operators (as mentioned in the
| release notes). I don't think it's broken for the ones that
| it does support though.
| mochomocha wrote:
  | That's been my experience. However, when fallback to CPU
  | happens, it sometimes ends up making a specific graph
  | execution slower. But that's explicitly mentioned in the
  | warning and pretty much expected.
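  |
  | For what it's worth, that fallback is opt-in: setting
  | PYTORCH_ENABLE_MPS_FALLBACK=1 makes unimplemented ops run on
  | the CPU with a warning instead of raising. A minimal sketch
  | (the env var has to be set before torch is imported):
  |
  |     import os
  |     os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"
  |
  |     import torch
  |
  |     device = "mps" if torch.backends.mps.is_available() else "cpu"
  |     x = torch.randn(1024, 1024, device=device)
  |     y = x @ x   # supported ops run on the GPU; anything
  |                 # unimplemented falls back to the CPU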
| brucethemoose2 wrote:
  | You'd think it would fall back to GPU/CPU for unsupported
  | operations instead of failing, but I guess that's easier
  | said than done.
| norgie wrote:
  | Yes, this is my experience. Many off-the-shelf models still
  | don't work, but several of my own models work great as long
  | as they don't use unsupported operators.
| glial wrote:
| Where can I find a list of the supported operators?
| bigbillheck wrote:
  | Based on George Hotz's performance at Twitter, I wouldn't bet
  | he wasn't holding it wrong.
| jeron wrote:
| so, you would bet he was holding it wrong?
| danieldk wrote:
| We tested inference for all spaCy transformer models and they
| work:
|
| https://explosion.ai/blog/metal-performance-shaders
|
| It depends very much on the ops that your model is using.
___________________________________________________________________
(page generated 2023-03-15 23:00 UTC)