[HN Gopher] Run Stable Diffusion on Intel CPUs
___________________________________________________________________
Run Stable Diffusion on Intel CPUs
Author : amrrs
Score : 88 points
Date : 2022-08-29 19:13 UTC (3 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| polskibus wrote:
 | Can't get it to install the requirements on Windows with Python
 | 3.10 and MS Build Tools 2022. Any tips?
| smoldesu wrote:
 | I found a pretty good Docker container for it, though that's
 | only really switching you from solving Python problems to
 | Docker ones. Worth trying out if you have a Linux box or WSL
 | installed:
 | https://github.com/AbdBarho/stable-diffusion-webui-docker
| desindol wrote:
 | It needs Python 3.9.
| amrrs wrote:
 | On Reddit I found that some older GPUs take about 5 minutes, and
 | this video[1] says 5 minutes on CPU using this OpenVINO library.
 | Not sure if OpenVINO makes CPU chips compete with GPUs. Has
 | anyone heard of OpenVINO before?
 |
 | 1. https://youtu.be/5iXhhf7ILME
| minimaxir wrote:
| OpenVINO is developed by Intel themselves, and is one of many
| methods to freeze models to make CPU inference possible and
| performant.
|
| https://en.wikipedia.org/wiki/OpenVINO
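 |
 | For a feel of the workflow, here is a minimal sketch assuming
 | the OpenVINO Python runtime (the IR file name and input shape
 | are hypothetical, not taken from the linked project):
 |
 |     # Load a frozen OpenVINO IR model and run one CPU inference.
 |     import numpy as np
 |     from openvino.runtime import Core
 |
 |     core = Core()                                # device plugins
 |     model = core.read_model("model.xml")         # frozen IR graph
 |     compiled = core.compile_model(model, "CPU")  # CPU plugin
 |
 |     x = np.zeros((1, 3, 224, 224), dtype=np.float32)  # dummy input
 |     result = compiled([x])[compiled.output(0)]   # first output
 |     print(result.shape)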
| T-A wrote:
| https://github.com/openvinotoolkit/openvino#supported-hardwa...
| torotonnato wrote:
| 7' 12" on an ancient Intel Core i5-3350P CPU @ 3.10GHz (!) using
| BERT BasicTokenizer, default arguments
| [deleted]
| aaaaaaaaaaab wrote:
| Love this. OpenAI are _livid_. :^)
| enchiridion wrote:
| Why?
| yieldcrv wrote:
 | Where can I get up to speed on what's coming down the pipeline
 | in this AI/ML image-making scene?
 |
 | (And learn the agreed-upon terms)
| aaaaaaaaaaab wrote:
 | No one can tell.
|
| Pandora's box has been opened.
|
| Nothing is true, everything is permitted.
| yayr wrote:
 | Then, how far away are we from having it on M1/M2 Macs, at
 | least with regular processing? OpenVINO may be one path, I
 | suppose:
 | https://github.com/openvinotoolkit/openvino/issues/11554
| homarp wrote:
 | PyTorch for M1
 | (https://pytorch.org/blog/introducing-accelerated-pytorch-tra...)
 | will not work:
 | https://github.com/CompVis/stable-diffusion/issues/25 says
 | "StableDiffusion is CPU-only on M1 Macs because not all the
 | pytorch ops are implemented for Metal. Generating one image
 | with 50 steps takes 4-5 minutes."
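 |
 | A quick way to see which backend you would get, assuming a
 | PyTorch 1.12+ build with the MPS backend (a generic sketch, not
 | the project's own device-selection code):
 |
 |     # Prefer Metal (MPS) when the build supports it, else CPU.
 |     import torch
 |
 |     device = torch.device(
 |         "mps" if torch.backends.mps.is_available() else "cpu")
 |     x = torch.randn(1, 4, 64, 64, device=device)
 |     print(device, x.mean().item())
 |
 | Missing kernels are exactly why the issue reports CPU-only: any
 | op without an MPS implementation either errors out or has to
 | fall back to the CPU.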
| andybak wrote:
 | By comparison, I can generate 512x512 images every 15 seconds
 | on an RTX 3080 (although there's an initial 30-second setup
 | penalty for each run).
| yayr wrote:
 | Those guys are also working on it at the moment :-)
| https://github.com/lstein/stable-diffusion/pull/179
| yayr wrote:
 | Looks like there is an easier path using Metal shaders:
 | https://dev.to/craigmorten/setting-up-stable-diffusion-for-m...
 |
 | and
 | https://github.com/magnusviri/stable-diffusion/tree/apple-si...
| zmmmmm wrote:
 | This worked fine for me, and running it side by side with an
 | Intel CPU + Nvidia 2070, it actually does not take much longer
 | (and, as a sibling said, it seems to be working at full
 | precision). It is one of the first things I've done that has
 | properly made my M1 Max's fan spin up hard, though!
| garblegarble wrote:
| I've been using this on my M1 Max and it works pretty well,
| 1.65 iterations per second (full precision, whereas my PC's
| 3080 can only do half-precision due to limited memory)... a
| 50-iteration image in about 40 seconds or so.
| MattRix wrote:
| Your 3080 should be able to do full precision. Are you sure
| you don't have the batch size set greater than 1, or
| another issue along those lines?
| garblegarble wrote:
| Thank you and smoldesu for letting me know it should
| work, I'll have a better look into what's going on - it
| didn't immediately work on Windows in full precision
| (probably a batch size issue as you suggested) and I gave
| up...
|
| I shouldn't have given up so easily, but my tolerance for
| annoyances on Windows is pretty low (that Windows machine
| is kept for gaming, the last time I used a Windows
| machine for anything but launching Steam was when Windows
| 2000 was the hot new thing...)
| smoldesu wrote:
| > full precision, whereas my PC's 3080 can only do half-
| precision due to limited memory
|
| What model are you using? I've been running full-precision
| SD1.4 on my 3070, albeit with less than 10% VRAM headroom.
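 |
 | For what it's worth, with the Hugging Face diffusers pipeline
 | the half/full-precision choice is a single dtype argument (a
 | sketch assuming that library, not necessarily the setup used
 | above; the weights may require accepting the model license and
 | an auth token first):
 |
 |     # Half precision roughly halves VRAM use; omit torch_dtype
 |     # (it defaults to float32) for full precision.
 |     import torch
 |     from diffusers import StableDiffusionPipeline
 |
 |     pipe = StableDiffusionPipeline.from_pretrained(
 |         "CompVis/stable-diffusion-v1-4",
 |         torch_dtype=torch.float16,  # drop for full precision
 |     ).to("cuda")
 |
 |     image = pipe("an astronaut riding a horse").images[0]
 |     image.save("out.png")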
| pmalynin wrote:
 | I got it working in about an hour on an M1 Ultra, mostly
 | compiling things and having to tweak some model code to be
 | compatible with Metal. It works pretty well: about 1/10 to
 | 1/20 of the performance I can get on a 3080.
| motoboi wrote:
 | OpenVINO is an unsung hero.
| ByThyGrace wrote:
| What's the status of running SD on AMD GPUs?
| homarp wrote:
 | https://rentry.org/tqizb explains how to install ROCm and then
 | PyTorch for ROCm.
 |
 | ROCm does not support APUs; here is the list of supported GPUs:
 | https://docs.amd.com/bundle/Hardware_and_Software_Reference_...
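 |
 | Once the ROCm build of PyTorch is in, a quick sanity check (a
 | sketch; ROCm builds expose the GPU through the regular
 | torch.cuda API, with HIP underneath):
 |
 |     import torch
 |
 |     print("HIP version:", torch.version.hip)  # None on CUDA/CPU builds
 |     print("GPU visible:", torch.cuda.is_available())
 |     if torch.cuda.is_available():
 |         print("Device:", torch.cuda.get_device_name(0))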
| synergy20 wrote:
 | What does APU mean here?
| ace2358 wrote:
 | CPU + GPU on the same die/chiplet thing. So an integrated
 | GPU; "APU" is the AMD marketing term for it.
| barkingcat wrote:
 | AMD's integrated GPU together with the processor.
| mysterydip wrote:
 | I didn't see any requirements on the page beyond a CPU from
 | that list. Do you need a certain amount of RAM? Will more RAM
 | speed things up to a degree?
___________________________________________________________________
(page generated 2022-08-29 23:00 UTC)