[HN Gopher] Easy Stable Diffusion XL in your device, offline
___________________________________________________________________
Easy Stable Diffusion XL in your device, offline
Author : haebom
Score : 219 points
Date : 2023-12-01 14:34 UTC (8 hours ago)
(HTM) web link (noiselith.com)
(TXT) w3m dump (noiselith.com)
| ProllyInfamous wrote:
| The 16GB (base model) M2 Pro Mini, despite its overall
| awesomeness (running DiffusionBee.app / etc)... does not meet
| Minimum System Requirements (Apple Silicon requires 32GB RAM).
|
| So now I have to contemplate shopping for a new mac TWICE in one
| year (never happened before).
| sophrocyne wrote:
| https://github.com/invoke-ai/InvokeAI - runs on Apple
| Silicon, can squeeze out SDXL images on a 16GB Mac with
| SSD-1B or Turbo models.
| wsgeorge wrote:
| Currently using SDXL (through Hugging Face Diffusers) on an
| M1 16GB Mac. Takes on average 4-5 minutes to generate an
| image. It's usable.
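|
| For reference, a minimal Diffusers sketch of that setup (the
| model ID is the standard SDXL base; the fp16 and attention-
| slicing flags are assumptions that help on a 16GB machine):
|
|     import torch
|     from diffusers import DiffusionPipeline
|
|     # Load SDXL base in fp16 to keep peak memory manageable
|     pipe = DiffusionPipeline.from_pretrained(
|         "stabilityai/stable-diffusion-xl-base-1.0",
|         torch_dtype=torch.float16,
|         variant="fp16",
|         use_safetensors=True,
|     )
|     pipe.to("mps")  # Apple Silicon GPU backend
|     pipe.enable_attention_slicing()  # slower, lower peak RAM
|
|     image = pipe(
|         prompt="a fox in the woods", num_inference_steps=30
|     ).images[0]
|     image.save("fox.png")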
| ttul wrote:
| Good lord. I can get a 2048x2048 upscaled output from a very
| complex ComfyUI workflow on a 4090 in 15 seconds. This
| includes three IPAdapter nodes, a sampling stage, a three-
| stage iterative latent upscaler, and multiple ControlNets.
| Macs are not close to competitive for inference yet.
| rsynnott wrote:
| I mean, a 4090 would appear to cost $2000, and came out a
| year ago; it has about 70bn transistors. The M1 could be
| had for $700 for a desktop, $1000 as part of a laptop, came
| out three years ago, and has 16bn transistors, some of
| which are CPU.
|
| An M3 Ultra might be a more reasonable comparison for the
| 4090.
| michaelt wrote:
| 24GB cards weren't always $2000. I've seen people on this
| very forum [1] who bought two 3090s for just $600 each.
|
| Agree the prices are crazy right now, though.
|
| [1] https://news.ycombinator.com/item?id=37438847
| myself248 wrote:
| When choosing a machine with non-expandable RAM, you went with
| the minimum configuration? That's a choice, I suppose, but the
| outcome wasn't exactly hard to foresee.
| sophrocyne wrote:
| There are already a number of local inference options that
| are (crucially) open-source, with more robust feature sets.
|
| And if the defense here is "but Auto1111 and Comfy don't have as
| user-friendly a UI", that's also already covered.
| https://github.com/invoke-ai/InvokeAI
| GaggiX wrote:
| Also just Krita with the diffusion AI plugin:
| https://github.com/Acly/krita-ai-diffusion
| blehn wrote:
| No idea whether or not the UI is user-friendly, but the
| installation steps alone for InvokeAI are already a barrier for
| 99.9% of the world. Not to say Noiselith couldn't be open-
| source, but it's clearly offering something different from
| InvokeAI.
| internet101010 wrote:
| I switched to InvokeAI and won't go back to the basic a1111
| webui. I like how everything is laid out, there are workflow
| features, you can easily recall _all_ properties (prompt,
| model, lora, etc.) used to generate an image, things can be
| organized into boards, and all of the boards/images/metadata
| are stored in a very well-designed sqlite database that can
| be tapped into via DataGrip.
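|
| If you'd rather script it than use DataGrip, a quick sketch
| (the database filename is an assumption; it lists the tables
| rather than guessing at the schema):
|
|     import sqlite3
|
|     con = sqlite3.connect("invokeai.db")  # hypothetical path
|     # Schema-agnostic: enumerate tables before querying them
|     rows = con.execute(
|         "SELECT name FROM sqlite_master WHERE type='table'"
|     )
|     for (name,) in rows:
|         print(name)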
| quitit wrote:
| automatic1111: great for the fast implementation of the most
| recent generative features
|
| comfyui: excellent for workflows and recalling the workflows,
| as they're saved into the resulting image metadata (i.e.
| sharing images, shares the image generation pipeline)
|
| InvokeAI: Great UX and community, arguably were a bit behind
| in features as they were focused on making the UI work well.
| Now at the stage of bringing in the best features of
| competitors - Like you, I can easily recommend it above all
| other options.
| squeaky-clean wrote:
| > recalling the workflows, as they're saved into the
| resulting image metadata (i.e. sharing images, shares the
| image generation pipeline)
|
| Doesn't a1111 already do this? There's a PNG Info tab where
| you can drag and drop a PNG and it will pull the prompt,
| negative prompt, model, etc. And then a button to send it to
| the main generation tab. It doesn't automatically load the
| model, but that may be intentional because of how long it
| takes to change loaded models.
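|
| The same data is easy to read programmatically; a1111 embeds
| its settings in a PNG text chunk (the key name "parameters"
| is what a1111 uses; other UIs may use different keys):
|
|     from PIL import Image
|
|     img = Image.open("generated.png")
|     # PNG text chunks are exposed via the .info dict
|     print(img.info.get("parameters", "no embedded settings"))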
| holoduke wrote:
| Can you actually use those workflows via some sort of API to
| automate them from, let's say, a Python script? Played
| around with Comfy. Really nice, but I would like to automate
| it within my own environment.
| sophrocyne wrote:
| Yeah, Invoke's nodes/workflows backend can be hit via the
| API. That's how the entire front-end UI (and workflow
| editor/IDE) are built.
|
| I'm positive this can be done w/ Comfy too.
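|
| For Comfy specifically, a minimal sketch (assumes a default
| local install on port 8188, and a workflow exported with the
| UI's "Save (API Format)" option):
|
|     import json
|     import urllib.request
|
|     with open("workflow_api.json") as f:
|         workflow = json.load(f)
|
|     # POSTing to /prompt queues the workflow for execution
|     req = urllib.request.Request(
|         "http://127.0.0.1:8188/prompt",
|         data=json.dumps({"prompt": workflow}).encode("utf-8"),
|         headers={"Content-Type": "application/json"},
|     )
|     print(urllib.request.urlopen(req).read().decode())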
| smcleod wrote:
| Yeah invokeAI is fantastic!
| AuryGlenz wrote:
| I realize it may be good marketing, but it's odd to have the fact
| that it's on device and offline be the primary differentiator
| when that's probably how most people use Stable Diffusion
| already.
|
| I'd probably focus more on it being easy to install and use, as
| that's something that isn't done much. For me, if it doesn't have
| Controlnet, upscaling, some kind of face detailer, and preferably
| regional prompting, I'm out.
|
| I also kind of wish all of these people that want to make their
| own SD generators would instead work on one of the open source
| ones that already exist.
|
| While an app store might be a good idea, in a world with
| Auto1111 and all of its extensions I think it's going to go
| over poorly with the Stable Diffusion community, for what
| it's worth.
| michaelt wrote:
| I think there's probably a bunch of people who don't use things
| like A1111 because of the complexities of the download-this-
| which-downloads-this-which-downloads-this-then-you-manually-
| download-this-and-this setup model.
|
| I can see how something simpler might appeal to _new_ users,
| even if it doesn't appeal to _existing_ users.
| AuryGlenz wrote:
| Sure, and I agree with that. As I said, I'd probably push
| that just as much as it being 'offline,' if not more.
| philipov wrote:
| You hit the nail on the head when you said it's good
| marketing, but go all the way: the thing you find odd tells
| you who they want to use their product; you're not their
| target audience. They are trying to convert people from
| using online-only services like Dall-E, not people who
| already use SD.
| prepend wrote:
| Oddly, I've found many cloud wrappers for Stable Diffusion.
| So I like the upfront on-device/offline description.
|
| It was weird, when I was first playing with SD, how many
| packages phoned home aggressively or used VMs or whatever
| instead of just downloading a bunch of stuff and running it.
| solarkraft wrote:
| I've used SD on my device, but I found it worth it to pay for
| the hosted version because it's much faster.
| alienreborn wrote:
| Interesting, will check it out to see how it compares with
| https://diffusionbee.com, which I've been using for the last
| few months for fun.
| janmo wrote:
| I just checked out both and Noiselith produces much, much
| better results.
| rgbrgb wrote:
| Just installed, this is very cool. Local AI is the future I want
| (and what I'm working on too). A few notes using it...
|
| Pros:
|
| - seems pretty self contained
|
| - built in model installer works really well and helps you
| download anything from CivitAI (I installed
| https://civitai.com/models/183354/sdxl-ms-paint-portraits)
|
| - image generation is high quality and stable
|
| - shows intermediate steps during generation
|
| Cons:
|
| - downloads a 6.94GB SDXL model file somewhere without
| asking or showing the location/size. Just figured out you
| can find/modify the location in the settings.
|
| - very slow on first generation as it loads the model; no
| record of how long generations take but I'd guess a couple
| of minutes (M1 Max MacBook, 64GB)
|
| - multiple user feedback modules (bottom left is a very
| intrusive chat thing I'll never use + top right is a call
| for beta feedback)
|
| - not open source like competitors
|
| - runs 7 processes, idling at ~1GB RAM usage
|
| - non-native UX on macOS, missing the hotkeys and help menu
| you'd expect. Electron app?
|
| Overall 4/5 stars, would open again :)
| liuliu wrote:
| You should check out Draw Things on macOS. It works well enough
| for SDXL on 8GiB macOS devices.
| miles wrote:
| Are you the developer by any chance? If so, it would be
| helpful to state it.
| liuliu wrote:
| I am. I thought this was obvious. My statement is objective.
| I would go as far as: it is the only app that works on 8GiB
| macOS devices with SDXL-family models.
| adamjc wrote:
| How would that be obvious to anyone?
| cellularmitosis wrote:
| "You should check out this thing" has a very different
| implied context than "You should check out this thing I
| made". The first sounds like a recommendation from an
| enthusiastic user, not from the author. Because of
| this, discovering that you are the author makes your
| recommendation feel deceptive.
| liuliu wrote:
| I am sorry if you feel that way. I joined HN when it was
| a small tight-knit community without much of a marketing
| presence. The "obvious" comment is more of a "people know
| other people" kind of thing. I didn't try to deceive
| anyone into using the app (and why should I?).
|
| If you feel this is unannounced self-promotion, yes, it
| is, and can be done better.
|
| ---
|
| Also, the "objective" comment was meant to say "the
| original comment is still objective", not that you can be
| objective only by being a developer. Being a developer
| can obviously bias your opinion.
| TheHumanist wrote:
| What do you mean it was obvious? Only the developer could
| make that objective statement?
| ProfessorLayton wrote:
| Whoa, well let me just say thanks for the awesome app!!
| It's pretty entertaining to spin this up in situations
| where I don't have internet (airplane, subway, etc.).
|
| I was also surprised at how well it ran on my iPhone 11
| before I replaced it with a 15 Pro.
|
| (Let me know if you're looking for some Product Design
| help/advice, totally happy to contribute pro bono. No
| worries if not of course!)
| vunderba wrote:
| Nice app - but for future reference it is _very_ much not
| obvious to any native English speaker. "You should check
| out X" sounds like a random recommendation.
| rgbrgb wrote:
| Thanks. Yeah I played with your app early on and just fired
| it up again to see the progress. Frankly I find the interface
| pretty intimidating but it is cool that you can easily stitch
| generations together.
|
| Unsolicited UX recs:
|
| - strongly recommend a default model. The list you give is
| crazy long. It kind of recommends SD 1.5 in the UI text below
| the picker but has the last one selected by default. Many of
| them are called the same thing (ironically the name is
| "Generic" lol).
|
| - have the panel on the left closed by default or show a
| simplified view that I can expand to an "advanced" view.
| Consider sorting the left panel controls by how often I would
| want to edit them (personally I'm not going to touch the
| model but it is the first thing).
|
| You are doing great work but I wouldn't underestimate the
| value of simplifying the interface for a first-time user. It
| seems to have a ton of features but I don't know what I
| should actually be paying attention to / adjusting.
|
| Is there a business model attached to this or do you have a
| hypothesis for what one might look like?
| liuliu wrote:
| Agreed on the UX feedback. It accumulated a lot of cruft
| moving from the old technologies to the new. This just
| echoes my earlier feedback that co-iterating the UI and
| the technology is difficult; you'd better pick the side
| you want to be on, and there is only one correct side
| (and unfortunately, the current app is trying hard to be
| on both sides).
| philote wrote:
| Another con is it only works on Silicon Macs.
| Vicinity9635 wrote:
| Apple Silicon* I presume?
|
| This could honestly be the excuse I need (want) to order an
| absolute beast of a macbook pro to replace my 2013 model.
| quitit wrote:
| If it's just for hobby/interest work, then just a heads-up
| that even the 1st generation Apple Silicon will turn over
| about one image a second with SDXL Turbo. The M3s of course
| are quite a bit faster.
|
| The performance gains in recent models and PyTorch are
| currently outpacing hardware advances by a significant
| margin, and there are still large amounts of low-hanging
| fruit in this regard.
| wayfinder wrote:
| If you want an absolute beast, especially for this stuff,
| you probably want Intel + Nvidia. Apple Silicon is a beast
| in power efficiency but a top of the line M3 does not come
| close to the top of the line Intel + Nvidia combo.
| Vicinity9635 wrote:
| Well this would just be the excuse. I'm typing this on a
| Ryzen 5950X w/32 GB of RAM and a 4090. So I guess I
| already have the beast?
| mikae1 wrote:
| _> not open source like competitors_
|
| Who are the competitors?
| quitit wrote:
| DiffusionBee: AGPL-3.0 license (Native app)
|
| InvokeAI: Apache license 2.0 (web-browser UI)
|
| automatic1111: AGPL-3.0 license (web-browser UI)
|
| ComfyUI: GPL-3.0 license (web-browser UI)
|
| There's more, but I don't pay enough attention to it
| mikae1 wrote:
| Thanks! https://lmstudio.ai/ too. For the more technically
| inclined perhaps.
| dragonwriter wrote:
| I don't think LM Studio competes with Stable Diffusion
| frontends, even for the technically inclined.
| vunderba wrote:
| I'd also recommend InvokeAI, an open source offering which
| has a very nice editable canvas and is very performant with
| diffusers.
|
| https://github.com/invoke-ai/InvokeAI
| sytelus wrote:
| +1 for asking download location.
| amelius wrote:
| So it's free, but not open source.
|
| What is the catch?
| sib wrote:
| They will have a non-free (as in beer) version once they exit
| beta (per the website).
| tracerbulletx wrote:
| All the real homies use ComfyUI
| weakfish wrote:
| Elaborate?
| tracerbulletx wrote:
| I'm being kind of tongue-in-cheek because I understand that
| this is for just making things really easy, and ComfyUI is
| a node-based editor that most people would have trouble
| with. But the best UI for local SD generation that the
| community is using is
| https://github.com/comfyanonymous/ComfyUI
| rish wrote:
| Agreed. It's worth the learning curve for the sheer power
| it brings to your workflows. I've always wanted to toy
| around with node-based architectures, and this seemed quite
| easy after using A1111 extensively. The community providing
| ready-to-go workflows has made it quite enjoyable too.
| ttul wrote:
| If you are a programmer at heart, ComfyUI will feel very
| comfortable (pun intended). It's basically a visual
| programming environment optimized for the type of
| compositional programming that machine learning models
| desire. The next thing this space needs is someone to build
| an API hosting every imaginable model on a vast farm of
| GPUs in the cloud. Use ComfyUI and other apps to
| orchestrate the models locally, but send data to the cloud
| and benefit from sharing GPU resources far more
| efficiently.
|
| If anyone has a spare thousand hours to kill, I would build
| that and connect it up with the various front-ends,
| including ComfyUI, A1111, etc. Not a small amount of
| effort, but it would be rewarding.
| verdverm wrote:
| This is when I feel the 24GB memory limit of the MacBook Air.
| liuliu wrote:
| Again, try Draw Things, it runs well for SDXL on 8GiB macOS
| devices.
| verdverm wrote:
| yeah, I know there are options, I'm more interested in
| language models than image generation anyway, so llama.cpp
| brucethemoose2 wrote:
| I would highly recommend Fooocus to anyone who hasn't tried
| it: https://github.com/lllyasviel/Fooocus
|
| There are a bajillion local SD pipelines, but this one is,
| _by far_, the one with the highest quality output
| out-of-the-box, with short prompts. It's remarkable.
|
| And that's because it integrates a bajillion SDXL
| augmentations that other UIs don't implement or enable by
| default. I've been using Stable Diffusion since 1.5 came
| out, and even having followed the space extensively,
| setting up an equivalent pipeline in ComfyUI (much less
| diffusers) would be a pain. It's like a "greatest hits and
| best defaults" for SDXL.
| liuliu wrote:
| Yeah, Fooocus is much better if you are going for the best
| locally generated results. Lvmin puts all his energy into
| making beautiful pictures. Also it is GPL licensed, which is
| a + in my book.
| rvz wrote:
| Looks like a complete contraption to set up, and very
| unpleasant to use at first glance when compared against
| Noiselith.
|
| The hundreds of Python scripts, and requiring the user to
| touch the terminal, show why something like Noiselith should
| exist for normal users rather than developers or
| programmers.
|
| I would rather take a packaged solution that just works over a
| bunch of scripts requiring a terminal.
| liuliu wrote:
| You have to make trade-offs in software development. Fooocus
| trades on the best picture rather than the most beautiful
| interface or simplicity of use. I think it is a good
| trade-off given the technology is improving at breakneck
| speed.
|
| Look, DiffusionBee is still maintained, but still no SDXL
| support.
|
| Anyone who bets that the technology is done and it is time
| to focus on the UI is making the wrong bet.
| rgbrgb wrote:
| This project is really cool and I like the stated
| philosophy on the README. I think it's making the right
| trade-off in terms of setting useful defaults and not
| showing you 100 arcane settings. However, the installation
| is too hard. It's a student project and free so I'm not
| criticizing the author at all but I think it's a pretty
| fair and useful criticism of the software and likely a
| significant bottleneck to adoption.
| Tiberium wrote:
| Huh? It has a really simple interface, much much much simpler
| than anything else that uses SD/SDXL locally. Installation is
| also simple for Windows/Linux, don't know about macOS.
| Liquix wrote:
| installation/setup is dead simple. up and running in under 3
| minutes:
|
|     git clone https://github.com/lllyasviel/Fooocus.git
|     cd Fooocus
|     pip3 install -r requirements_versions.txt
|     python3 entry_with_update.py
| Filligree wrote:
| Let's see...
|
| > pip3: command not found
|
| Okay. I'll need to install it? What package might that be
| in, hmm. Moving on, I already know it's python.
|
| > /usr not writeable
|
| Guess I'll use sudo...
|
| = = =
|
| Obviously I know better than to do this, but _very few
| people would_. This is not 'dead simple'! It's only simple
| for Python programmers who are already familiar with the
| ecosystem.
|
| Now, fortunately the actual documentation does say to use
| venv. That's still not 'dead simple'; you still need to
| understand the commands involved. There's definitely space
| for a prepackaged binary.
| pixl97 wrote:
| The people that make software that does useful things,
| and the people that understand system security live on
| different planets. One day they'll meet each other and
| have a religious war.
|
| This said, it's nice when developers attempt to detect
| the executable they need and warn what package is
| missing.
| brucethemoose2 wrote:
| There are projects that set up "fat" Python executables
| or portable installs, but the problem in PyTorch ML
| projects is that the download would be many gigabytes.
|
| Additionally, some package choices depend on hardware.
|
| In the end, a lot of the more popular projects have "one
| click" scripts for auto installs, and there are some for
| Fooocus specifically, but the issue there is it's not as
| visible as the main repo, and not necessarily something
| the dev wants to endorse.
| zirgs wrote:
| Or you can use Stability Matrix package manager.
| brucethemoose2 wrote:
| Yeah, VoltaML is another excellent choice in Stability
| Matrix.
| pmarreck wrote:
| Have to build it yourself on Mac, and we all know how "fun"
| building Python projects is
| jessepasley wrote:
| Just spent about 10 minutes building it on MacBook Pro M1. I
| come with significant bias against Python projects, but
| getting Fooocus to run was very, very easy.
| pmarreck wrote:
| That's good to know!
| calamari4065 wrote:
| Is this at all usable on a CPU-only system with a ton of RAM?
| brucethemoose2 wrote:
| Not really. There is a very fast LCM model preset now, but
| it's still going to be painful.
|
| SDXL in particular isn't one of those "compute-light,
| bandwidth-bound" models like llama (or Fooocus's own mini
| prompt-expansion LLM, which in fact runs on the CPU).
|
| There is a repo focused on CPU-only SD 1.5.
| neilv wrote:
| Looks like the Web UI of the self-hosted install of Fooocus
| sells out the user to Google Tag Manager.
|
| Can our entire field please realize that running this
| surveillance is a bad move, and just stop doing it?
| LorenDB wrote:
| Why do we never see AMD support in these projects?
| stuckkeys wrote:
| I think it is a matter of why AMD does not support these
| projects. NVIDIA is involved everywhere. AMD could easily
| do the same. At least from what I have observed on the
| internetz.
| mg wrote:
| Would it be possible to run Stable Diffusion in the browser via
| WebGPU?
| skocznymroczny wrote:
| https://websd.mlc.ai/#text-to-image-generation-demo
| stuckkeys wrote:
| Installed it. Ran it. Generated. Slow for some reason.
| Deleted it. Looks similar to Pinokio, and that is open
| source.
| dreadlordbone wrote:
| After installation, it wouldn't run on my Windows machine
| unless I granted public and private network access. Kinda
| tripped up since it says "offline".
| kemotep wrote:
| If you disconnected completely from the internet, did it
| still run?
|
| It is completely wrong to advertise it as "offline" if it
| requires an active internet connection to run.
| tredre3 wrote:
| I had a similar experience.
|
| On the first run it downloads about 30GB of data. I don't know
| if it would work offline on subsequent runs because for me it
| never ran again without crashing!
|
| Also, upon uninstallation it left behind all its data (not
| user data, mind you, but the executable itself, its Python
| venv, its updater, and all the models; uninstalling
| basically just removed the shortcut in the Start menu).
| m3kw9 wrote:
| Does not work at all; it needs you to go and find a "model".
| Like, just download it for me, man.
| solarkraft wrote:
| I find it interesting that it requires 16GB of RAM on Windows but
| 32 on a Mac. Unfortunately that leaves me out ...
| mthoms wrote:
| I think that's probably because RAM on Mac is shared with the
| GPU. On Windows, you need 16GB RAM _plus_ 8GB on GPU.
| stared wrote:
| I keep getting "Failed to generate. Please try again" 10 seconds
| after model loading. It is hardly helpful, as trying again gives
| the same error.
|
| Apple Silicon M1, 32GB RAM, in any case.
| stets wrote:
| Definitely exciting to see more local clients come out. As
| mentioned in other comments, there are some great ones out
| already. I've used automatic1111, which is quick and doesn't
| require a ton of tuning. But it still has lots of knobs and
| options, which makes it difficult initially. Fooocus is
| super quick but of course offers less customization.
|
| Then there's ComfyUI, the holy grail of complicated, but with
| that complication comes the ability to do so much. It is a node-
| based app that allows you to create custom workflows. Once your
| image is generated, you can pipe that "node" somewhere else
| and modify it, e.g. upscale the image or do other things.
|
| I'd like to see if Noiselith or some others offer support
| for SDXL Turbo -- it came out only a few days ago but in my
| opinion is a complete game-changer. It can generate 512x512
| images in ~half a second on consumer GPUs. The images aren't
| crazy quality, but the ability to write a prompt like "fox
| in the woods", see it instantly, then add "wearing a hat"
| and see it instantly generate again is so valuable. Prior to
| that, I'd wait 12 seconds for an image. Sounds like not a
| big deal, but being able to iterate so quickly makes local
| image gen so much more fun.
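|
| For anyone who wants to try Turbo outside a UI, a minimal
| Diffusers sketch (single-step sampling with guidance
| disabled, per the model card; "cuda" assumes an NVIDIA
| card):
|
|     import torch
|     from diffusers import AutoPipelineForText2Image
|
|     pipe = AutoPipelineForText2Image.from_pretrained(
|         "stabilityai/sdxl-turbo",
|         torch_dtype=torch.float16,
|         variant="fp16",
|     )
|     pipe.to("cuda")
|
|     # Turbo is distilled for 1-step sampling; guidance off
|     image = pipe(
|         prompt="fox in the woods wearing a hat",
|         num_inference_steps=1,
|         guidance_scale=0.0,
|     ).images[0]
|     image.save("fox.png")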
| evanjrowley wrote:
| How's support for AMD GPUs? I only saw Nvidia listed.
| skocznymroczny wrote:
| The main issue with AMD is that to get reasonable
| performance you need to use ROCm, and ROCm is only available
| on Linux. They started porting parts of ROCm to Windows, but
| it's not enough to be usable yet; that might be different in
| a few months.
| seydor wrote:
| but what's gonna happen to all those AI valuations if we all
| go offline?
| kleiba wrote:
| Sales prompt: "Young woman with blonde curls in front of a
| fantasy world background, come hither eyes, sitting with her legs
| spread, wearing a white shirt and jeans hot pants."
|
| I mean, really??
| smcleod wrote:
| Yeah that's creepy as.
| momojo wrote:
| I'm genuinely curious how many people in the open source
| community are pouring their sweat and blood into these projects
| that are, at the end of the day, enabling guys to transform
| their macbooks into insta-porn-books.
| rcoveson wrote:
| If the prompt wasn't somewhat sexual, divisive, or offensive it
| would be wide open to the chorus of "still not as good as
| midjourney/dall-e/imagen". Freedom from restriction is one of
| the main selling points.
| KolmogorovComp wrote:
| Glad I'm not the only one who found it inappropriate. Feels
| very much like a dog whistle.
| rcoveson wrote:
| What's subtle about it? In the dog whistle analogy, who are
| they who cannot hear the whistle?
|
| To me this is more like yelling "ROVER! COME HERE BOY!" at
| the top of your lungs.
| NKosmatos wrote:
| As others have stated, local AI (completely offline after
| the model/weight download) is the way to go. If I have the
| hardware, why shouldn't I be able to run all this fancy
| software on my own machine?
|
| There are many great suggestions and links to other
| similar/better packages, so follow the comments for more info,
| thanks :-)
___________________________________________________________________
(page generated 2023-12-01 23:00 UTC)