[HN Gopher] Stable Diffusion XL 1.0
___________________________________________________________________
Stable Diffusion XL 1.0
Author : gslin
Score : 228 points
Date : 2023-07-26 16:48 UTC (6 hours ago)
(HTM) web link (techcrunch.com)
(TXT) w3m dump (techcrunch.com)
| skybrian wrote:
| I tried it in dreamstudio. Like all the other image generators
| I've tried, it's rubbish at drawing a piano keyboard or an
| accordion. (Those are my tests to see if it understands the
| geometry of machines.)
|
| A couple of accordion pictures do look passable at a distance.
|
| Another test: how well does it do at drawing a woman waving a
| flag?
|
| One thing that strikes me is that it generates four images at a
| time, but there is little variety. It's a similar looking woman
| wearing a similar color and style of clothing, a similar street,
| and a large American flag. (In one case drawn wrong.) I guess if
| you want variety you have to specify it yourself?
|
| AI models seem to be getting ever better in resolution and at
| portraits.
| weird-eye-issue wrote:
| Not actually released in the API, unlike what they said.
| MasterScrat wrote:
| It'll be "released" once the model weights show up on the repo or
| in HuggingFace... for now it's "announced"
|
| It should appear here at some point, currently only the VAE was
| added:
|
| https://huggingface.co/stabilityai
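|
| If you'd rather poll than keep refreshing the page, something like
| this should work with huggingface_hub (just listing the org; not
| guessing at the final repo name):
             from huggingface_hub import list_models

             # Print everything currently published under the
             # stabilityai org; rerun until the SDXL 1.0 repos appear
             for m in list_models(author="stabilityai"):
                 print(m.modelId)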
| [deleted]
| naillo wrote:
| You get access to the weights instantly if you apply for them.
| It's basically not a hurdle.
|
| (I've been having fun with this for a few days.
| https://huggingface.co/stabilityai/stable-diffusion-xl-base-...
| Not sure there's much of a difference with the 1.0 version.)
| MasterScrat wrote:
| For 1.0? Where do you apply? Or are you talking about 0.9?
| Ukv wrote:
| The ones you can apply for access to are the 0.9 weights,
| which have been available for a couple of weeks. Unless the
| SDXL 1.0 weights are also available by application somewhere
| that I'm unaware of.
| taminka wrote:
| https://huggingface.co/stabilityai/stable-diffusion-xl-
| base-... :)
| nickthegreek wrote:
| It does appear to be live on Clipdrop.
|
| https://clipdrop.co/stable-diffusion
| [deleted]
| quartz wrote:
| Isn't this it? https://huggingface.co/stabilityai/stable-
| diffusion-xl-base-...
| MasterScrat wrote:
| Yes, it's now been released
| ftufek wrote:
| The release event is in ~30 minutes on their Discord; the
| announcement probably went out a bit early.
| thepaulthomson wrote:
| Midjourney is still going to be hard to beat imo. Comparing SD to
| MJ is a little unfair considering their applications and
| flexibility, but I do really enjoy the "out of the box"
| experience that comes with MJ.
| Der_Einzige wrote:
| Midjourney is destroyed by the ecosystem around stable
| diffusion, especially all the features and extensions in
| automatic1111. It's not even close
| WXLCKNO wrote:
| You still have to run midjourney through discord right? There
| isn't even an official API. Feels like a joke.
| jyap wrote:
| Different use case.
|
| I can run SDXL 1.0 offline from my home. I can't do this with
| Midjourney.
|
| A closed source model that doesn't have the limitation of
| running on consumer level GPUs will have certain advantages.
| starik36 wrote:
| What type of setup do you have at home? What type of GPU? MJ
| completes a pretty high quality photo in about a minute. Does
| SD compare?
| BrentOzar wrote:
| With an RTX 4090, you can crank out several images per
| minute, even at high resolutions.
| lee101 wrote:
| [dead]
| accrual wrote:
| It sounds like after the previous 0.9 version there was some
| refining done:
|
| > The refining process has produced a model that generates more
| vibrant and accurate colors, with better contrast, lighting, and
| shadows than its predecessor. The imaging process is also
| streamlined to deliver quicker results, yielding full 1-megapixel
| (1024x1024) resolution images in seconds in multiple aspect
| ratios.
|
| Sounds pretty impressive, and the sample results at the bottom of
| the page are visually excellent.
| Tenoke wrote:
| They have bots in their Discord for generating images based on
| user prompts. Those randomize some settings, compare candidate
| models, and are used for RLHF fine-tuning; that's the main source
| of refining, which will continue even after release.
| dragonwriter wrote:
| There were, IIRC, three different post-0.9 candidate models in
| parallel testing to become 1.0 recently.
| [deleted]
| latchkey wrote:
| Amazing that their examples at the bottom of the page still show
| really messed up human hands.
| k12sosse wrote:
| Hands being bad is a result of people one-shotting images; you
| need to go repaint them afterwards, I've found. But it'll do them
| great if you inpaint well.
| HelloMcFly wrote:
| I've personally observed that the drawing of hands in
| Midjourney and SD has been getting incrementally better release
| after release.
| latchkey wrote:
| That's why I'm amazed they picked images with totally borked
| up hands to put on their press release. Truth in advertising!
| k12sosse wrote:
| If they cherry-picked the examples, people would get the wrong
| idea. What I like about imagegen is that your results are really
| only bound by your patience.
| mynameisvlad wrote:
| Some of them look surprisingly correct, so it looks like
| there's been at least some progress on that front. I would
| assume these are among the best examples of many, many attempts
| so it still seems to be a ways off.
| RobotToaster wrote:
| Is this pre-censored like their other later models?
| Remmy wrote:
| Yes.
| naillo wrote:
| I've been playing with 0.9 and it can generate nude people so
| it seems not.
| AuryGlenz wrote:
| No. From what I've gathered, it was trained on human anatomy, but
| not straight-up porn. What they tried for 2.0/2.1 was way
| overdone, to the point where if I prompted "princess Zelda," the
| generation would only look mildly like her. Presumably they just
| didn't have many images of people in the training data. 1.5 and
| SDXL both work fine on that front.
|
| Fine tuners will quickly take it further, if that's what you're
| after.
| PeterStuer wrote:
| Let's see whether derived models will suffer less from the 'same
| face actor' response to every portrait prompt. It's not trivial to
| get photoreal models to avoid lookalikes without resorting to
| specific, typically celeb-based, finetunes.
| andybak wrote:
| In the meantime I've been getting good mileage out of Kandinsky -
| anyone got a good sense of how they compare?
| brucethemoose2 wrote:
| This is the first I have heard of Kandinsky. Thanks for the
| tip.
|
| SDXL is a bigger model. There are some subjective comparison
| posts with SDXL 0.9, but I can't see them since they are on X
| :/
| simbolit wrote:
| That sounds so weird, it took me a minute to understand. Go
| to nitter.net which has no login requirement and no ads, but
| all the same content that X (tmsfkat) has.
| brucethemoose2 wrote:
| Two Nitter instances failed to load it, unfortunately.
|
| And yeah, X is weird to type out too.
| tmaly wrote:
| I will wait for the automatic1111 web ui version
| Der_Einzige wrote:
| It's already supported in automatic1111 (see recent updates),
| and someone in the community will convert it to the
| automatic1111 format within minutes/hours after it's released
| on huggingface.
| SV_BubbleTime wrote:
| Sort of. IIRC (which may be unlikely), Auto1111 has the base
| model in the text-to-image tab, but if you want to use the
| refiner, that is a separate IMG2IMG step/tab, which would be a
| pain in the ass imo.
|
| The "Comfy" tool is node-based and you can string both together,
| which is nice. Although if you aren't confident in your images,
| you don't need the refiner for a bit.
| brucethemoose2 wrote:
| I think the diffusers UIs (like Invoke and VoltaML) are
| going to implement the refiner soon since HF already has a
| pipeline for it.
|
| Comfy and A1111 are based around the original SD
| StabilityAI code, but the implementation must be pretty
| similar if they could add the base model so quickly.
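|
| For reference, the diffusers pattern for chaining the two is
| roughly the sketch below (the model ids are the repos the 1.0
| weights are expected to land under, so treat it as an
| approximation):
             import torch
             from diffusers import DiffusionPipeline

             base = DiffusionPipeline.from_pretrained(
                 "stabilityai/stable-diffusion-xl-base-1.0",
                 torch_dtype=torch.float16, variant="fp16",
                 use_safetensors=True).to("cuda")
             refiner = DiffusionPipeline.from_pretrained(
                 "stabilityai/stable-diffusion-xl-refiner-1.0",
                 text_encoder_2=base.text_encoder_2, vae=base.vae,
                 torch_dtype=torch.float16, variant="fp16",
                 use_safetensors=True).to("cuda")

             prompt = "a red accordion on a wooden table, studio photo"
             # base outputs latents, refiner polishes them into the image
             latents = base(prompt=prompt, output_type="latent").images
             image = refiner(prompt=prompt, image=latents).images[0]
             image.save("out.png")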
| dragonwriter wrote:
| Work started with the SDXL 0.9 release, and for A1111 it exited
| release candidate status in the last few days.
| [deleted]
| seydor wrote:
| What's the memory usage of SDXL?
| k12sosse wrote:
| Runs great on my 10GB 3080 FTW3. ComfyUI moreso than
| auto1111.
| SV_BubbleTime wrote:
| It depends greatly on your UI and the size you are
| generating. There is no hard answer.
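|
| One way to get a hard number for your own settings is to measure
| the peak allocation directly with diffusers (a sketch; assumes the
| SDXL 1.0 base repo id):
             import torch
             from diffusers import StableDiffusionXLPipeline

             pipe = StableDiffusionXLPipeline.from_pretrained(
                 "stabilityai/stable-diffusion-xl-base-1.0",
                 torch_dtype=torch.float16, variant="fp16",
                 use_safetensors=True).to("cuda")

             torch.cuda.reset_peak_memory_stats()
             pipe("a lighthouse at dusk", height=1024, width=1024)
             peak = torch.cuda.max_memory_allocated() / 2**30
             print(f"peak VRAM: {peak:.1f} GiB")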
| andrewmunsell wrote:
| Been working fine on an 8 GB 3070 generating 1024x1024 images,
| using Comfy UI with the refiner.
| brucethemoose2 wrote:
| TBH I was hoping the community would take the opportunity to
| move to the diffusers format...
|
| You get deduplication, easy swapping of stuff like VAEs, faster
| loading, and less ambiguity about what exactly is inside a
| monolithic .safetensors file. And this all seems more important
| since SDXL is so big, and split between two models anyway.
| mt3ck wrote:
| Is there anything like this for the vector landscape?
|
| This may just be due to the iterative denoising approach a lot of
| these models take, but they only seem to work well when creating
| raster-style images.
|
| In my experience, when you ask them to create logos, shirt
| designs, or illustrations, they tend to not work as well and
| introduce a lot of artifacts, distortions, incorrect spellings,
| etc.
| orbital-decay wrote:
| If you mean raster images that look like vector art and contain
| arbitrary text and shapes, controlnets/T2I adapters do work for
| this. You could train your custom controlnet for this, too (though
| that requires some understanding).
|
| As for directly generating vector images, there's nothing yet.
| Your best bet is generating vector-looking raster and tracing
| it.
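|
| If you want to try the controlnet route, a rough diffusers sketch
| (SD 1.5 plus the public canny controlnet, since that's what exists
| today; "logo_edges.png" is a placeholder for your own edge sketch):
             import torch
             from diffusers import (StableDiffusionControlNetPipeline,
                                    ControlNetModel)
             from diffusers.utils import load_image

             edges = load_image("logo_edges.png")  # canny-edge input
             controlnet = ControlNetModel.from_pretrained(
                 "lllyasviel/sd-controlnet-canny",
                 torch_dtype=torch.float16)
             pipe = StableDiffusionControlNetPipeline.from_pretrained(
                 "runwayml/stable-diffusion-v1-5",
                 controlnet=controlnet,
                 torch_dtype=torch.float16).to("cuda")
             image = pipe("flat minimal vector-style logo, solid colors",
                          image=edges).images[0]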
| cheald wrote:
| A lot of people are having success by adding extra networks
| (LoRA is the most common) which are trained on the type of
| image you're looking for. It's still a raster image, of course,
| but you can produce images which look very much like
| rasterizations of vector images, which you can then translate
| back into SVGs in Inkscape or similar.
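|
| Loading such a LoRA with diffusers is roughly this (the repo id is
| made up; substitute a real LoRA trained on the style you want):
             import torch
             from diffusers import StableDiffusionPipeline

             pipe = StableDiffusionPipeline.from_pretrained(
                 "runwayml/stable-diffusion-v1-5",
                 torch_dtype=torch.float16).to("cuda")
             # hypothetical repo id for a flat-vector-style LoRA
             pipe.load_lora_weights("someone/flat-vector-style-lora")
             image = pipe("minimal flat vector illustration of a fox,"
                          " solid colors").images[0]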
| jrflowers wrote:
| I hope someday there's a version of this or something comparable
| to it that can run on <8gb consumer hardware. The main selling
| point of Stable Diffusion was its ability to run in that
| environment.
| naillo wrote:
| You can do this if you select the
| `pipe.enable_model_cpu_offload()` option. See this
| https://huggingface.co/stabilityai/stable-diffusion-xl-base-...
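|
| A minimal sketch of what that looks like (needs accelerate
| installed; the repo id assumes the released 1.0 name):
             import torch
             from diffusers import StableDiffusionXLPipeline

             pipe = StableDiffusionXLPipeline.from_pretrained(
                 "stabilityai/stable-diffusion-xl-base-1.0",
                 torch_dtype=torch.float16, variant="fp16",
                 use_safetensors=True)
             # instead of .to("cuda"): submodules move to the GPU
             # only while needed, trading speed for a much smaller
             # VRAM footprint
             pipe.enable_model_cpu_offload()
             image = pipe("an astronaut playing an accordion").images[0]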
| minsc_and_boo wrote:
| I feel like this is the greatest demand for LLMs at the moment
| too.
|
| It's hard to believe we're only 8 months into this industry, so
| I imagine we'll start seeing smaller footprints soon.
| simbolit wrote:
| 8 months from what point?
|
| GPT-3 is 36 months old. DALL-E is 28 months old. Even Stable
| Diffusion is like 11 months old.
| brucethemoose2 wrote:
| We already do. MLC-LLM and llama.cpp have Vulkan/OpenCL/Metal
| 3-bit implementations. Those can run Llama 7B (or maybe even
| 13B?) in 8GB.
|
| TBH devices just need more RAM for coherent output though.
| Llama 13B and 33B are so much "smarter" and more coherent
| than 7B with 3-bit quant.
| liuliu wrote:
| SDXL 0.9 runs on iPad Pro 8GiB just fine.
| JeffeFawkes wrote:
| Is this using Draw Things, or another app? Did you have to
| quantize the model first?
| liuliu wrote:
| Yeah, Draw Things. It will be submitted as soon as the SDXL
| v1.0 weights are available. A quantized model _should_ run on
| iPhones (4GiB / 6GiB models), but we haven't done that yet. So
| no, these are just typical FP16 weights on iPad.
| JeffeFawkes wrote:
| Thanks! I guess I'll stick to running it on my Macbook
| for the time being until the quantized model gets
| uploaded. What kind of performance are you seeing with
| the FP16 weights on the iPad? I've run a few SD2.0-based
| (unquantized) models on my 2020 iPad Pro but it seems
| like it gets thermally throttled after a while.
| liuliu wrote:
| There will be more info upon release. SDXL v0.9 performs
| generally the same as SD v1 / v2 at the same resolution. But
| because you tend to run it at a larger resolution, it might
| feel slower.
| capybara_2020 wrote:
| Give InvokeAI a try.
|
| https://github.com/invoke-ai/InvokeAI
|
| Edit: Spec required from the documentation
|
             You will need one of the following:
             - An NVIDIA-based graphics card with 4 GB or more VRAM
               memory. 6-8 GB of VRAM is highly recommended for
               rendering using the Stable Diffusion XL models.
             - An Apple computer with an M1 chip.
             - An AMD-based graphics card with 4 GB or more VRAM
               memory (Linux only), 6-8 GB for XL rendering.
| brucethemoose2 wrote:
| There are several papers on 4/8 bit quantization, and a few
| implementations for Vulkan/CUDA/ROCm compilation.
|
| TBH the UIs people run for SD 1.5 are pretty unoptimized.
| dragonwriter wrote:
| > I hope someday there's a version of this or something
| comparable to it that can run on <8gb consumer hardware.
|
| Someday is today: from the official announcement: "SDXL 1.0
| should work effectively on consumer GPUs with 8GB VRAM or
| readily available cloud instances."
| https://stability.ai/blog/stable-diffusion-sdxl-1-announceme...
| jamesdwilson wrote:
| Still can't draw hands correctly it looks like.
| naillo wrote:
| I can't say for 1.0, but in 0.9 hands fairly often get rendered
| perfectly. It's not always right, but it's way better than any
| earlier release (where it's usually _consistently_ wrong).
| mkaic wrote:
| The official blog post from Stability is finally up and would
| probably be a better URL to link to than the TechCrunch coverage:
| https://stability.ai/blog/stable-diffusion-sdxl-1-announceme...
| badwolf wrote:
| Can SD draw hands finally?
| GaggiX wrote:
| You can already easily generate images with good looking hands
| if you use a good custom model.
| ShamelessC wrote:
| I thought this release had been announced already? Or was that
| not 1.0? Could have sworn they released an "XL" variant a little
| while ago?
| GaggiX wrote:
| It was the research weights of the v0.9 model
| amilios wrote:
| I always wondered why the vision models don't seem to be
| following the whole "scale up as much as possible" mantra that
| has defined the language models of the past few years (to the
| same extent). Even 3.5 billion parameters is absolutely nothing
| compared to the likes of GPT-3, 3.5, 4, or even the larger open-
| source language models (e.g. LLaMA-65B). Is it just an
| engineering challenge that no one has stepped up for yet? Is it a
| matter of finding enough training data for the scaling up to make
| sense?
| brucethemoose2 wrote:
| Diffusion is relatively compute-intensive compared to
| transformer LLMs, and (in current implementations) doesn't
| quantize as well.
|
| A 70B-parameter model would be very slow and VRAM-hungry, hence
| very expensive to run.
|
| Also, image generation is more reliant on tooling surrounding
| the models than on pure text prompting. I don't think even a 300B
| model would get things quite right through text prompting alone.
| airgapstopgap wrote:
| Diffusion is more parameter-efficient and you quickly saturate
| the target fidelity, especially with some refiner cascade. It's
| a solved problem. You do not need more than maybe 4B total.
| Images are far more redundant than text.
|
| In fact, most interesting papers since Imagen show that you get
| more mileage out of scaling the text encoder part, which is, of
| course, a Transformer. This is what drives accuracy, text
| rendering, compositionality, parsing edge cases. In SD 1.5 the
| text encoder part (CLIP ViT-L/14) takes a measly 123M
| parameters.[1] In Imagen, it was T5-XXL with 4.6B [2]. I am
| interested in someone trying to use a _really strong_ encoder
| baseline - maybe from a UL2-20B - to push this tactic further.
|
| Seeing as you can throw out diffusion altogether and synthesize
| images with transformers [3], there is no reason to prioritize
| the diffusion part as such.
|
| 1. https://forums.fast.ai/t/stable-diffusion-parameter-
| budget-a...
|
| 2. https://arxiv.org/abs/2205.11487
|
| 3. https://arxiv.org/abs/2301.00704
| ShamelessC wrote:
| > Seeing as you can throw out diffusion altogether and
| synthesize images with transformers [3]
|
| That's actually how this whole party got started. DALL-E (the
| first one) was a transformer model trained on image tokens
| from an early VAE (and text tokens ofc). Researchers from
| CompVis developed VQGAN in response. OpenAI showed improved
| fidelity with guided diffusion over ImageNet (classes) and
| subsequently DALL-E 2 using pixel-space diffusion and cascaded
| upsampling. CompVis responded with Latent Diffusion, which
| used diffusion in the latent space of some new VQGANs.
|
| The paper you mention is interesting! They go back to the
| DALL-E 1 method but train two VQGANs for upsampling and
| increase the parameter count. This is faster, but only faster
| than the originally reported benchmarks, which used inferior
| sampling methods for their diffusion. I would be curious if
| they can beat some of the more recent ones, which require as
| few as 10-20 steps.
|
| They also improve on FID/CLIP scores likely by using more
| parameters. This might be a memory/time trade off though. I
| would be curious how much more VRAM their model requires
| compared to SD, MJ, Kandinsky.
|
| The same goes for using T5-XXL. You'll win FID score contests
| but no one will be able to run it without an A100 or TPU pod.
| airgapstopgap wrote:
| > The same goes for using T5-XXL
|
| Is this still true in 2023? Sure, back in the dark ages it
| seemed like an 860M model was just about the limit for a
| regular consumer, but I don't see why we wouldn't be able
| to use quantized encoders; and even 30B LLMs run okay on
| MacBooks now.
| Etherlord87 wrote:
| > Images are far more redundant than text.
|
| "A picture is worth a thousand words" - I wonder how
| (in)accurate this popular saying turned out to be? :D
| elpocko wrote:
| I'm gonna go ahead and say in 2023, one detailed picture
| (512x512) is worth about 30 words.
| SketchySeaBeast wrote:
| I guess that depends on the prompt.
| k12sosse wrote:
| Do negative prompt tokens count as words?
| naillo wrote:
| They often reference this paper as the motivation for that:
| https://arxiv.org/pdf/2203.15556.pdf I.e., training with 10x
| the data for 10x longer can yield models as good as GPT-3 but
| with fewer weights (according to the paper), and the same
| principle applies in vision.
| lacker wrote:
| I'm out of date on the image-generating side of AI, but I'd like
| to check things out. What's the best tool for image generation
| that's available on a website right now? Ie, not a model that I
| have to run locally.
| gfosco wrote:
| [flagged]
| PUSH_AX wrote:
| Midjourney right? Although, discord isn't a website I guess.
| iambateman wrote:
| Probably Midjourney, but I like Dreamstudio better.
| [deleted]
| a5huynh wrote:
| If you want to play around with Stable Diffusion XL:
| https://clipdrop.co
| dash2 wrote:
| I just tried this and the UI is very nice (better than
| dreamstudio), with nice tool integration, and image quality
| is definitely going up with each new release. You can see a
| few results at fb.com/onlyrolydog (along with a lot of other
| canine nonsense).
| esperent wrote:
| Since Clipdrop has an API, is there any way to use it with
| ComfyUI or Automatic1111 (or whatever that's called)?
| the_lonely_road wrote:
| https://playgroundai.com/create
|
| Not affiliated in any way and not very involved in the space. I
| just wanted to generate some images a few weeks ago and was
| looking for somewhere I could do that for free. The link above
| lets you do that, but I suggest you look up prompts because it's
| a lot more involved than I expected.
| aaarrm wrote:
| Any particularly useful resources for looking into prompts?
| the_lonely_road wrote:
| I used this: https://learnwithnaseem.com/best-playground-
| ai-prompts-for-a...
|
| I just took the ones I liked, deleted the words that were
| specific to that image, and kept the ones that provided the
| style. So for example, on the first one I would delete "an
| cute kitsune in florest" but would keep "colorfully fantast
| concept art". Then I just added a comma-separated list of the
| features I wanted in my picture. It took a lot more trial and
| error than I expected, and adding sentences seemed to work
| worse than just individual words. I'm sure I barely scratched
| the surface of interfacing with the tool correctly, but the
| space is moving so fast it's not the kind of thing I want to
| spend my time learning right now just to have that knowledge
| become obsolete in 6 months.
| brucethemoose2 wrote:
| This AI Horde UI has, IMO, some really good templates and
| suggestions:
|
| https://tinybots.net/artbot
| knicholes wrote:
| I've found https://firefly.adobe.com/ pretty good at composing
| images with multiple subjects. [disclaimer - I work at Adobe,
| but not in the Creative Cloud]
|
| But I wouldn't say it's the "best." Just trained on images that
| weren't taken from unconsenting artists.
| adzm wrote:
| I'm actually a big fan of firefly. It has a different kind of
| style from the others, presumably due to its training
| dataset?
| roborovskis wrote:
| https://dreamstudio.ai/
| esperent wrote:
| What models does dreamstudio use? I couldn't see how to view
| them without logging in.
| vouaobrasil wrote:
| This explosion of AI-generated imagery will result in an
| explosion of millions of fake images, obviously. Perhaps in the
| short term this is fun, but in the long term we will lose a bit
| more scarcity, which is not that great in my opinion.
|
| Isn't the best part of a meal eating after you've not had
| anything to eat for a while? Isn't the best part of a kiss that
| it quenches the pain of missing your partner?
|
| Isn't the best part of art that you haven't seen anything good in
| a while?
|
| Scarcity is an underappreciated gift to us, and the relative
| scarcity per capita is in a sense what drives us to connect with
| other people, so that we may be privileged to witness the
| occasional spark of creativity from a person, which in turn tells
| us about that person.
|
| Although that sort of viewpoint has been declining for some time
| due to the intensely capitalistic squeezing of every sort of
| human endeavor, AI brings this to a whole new level.
|
| I think if those making this software thought a bit about this,
| they might second-guess whether it is truly right to release it.
| Just a thought.
| soligern wrote:
| Enforcing artificial scarcity is idiotic and counter-progressive.
| There will be other things that remain uncommon that humans will
| continue to appreciate. This is what human progress looks like.
| Imagine someone said this when agriculture started up: "The great
| thing about fruits and vegetables is that they taste so sweet the
| few times we find them. We shouldn't grow them in bulk."
| pzo wrote:
| A lot of downvotes. I can relate to it a little bit. During the
| beginning of covid I was in SE Asia at an Airbnb that didn't have
| a laundry machine - you generally don't need one there because
| there are so many cheap per-kg laundry services around. After
| hand washing my clothes for the first month, I really appreciated
| having a laundry machine once I moved to another Airbnb that had
| one - you take some things for granted.
|
| But no, I wouldn't want to hand wash my laundry more often. For
| probably the same reason, I still prefer using a lighter over a
| flint when having a BBQ.
| [deleted]
| dwallin wrote:
| I think you have this backwards: capitalism loves scarcity.
| Scarcity is what allows for supply and demand curves and
| profit-making opportunities, even better if you can control the
| scarcity. Capitalist entities are constantly attempting to use
| laws, technology, and market power to add scarcity to places
| where it didn't previously exist.
| SV_BubbleTime wrote:
| I'm pretty sure that lots of things end up being scarce in
| totalitarian forms of government... food, for one.
| Gabriel_Martin wrote:
| Seeing the "less art needs to exist" perspective is certainly a
| first time for me on this topic.
| naillo wrote:
| The same could have been said when Photoshop or CGI tools like
| Blender replaced hand sculpting and hand painting, but I don't
| think it has been a net negative across the board (rather the
| opposite, I think).
| RcouF1uZ4gsC wrote:
| I want to appreciate your comment, but I can't.
|
| Can you please chisel it on stone tablets for me?
|
| That will really help me appreciate it.
| naillo wrote:
| Stability AI is awesome I love them
| freediver wrote:
| I am completely uninformed in this space.
|
| Would someone be kind to explain what the current state of the
| art in image generation is (how does this compare to Midjourney
| and others)?
|
| How do open source models stack up?
|
| Also what are the most common use cases for image generation?
| [deleted]
| liuliu wrote:
| SDXL 0.9 should be the state-of-the-art image generation model
| (in the open). It generates at a large 1024x1024 resolution, with
| high coherency and a good selection of styles out of the box. It
| also has reasonable text understanding compared to other models.
|
| That being said, based on the configurations of these models, we
| are far from saturating what the best model can do. The problem
| is, FID is a terrible metric for evaluating these models, so as
| with LLMs, we are a bit clueless about how to evaluate them now.
| GaggiX wrote:
| Why do you think FID is a terrible metric? What in particular
| don't you like about it?
| liuliu wrote:
| I overspoke. FID is a fine metric for observing the training
| progress of your own model, and it correlates well with some
| coherency issues of generative models. But for cross-model
| comparisons, especially between models that generally do well
| under FID, it is not discriminative enough to separate better
| from good.
| sdflhasjd wrote:
| For bland stock photos and other "general-purpose" image
| generation, DALLE-2/Bing/Adobe etc are... the okayest. SD (with
| just standard model weights) is particularly weak here because
| of the small model size.
|
| If you want to get arty, then state of the art for out-of-the-
| box typing in a prompt and clicking "generate" is probably
| MidJourney.
|
| But if you're willing to spend some more time playing around
| with the open-source tooling, community finetunes, model
| augmentations (LyCORIS, etc), SD is probably going to get you
| the farthest.
|
| > Also what are the most common use cases for image generation?
|
| By sheer number of image generations? Take a guess...
| orbital-decay wrote:
| SDXL is in roughly the same ballpark as MJ 5 quality-wise, but
| the main value is in the array of tooling immediately available
| for it, and the license. You can fine-tune it on your own
| pictures, use higher order input (not just text), and daisy-
| chain various non-imagegen models and algorithms
| (object/feature segmentation, depth detection, processing,
| subject control etc) to produce complex images, either
| procedural or one-off. It's all experimental and very
| improvised, but is starting to look like a very technical CGI
| field separate from the classic 3D CGI.
| brucethemoose2 wrote:
| Midjourney may be better for plain prompts, but Stable
| Diffusion is SOTA because of the tooling and finetuning
| surrounding it.
| hospitalJail wrote:
| Idk, Midjourney ignores prompts.
|
| For the longest time I thought it was Google Image searching
| things and doing some Photoshop to make things look like Pixar,
| because it was so bad.
___________________________________________________________________
(page generated 2023-07-26 23:01 UTC)