[HN Gopher] Comparing Adobe Firefly, Dalle-2, and OpenJourney
___________________________________________________________________
Comparing Adobe Firefly, Dalle-2, and OpenJourney
Author : muhammadusman
Score : 123 points
Date : 2023-06-20 17:16 UTC (5 hours ago)
(HTM) web link (blog.usmanity.com)
(TXT) w3m dump (blog.usmanity.com)
| og_kalu wrote:
 | Should be compared using Bing Image Creator (a better version of
 | DALL-E) rather than the Dalle-2 site.
| pdntspa wrote:
 | Why didn't this person include Stable Diffusion?
| qiller wrote:
 | OpenJourney is a fine-tuned SD model.
| soligern wrote:
| [flagged]
| dvt wrote:
| Adobe Firefly is actually extremely competent, especially since
| it doesn't use copyrighted images in its training set. Using
| MidJourney (which is fantastic) commercially will be a quagmire
| for the unlucky company that draws a lawsuit.
| personjerry wrote:
 | The analysis at the end seems to be lacking. From my perspective,
 | Photoshop and Midjourney come out on top in terms of aesthetics
 | and accuracy, with kouteiheika's Stable Diffusion results[0] a
 | close second. Dall-E falls far behind, which makes sense
 | considering all the work that's gone into the other systems to
 | fine-tune and build ecosystems around them.
|
| [0]: https://news.ycombinator.com/item?id=36408744
| FanaHOVA wrote:
 | I had done a similar comparison a couple of months back, but used
 | Lexica instead of DALL-E.
 |
 | It seems clear to me that Midjourney has by far the best "vibes"
 | understanding. Most models get the items right but not the
 | lighting. Firefly seems focused on realism, which makes sense for
 | a photography audience.
|
| https://twitter.com/fanahova/status/1639325389955952640?s=46...
| Skywalker13 wrote:
| And here with BlueWillow https://www.bluewillow.ai/
|
| 1:
| https://media.discordapp.net/attachments/1060989219432054835...
|
| 2:
| https://media.discordapp.net/attachments/1060989219432054835...
|
| 3:
| https://media.discordapp.net/attachments/1060989219432054835...
| kj_setup wrote:
| Seems a lot better than some of the ones in the post
| snowe2010 wrote:
 | Not sure this is a good comparison. Midjourney likes much shorter
 | prompts, and honestly they're all absolutely terrible for
 | anything that isn't photo-based. E.g., ask one to generate a word
 | bubble of the most common programming languages and it will fail
 | every time, no matter what you try. I love it for photo work, but
 | from a Photoshop tool you'd expect the ability to do other things
 | as well.
| capybara_2020 wrote:
 | Curious, since Midjourney does great art and cartoon/comic styles
 | too, not just realistic images.
 |
 | Most image AI tools are terrible with words.
 |
 | What images did you try generating with Midjourney?
| jw1224 wrote:
| That's not a fair comparison, as Midjourney is outstanding at a
| wide range of styles beyond photography.
|
 | Generating a "word bubble" is going to look terrible in every
 | major diffusion model. Producing cohesive words and writing in
 | _image_ models is still a highly specialised capability.
| abeppu wrote:
 | Is it intentional that each of the prompts is given twice in that
 | blockquote? The repetitions are joined without a space, so e.g.
 | in the 2nd example the word "centeredvalley" appears where the
 | last word of the first copy runs into the first word of the
 | second. Does that reflect what was actually given to the
 | engines, or was it a copy-paste issue introduced while putting
 | together the article? I could imagine that non-words like
 | "cornera" in the last example could throw things off.
| whatscooking wrote:
| I like how simple Firefly's images are, like something you'd want
| to work with in Photoshop. Dalle-2 looks terrible. Midjourney is
| still my favorite.
| chankstein38 wrote:
 | As someone who has spent hours playing with it in Photoshop
 | (Beta), I can say Firefly is actually pretty damned cool!
| rgbrgb wrote:
 | For those curious, I tried the same prompts with Kandinsky 2.1
 | [0]. In my experience it blends the conceptual understanding of
 | DALL-E with the higher-quality image generation of Stable
 | Diffusion. Like Midjourney, though, it injects its own style and
 | lets you get "satisfying" results from short prompts.
 |
 | The flaw with these comparisons is that you really shouldn't use
 | the same prompt with different generators. To get the best
 | results you have to play with the prompts and iterate a lot to
 | explore the latent space and find what you're looking for. The
 | first, super-long prompt looks like it's tuned for Stable
 | Diffusion, for instance. Different generators also have different
 | syntax (e.g. with Stable Diffusion you can surround a phrase with
 | parens to give it extra emphasis; see the example below).
|
| [0]: https://iterate.world/s/clj4n19u20000jv08iqygiaqw
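 | To illustrate the syntax differences: in AUTOMATIC1111-style
 | Stable Diffusion prompts (a hypothetical example), parentheses
 | multiply a phrase's attention weight:
 |
 |     a cabin in the woods, ((golden hour)), (volumetric fog:1.3)
 |
 | Each bare pair of parens boosts the enclosed phrase by roughly
 | 1.1x, and the colon form sets an explicit weight; Midjourney and
 | DALL-E would simply read those characters as literal text.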
| SoKamil wrote:
 | Can we appreciate how well that lightbox works on this site in a
 | mobile browser, especially Safari? The gestures are smooth and
 | don't trigger any quirks like the unintended pull-to-refresh
 | gesture.
| muhammadusman wrote:
| Author here: I updated the post to include the generated results
| from Stable Diffusion and Midjourney (thanks to kouteiheika and
| mdorazio).
| cainxinth wrote:
 | Amazing how quickly Dalle-2 went from among the best image
 | generators to among the worst.
| capybara_2020 wrote:
| It might be a case of them seeing way more potential with LLMs
| compared to image generation.
| gwern wrote:
 | The stagnation has been very curious. They are part of a large &
 | generally competent org which has otherwise remained far ahead of
 | the competition with products like GPT-4. Except... for DALL-E 2,
 | which did not just stagnate for over a year (on top of its
 | bizarre blind spots, like garbage anime generation) but actually
 | seemed to get _worse_. They have an experimental model of some
 | sort that some people have access to, but even there, it's
 | nothing to write home about compared to the best models like
 | Parti or eDiff-I.
| sebzim4500 wrote:
| I think they just don't care very much about DALL-E.
|
 | Which is fair enough: when you are a (relatively) small
 | company competing with the likes of Google and Meta, you
 | really need to focus.
| og_kalu wrote:
 | Nobody is able to use Parti or eDiff-I. Compared to models you
 | can actually use, the experimental Dall-e / Bing Image Creator
 | is second only to Midjourney in my experience.
| Sharlin wrote:
| I haven't tried those two, but I'd be surprised if they
| were better than Stable Diffusion. Which is free, runnable
| (and trainable!) locally, and already has a large ecosystem
| of frontends, tweaks and customized models.
| og_kalu wrote:
 | Believe me, I know all about SD's possible customization
 | and tweaks.
 |
 | I would still easily put both ahead of the base models.
 | You won't match the quality of those models without
 | fine-tuning, and when you do fine-tune, it'll be for a
 | particular aesthetic and you won't match them in terms of
 | prompt understanding and adherence.
| TeMPOraL wrote:
| I suspect that they consider txt2img to be more of a
| curiosity now. Sure, it's transformative; it's going to upend
| whole markets (and make some people a lot of money in the
| process) - however, it's _just_ producing images. Contrast
 | with LLMs, which have already proven to be generally
 | applicable in a great many domains, and which, if you squint,
 | are probably capturing the basic mechanisms of _thinking_. OpenAI
| lost the lead in txt2img, but GPT-4 is still way ahead of
| every other LLM. It makes sense for them to focus pretty much
| 100% on that.
| og_kalu wrote:
 | Dall-e experimental (Bing Image Creator) is very good. I only
 | prefer Midjourney to it.
| hathym wrote:
 | ChatGPT next...
| ralusek wrote:
| Dall-E 2 was almost immediately displaced by MidJourney.
| Nothing comes close to even GPT 3.5 at the moment.
| sebzim4500 wrote:
| Anthropic's models are better than GPT 3.5 in my opinion.
| denverllc wrote:
| Why innovate when you can regulate?
| flangola7 wrote:
| https://time.com/6288245/openai-eu-lobbying-ai-act/
| Applejinx wrote:
| I don't know, what I saw in there (particularly with the
| haunted house) was a far broader POTENTIAL RANGE of outputs. I
| get that they were cheesier outputs, but it seems to me that
| those outputs were just as capable of coming from the other
| 'AIs'... if you let them.
|
| It's like each of these has a hidden giant pile of negative
| prompts, or additional positive prompts, that greatly narrow
| down the range of output. There are contexts where the Dall-E
| 'spoopy haunted house ooooo!' imagery would be exactly right...
| like 'show me halloweeny stock art'.
|
| That haunted house prompt didn't explicitly SAY 'oh, also make
| it look like it's a photo out of a movie and make it look
| fantastic'. But something in the more 'competitive' AIs knew to
 | go for that. So if you wanted to go for the spoopy, cheesy
 | 'collective unconscious' imagery, would you have to force the
 | more sophisticated AIs to go against their hidden requirements?
|
 | Mind you, if you added 'halloween postcard from out of a cheesy
 | old store' and suddenly the other ones were doing that vibe six
 | times better, I'd immediately concede they were in fact that
 | much smarter. I've seen that before, too, in different Stable
 | Diffusion models. I'm just saying that the consistency of
 | output in the 'smarter' ones can also represent a thumb on the
 | scale.
|
| They've got to compete by looking sophisticated, so the 'Greg
| Rutkowskification' effect will kick in: you show off by picking
| a flashy style to depict rather than going for something
| equally valid, but less commercial.
| jsnell wrote:
 | It's not just about the haunted house. Just look closely at the
 | DALLE-2 living room pictures. None of it makes any
 | sense. And we're not even talking about subtle details: all of
 | the first three pictures have a central object that the eye
 | should be drawn to which is just a total mess. (The table
 | being subsumed by a bunch of melting brown chairs in
 | the first one, the I-don't-even-know-what that seems to be
 | the second picture, and the whatever-this-is on the blue
 | carpet.)
| kouteiheika wrote:
 | For reference, here's what you can get with a properly tweaked
 | Stable Diffusion, all running locally on my PC. It can be set up
 | on almost any PC with a mid-range GPU in a few minutes if you
 | know what you're doing (a minimal example follows below). I
 | didn't do any cherry-picking; this is the first thing it
 | generated, 4 images per prompt.
|
| 1st prompt: https://i.postimg.cc/T3nZ9bQy/1st.png
|
| 2nd prompt: https://i.postimg.cc/XNFm3dSs/2nd.png
|
| 3rd prompt: https://i.postimg.cc/c1bCyqWR/3rd.png
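 | For anyone wanting to reproduce a basic local setup, a minimal
 | text-to-image run with the Hugging Face diffusers library looks
 | roughly like this (a sketch, assuming a CUDA GPU; the model ID
 | here is the stock SD 1.5 repo, not the tuned checkpoint used for
 | the images above):
 |
 |     import torch
 |     from diffusers import StableDiffusionPipeline
 |
 |     # Stock SD 1.5; swap in a community checkpoint for better results.
 |     pipe = StableDiffusionPipeline.from_pretrained(
 |         "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
 |     ).to("cuda")
 |
 |     images = pipe(
 |         "a cozy living room, warm lighting, highly detailed",
 |         num_images_per_prompt=4,  # 4 images per prompt, as above
 |         num_inference_steps=30,
 |     ).images
 |     for i, img in enumerate(images):
 |         img.save(f"out_{i}.png")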
| senko wrote:
| I am sure you're right, but "if you know what you're doing"
| does a lot of heavy lifting here.
|
| We could just as easily say "hosting your own email can be set
| up in a few minutes if you know what you're doing". I could do
| that, but I couldn't get local SD to generate comparable images
| if my life depended on it.
| caseyf wrote:
 | If you have an Apple device, there is a free GUI for Stable
 | Diffusion called "Draw Things". It is nice and it just works:
 | https://apps.apple.com/us/app/6444050820
 |
 | Screenshot of the options interface:
 | https://stash.cass.xyz/drawthings-1687292611.png
| muhammadusman wrote:
 | Thanks for doing this. I'd like to include these in the
 | blog post as well. Can I use them and credit you for them?
 | (Let me know what you'd like linked.)
| kouteiheika wrote:
| Sure. No need to credit me.
| muhammadusman wrote:
| thanks, updated the post with your results as well :)
| [deleted]
| ewjt wrote:
| Can you elaborate on "properly tweaked"? When I use one of the
| Stable Diffusion and AUTOMATIC1111 templates on runpod.io, the
| results are absolutely worthless.
|
| This is using some of the popular prompts you can find on sites
| like prompthero that show amazing examples.
|
 | It's been a serious expectation-vs.-reality disappointment for
 | me, so I just pay the Midjourney or DALL-E fees.
| capybara_2020 wrote:
 | First off, are you using a custom model or the default SD
 | model? The default model is not the greatest. Have you tried
 | ControlNet?
 |
 | But yes, SD can be a bit of a pain to use. Think of it like
 | this: SD = Linux, Midjourney = Windows/macOS. SD is more
 | powerful and user-controllable, but that also means it has a
 | steeper learning curve.
| orbital-decay wrote:
 | Are you using txt2img with the vanilla model? SD's actual
 | value is in its large array of higher-order input methods and
 | tooling; as a tradeoff, it requires more knowledge. Similarly
 | to 3D CGI, it's a highly technical area. You don't just enter
 | a prompt.
 |
 | You can fine-tune it on your own material, or choose one of
 | the hundreds of public fine-tuned models. You can guide it
 | precisely with a sketch, or by extracting a pose from a
 | photo using ControlNets or another method (sketched below).
 | You can influence the colors. You can explicitly separate
 | prompt parts so the tokens don't leak into each other. You can
 | use it as a photobashing tool with a plugin for popular image
 | editing software. Things like ComfyUI enable extremely
 | complicated pipelines as well, and so on.
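 | As a rough sketch of that pose-extraction workflow, using the
 | diffusers and controlnet_aux libraries (the model IDs are the
 | public ControlNet releases; the reference photo path and prompt
 | are placeholders):
 |
 |     import torch
 |     from controlnet_aux import OpenposeDetector
 |     from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
 |     from diffusers.utils import load_image
 |
 |     # Extract a pose skeleton from a reference photo.
 |     openpose = OpenposeDetector.from_pretrained("lllyasviel/ControlNet")
 |     pose = openpose(load_image("reference_photo.png"))
 |
 |     # Condition generation on that pose via a ControlNet.
 |     controlnet = ControlNetModel.from_pretrained(
 |         "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
 |     )
 |     pipe = StableDiffusionControlNetPipeline.from_pretrained(
 |         "runwayml/stable-diffusion-v1-5",
 |         controlnet=controlnet,
 |         torch_dtype=torch.float16,
 |     ).to("cuda")
 |
 |     image = pipe("an astronaut dancing on the moon", image=pose).images[0]
 |     image.save("posed.png")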
| nomand wrote:
| Is there a coherent resource (not a scattered 'just google
| it' series of guides from all over the place) that
| encapsulates some of the concepts and workflows you're
| describing? What would be the best learning site/resource
| for arriving at understanding how to integrate and
| manipulate SD with precision like that? Thanks
| kouteiheika wrote:
| > What would be the best learning site/resource for
| arriving at understanding how to integrate and manipulate
| SD with precision like that?
|
| Honestly? Probably YouTube tutorials.
| TeMPOraL wrote:
| Jaysus.
|
| I'm going to sound like an entitled whiny old guy
 | shouting at clouds, but - what the hell - with all the
 | knowledge either locked up and churned on Discord, or
 | released in the form of YouTube videos with no transcripts and
 | extremely low content density - how is anyone with a job
| supposed to keep up with this? Or is that a new form of
| gatekeeping - if you can't afford to burn a lot of time
| and attention as if in some kind of Proof of Work scheme,
| you're not allowed to play with the newest toys?
|
| I mean, Discord I can sort of get - chit-chatting and
| shitposting is easier than writing articles or
| maintaining wikis, and it kind of grows organically from
| there. But YouTube? Surely making a video takes 10-100x
| the effort and cost, compared to writing an article with
| some screenshots, while also being 10x more costly to
| consume (in terms of wasted time and strained attention).
| How does _that_ even work?
| bavell wrote:
 | I've been playing with SD for a few months now and have
 | only watched 20-30 minutes of YT videos about it. There are
 | only a few worth spending any time watching, and they're on
 | specific workflows or techniques.
 |
 | Best just to dive in if you're interested, IMO. Otherwise
 | you'll get lost in all the new jargon and ideas. A great
 | place to start is the A1111 repo; lots of community
 | resources available and batteries included.
| orbital-decay wrote:
| How does anyone keep up with anything? It's a visual
| thing. A lot of people are learning drawing, modeling,
| animation etc in the exact same way - by watching YouTube
| (a bit) and experimenting (a lot).
| TeMPOraL wrote:
 | Picking images from generated sets is a visual thing.
 | Tweaking ControlNet might be too (IDK, I never got a
 | chance to use it - partly because of what I'm whining
 | about here). However, writing prompts, fine-tuning
 | models, assembling pipelines, renting GPUs, figuring out
 | which software to use for what, where to get the weights,
 | etc. - _none_ of this is visual. It's pretty much
 | programming and devops.
|
| I can't see how covering this on YouTube, instead of (vs.
| in addition to) writing text + some screenshots and
| diagrams, makes any kind of sense.
| kouteiheika wrote:
| I mostly agree, but in this case it can be genuinely
| useful to actually _see_ the process of someone using the
| tool effectively.
| sorenjan wrote:
 | Take a moment and go scroll through the examples at
 | civitai.com. Do most of them strike you as something made by
 | people with jobs? Most of them are pretty juvenile, with
 | pretty women and various anime girls.
| sebzim4500 wrote:
| Are you under the impression that people with jobs don't
| like pretty women and anime girls?
| kaitai wrote:
 | The operative word here is "people": the set "people with
 | jobs" contains a far higher fraction of folks who like
 | attractive men than is represented here.
| sorenjan wrote:
| Of course not, but it looks like a teenage boy's room.
| bavell wrote:
| ComfyUI is a nice complement to A1111, the node-based
| editor is great for prototyping and saving workflows.
| kouteiheika wrote:
| > Can you elaborate on "properly tweaked"?
|
| In a nutshell:
|
 | 1. Use a good checkpoint. Vanilla Stable Diffusion is
 | relatively bad. There are plenty of good ones on civitai.
 | Here's mine: https://civitai.com/models/94176
 |
 | 2. Use a good negative prompt with good textual inversions
 | (e.g. "ng_deepnegative_v1_75t", "verybadimagenegative_v1.3",
 | etc.; you can download those from civitai too). Even if you
 | have a good checkpoint, this is essential for good results.
 |
 | 3. Use a better sampling method instead of the default one
 | (e.g. I like to use "DPM++ SDE Karras").
 |
 | There are more tricks to get even better output (e.g.
 | ControlNet is amazing), but these are the basics; a rough code
 | sketch follows below.
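 | Roughly what those three steps look like in code, with the
 | diffusers library (a sketch; the checkpoint and embedding paths
 | are placeholders, and the checkpoint is assumed to have been
 | converted to diffusers format):
 |
 |     import torch
 |     from diffusers import DPMSolverSDEScheduler, StableDiffusionPipeline
 |
 |     # 1. A good checkpoint instead of vanilla SD (placeholder path).
 |     pipe = StableDiffusionPipeline.from_pretrained(
 |         "path/to/custom-checkpoint", torch_dtype=torch.float16
 |     ).to("cuda")
 |
 |     # 2. A negative textual-inversion embedding, referenced by its
 |     #    token inside the negative prompt (placeholder path).
 |     pipe.load_textual_inversion(
 |         "path/to/ng_deepnegative_v1_75t.pt", token="ng_deepnegative_v1_75t"
 |     )
 |
 |     # 3. A better sampler: DPM++ SDE with Karras sigmas.
 |     pipe.scheduler = DPMSolverSDEScheduler.from_config(
 |         pipe.scheduler.config, use_karras_sigmas=True
 |     )
 |
 |     image = pipe(
 |         prompt="a haunted house on a hill, cinematic lighting",
 |         negative_prompt="ng_deepnegative_v1_75t, lowres, blurry",
 |         num_inference_steps=30,
 |     ).images[0]
 |     image.save("tweaked.png")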
| renewiltord wrote:
| Thank you. I assume there's some community somewhere where
| people discuss this stuff. Do you know where that is? Or
| did you just learn this from disparate sources?
| kouteiheika wrote:
| > I assume there's some community somewhere where people
| discuss this stuff. Do you know where that is? Or did you
| just learn this from disparate sources?
|
| I learned this mostly by experimenting + browsing civitai
| and seeing what works + googling as I go + watching a few
| tutorials on YouTube (e.g. inpainting or controlnet can
| be tricky as there are a lot of options and it's not
| really obvious how/when to use them, so it's nice to
| actually watch someone else use them effectively).
|
| I don't really have any particular place I could
| recommend to discuss this stuff, but I suppose
| /r/StableDiffusion/ on Reddit is decent.
| bavell wrote:
 | There's a pretty good Reddit community, and lots of (N/SFW)
 | models and content on CivitAI. It took me a weekend to get set
 | up and generating images. I've been getting good results on my
 | AMD 6750XT with A1111 (vladmandic's fork).
| Lerc wrote:
 | What kind of (and how much) data did you use to train your
 | checkpoint?
 |
 | I'd like to have a go at making one myself, targeted towards
 | single objects (be it a car, spaceship, dinner plate, apple,
 | octopus, etc.). Most checkpoints lean very heavily towards
 | people and portraits.
| og_kalu wrote:
 | You're not going to get even close to Midjourney or even Bing
 | quality on SD without fine-tuning. It's that simple. And when
 | you do fine-tune, it will be restricted to that aesthetic, and
 | you won't get the same prompt understanding or adherence.
 |
 | For all the promise of control and customization SD boasts,
 | Midjourney beats it hands down in sheer quality. There's a
 | reason something like 99% of AI art comic creators stick to
 | Midjourney despite the control handicap.
| SV_BubbleTime wrote:
 | You load a model and have 6 sliders instead of one... it's
 | not exactly "fine-tuning".
 |
 | If you want the power, it's there. But nearly bone-stock SD
 | in Auto1111 is going to get to any of these examples
 | easily.
 |
 | Show me the civitai equivalent for MJ or Dalle-2. It doesn't
 | exist.
| og_kalu wrote:
 | > You load a model and have 6 sliders instead of one...
 | it's not exactly "fine-tuning".
 |
 | OK...? Read what I wrote carefully. Your 6 sliders won't
 | produce better images than Midjourney for your prompt on
 | the base SD model.
| chankstein38 wrote:
| I feel like people shouldn't talk in definitives if their
| message is just going to demonstrate they have no idea what
| they're talking about.
| og_kalu wrote:
 | I know what I'm talking about, lol. I tuned a custom SD
 | model that's downloaded thousands of times a month. I'm
 | speaking from experience more than anything. I don't know
 | why some SD users get so defensive.
| orbital-decay wrote:
 | Yet you are posting this in a thread where GP provided
 | actual examples of the opposite. Look for another comment
 | above/below: there are MJ-generated samples which are
 | comparable to, but also less coherent than, the results from
 | a much smaller SD model. And in MJ's case the hallucinations
 | cannot be fixed. MJ is good, but it isn't magic; it just
 | provides quick results with little experience required.
 | Prompt understanding is still poor, and will stay poor
 | until it's paired with a good LLM.
 |
 | None of the existing models gives actually passable
 | production-quality results, be it MJ or SD or whatever
 | else. It will be quite some time until they get out of the
 | uncanny valley.
|
| > There's a reason like 99% of ai art comic creators stick
| to Midjourney
|
| They aren't. MJ is mostly used by people without
| experience, think a journalist who needs a picture for an
| article. Which is great and it's what makes them good
| money.
|
| As a matter of fact (I work with artists), for all the
| surface-visible hate AI art gets in the artist community,
| many actual artists are using it more and more to automate
| certain mundane parts of their job to save time, and this
| is _not_ MJ or Dall-E.
| Miraste wrote:
| There's a distinction to be made here. Everything that
| makes SD a powerful tool is the result of being open
| source. The actual models are significantly worse than
| Midjourney. If an MJ level model had the tooling SD does
| it would produce far better results.
| bavell wrote:
| > If an MJ level model had the tooling SD does it would
| produce far better results
|
| And vice versa, which is the exciting part to me - only a
| matter of time!
| whywhywhywhy wrote:
| Midjourney output all has the same look to it.
|
 | If you're OK with basic aesthetics it'll work, but if you
 | want something a bit less cringe, or that will stand out
 | in marketing, it won't cut it.
| Miraste wrote:
| It only has the same look if it's not given any style
| keywords. I've been impressed with the output diversity
| once it's told what to do. It can handle a wide range of
| art styles.
| og_kalu wrote:
| >Yet you are posting this in a thread where GP provided
| actual examples of the opposite.
|
 | Opposite of what? OP posts results from a tuned model.
| orbital-decay wrote:
| Opposite of this:
|
| _> For all the promise of control and customization SD
| boasts, Midjourney beats it hands down in sheer quality._
|
| The results are comparable, but MJ in this comment
| https://news.ycombinator.com/item?id=36409043
 | hallucinates more (look at the roofs in the second
 | picture). And that cannot be fixed, except perhaps by an
 | upscale making it a bit more coherent. Until MJ obtains
 | better tooling (which it might in the next iteration), it
 | won't be as powerful. I'm not even starting on complex
 | compositions, which it simply cannot do.
|
| _> OP posts results from a tuned model._
|
 | Yes, which is the first thing you should do with SD, as
 | it's a much smaller and less capable model.
| troupo wrote:
| "Just" use a "properly tweaked" something.
| jfdi wrote:
 | Nice! Would you mind sharing which Stable Diffusion model you
 | used and where you obtained it?
| kouteiheika wrote:
| I'm using my own custom trained model.
|
| Here, I've uploaded it to civitai:
| https://civitai.com/models/94176
|
| There are plenty of other good models too though.
| bavell wrote:
 | Any tips or guides you followed for training your custom
 | model? I've done a few LoRAs and textual inversions but
 | haven't gotten to my own models yet. Your results look great
 | and I'd love a little insight into how you arrived there and
 | what methods/tools you used.
| bluetidepro wrote:
 | Do you have any good tutorial links for setting up Stable
 | Diffusion locally?
| mdorazio wrote:
| Since the author didn't have access to Midjourney, here's the
| first two prompts in MJ with default settings (not upscaled):
|
| https://imgur.com/a/siQG06O
|
| https://imgur.com/a/vp2oOHu
| muhammadusman wrote:
 | Thanks for sharing this. Do you mind if I include it in the
 | post? I will credit you, of course (let me know what you'd
 | like linked to).
 |
 | Update: I've edited the post to include these results as well.
| mdorazio wrote:
| Go for it! Happy to help. Let me know if you want upscales.
| gl-prod wrote:
| something something AI generated cannot be copyrighted [/s]
| poniko wrote:
 | Midjourney is still so far ahead that it's no competition. I
 | did a lot of testing today, and Firefly generated so many errors
 | with fingers and the like; I haven't seen that since the
 | original Stability release. Does anyone know if the web Firefly
 | and the Photoshop version are the same model?
| jsheard wrote:
 | It's worth noting the difference in how the training material
 | is sourced, though: Midjourney uses indiscriminate web
 | scrapes, while Firefly takes the conservative approach of
 | only using images that Adobe holds a license for. Midjourney
 | has the Sword of Damocles hanging over its head: depending
 | on how legal precedent shakes out, its output might end up
 | being too tainted for commercial purposes. Adobe is betting
 | on being the safe alternative during the period of uncertainty,
 | and in case the hammer does come down on web-scraping models.
| rafark wrote:
 | Would Midjourney be liable, though? I mean, you can create
 | copyrighted material using Photoshop too (even Paint!).
 |
 | If I create a Mickey Mouse using Photoshop, would Adobe be
 | liable for it?
| jsheard wrote:
 | I don't think it really matters whether or not Midjourney
 | themselves are liable; the output of their model being
 | legally radioactive would break their business model either
 | way. They make money by charging users for commercial-use
 | rights to their generations, but a judgement that the
 | generations are uncopyrightable, or outright infringing on
 | others' copyright, would make the service effectively useless
 | for the kinds of users who want commercial-use rights.
| ignite wrote:
| If midjourney could count fingers, I'd be thrilled!
| jrm4 wrote:
 | I'm presuming you're not including Stable Diffusion when you
 | say this; the fact that SD and its variants are de facto
 | _extremely_ "free and open source" presently puts it way ahead
 | of anything else, and is likely to do so for some time.
| mettamage wrote:
 | Not with typography, though, haha. It can't spell. I had to
 | draw the letters myself.
| jamilton wrote:
| None of these can do text well. There's a model that does do
| text and composition well, but the name escapes me. And the
| general quality is much lower overall, so it's a pretty heavy
| tradeoff.
| kouteiheika wrote:
| DeepFloyd?
|
| https://github.com/deep-floyd/IF
| throwaway20222 wrote:
 | I believe this is at least one solution, and one that the
 | folks at Stability themselves were pushing hard as a next
 | step forward in the development of image models.
| [deleted]
| dahwolf wrote:
| I'm glad it's not just me getting unusable garbage out of Dall-E
| and glorious results from MidJourney.
| senko wrote:
 | For comparison, these were generated using the Stability.ai API:
 | https://postimg.cc/gallery/MQfkgP7/ce388adf
 |
 | I used the stable-diffusion-xl-beta-v2-2-2 model, copy-pasted
 | the prompts from the blog post, one shot per prompt. I chose
 | style presets that closely matched each prompt (added as
 | suffixes in the image filenames); a sketch of such an API call
 | is below.
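 | For reference, a text-to-image call against that API looks
 | roughly like this (a sketch; the engine ID is the one named
 | above, STABILITY_API_KEY is assumed to be set, and the prompt
 | and style preset are illustrative):
 |
 |     import base64, os, requests
 |
 |     ENGINE = "stable-diffusion-xl-beta-v2-2-2"
 |     resp = requests.post(
 |         f"https://api.stability.ai/v1/generation/{ENGINE}/text-to-image",
 |         headers={
 |             "Authorization": f"Bearer {os.environ['STABILITY_API_KEY']}",
 |             "Accept": "application/json",
 |         },
 |         json={
 |             "text_prompts": [{"text": "a cozy living room, warm lighting"}],
 |             "style_preset": "photographic",  # optional style preset
 |             "cfg_scale": 7,
 |             "width": 512,
 |             "height": 512,
 |             "steps": 30,
 |             "samples": 1,
 |         },
 |         timeout=120,
 |     )
 |     resp.raise_for_status()
 |     # The v1 API returns generated images as base64-encoded artifacts.
 |     for i, art in enumerate(resp.json()["artifacts"]):
 |         with open(f"out_{i}.png", "wb") as f:
 |             f.write(base64.b64decode(art["base64"]))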
| mdorazio wrote:
| Kind of strange to me that they didn't test any prompts with
| people in them. In my experience that tends to show the
| limitations of various models pretty quickly.
| usaar333 wrote:
 | Lighting also tends to be pretty bad in complex scenes. I find
 | unrealistic shadows tend to break the photorealism of scenes
 | with few light sources.
| [deleted]
| MediumD wrote:
 | *Shameless plug*
 |
 | If you want to play around with OpenJourney (or any other fine-
 | tuned Stable Diffusion model), I made my own UI with a free tier
 | at https://happyaccidents.ai/.
 |
 | It supports all open-source fine-tuned models & LoRAs, and I
 | recently added ControlNet.
| theobromananda wrote:
 | All three of these are horrible; running Stable Diffusion
 | locally produces far better results, as seen in this
 | comment section.
| fumar wrote:
 | MidJourney produces more consistent and usable results. I run
 | SD and also pay for MJ. I've tried several checkpoints
 | and LoRAs, but the output is often disappointing or follows
 | the prompts incorrectly.
| throwaway742 wrote:
 | My result for prompt 2, using the Dreamshaper Stable Diffusion
 | model.
|
| https://i.imgur.com/ipnf3f5.png
___________________________________________________________________
(page generated 2023-06-20 23:01 UTC)