[HN Gopher] FLUX.1-Krea and the Rise of Opinionated Models
___________________________________________________________________
FLUX.1-Krea and the Rise of Opinionated Models
Author : dbreunig
Score : 73 points
Date : 2025-08-04 22:14 UTC (4 days ago)
(HTM) web link (www.dbreunig.com)
(TXT) w3m dump (www.dbreunig.com)
| TheSilva wrote:
| All but the last example look better (to me) from Krea than from
| ChatGPT-4.1.
|
| The problem with AI images, in my opinion, is not the generated
| image itself (that can be better or worse) but the prompt and
| instructions given to the AI and their "defaults".
|
| So many blog posts and social media updates have that horrible
| (again, to me) overly plastic look and feel, like a cartoon that
| has been burned in -- just like "needs more JPEG", but "needs
| more AI-vibe".
| gchadwick wrote:
| I'd argue the last one looks better as well, at least if you're
| considering what looks more 'real'. The ChatGPT one looks like it
| could have been a shot from a film; the Krea one looks like a
| photo someone took on their phone of a person heading into a car
| park on their way back from a party, dressed as a superhero
| (which I think far better fits the vibe of the original image).
| TheSilva wrote:
| My problem with the last one is that the person is not walking
| directly toward the door, which gives it an unrealistic vibe
| that the ChatGPT one does not have.
| horsawlarway wrote:
| Sure, it looks like he's walking toward the control panel
| on the right of the door.
|
| Personally - I think it looks considerably better than the
| GPT image.
| vunderba wrote:
| Yeah, I see that a lot. Blog usage of AI pics seems to fall into
| two camps:
|
| 1. The image just seems to be completely unrelated to the
| actual content of the article
|
| 2. The image looks like it came out of SD 1.5 with smeared
| text, blur, etc.
| resiros wrote:
| I look forward to the day someone trains a model that can do
| good writing, without em-dashes, "it's not X, but Y"
| constructions, and all the rest of the AI slop.
| astrange wrote:
| You want a base model like text-davinci-001. Instruct models
| have most of their creativity destroyed.
| Gracana wrote:
| How do you use the base model?
| 1gn15 wrote:
| Try one of the fine-tunes from https://allura.moe/. Or use an
| autocomplete model. Mistral and Qwen have them.
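|
| A minimal sketch of completion-style prompting with a base model
| via Hugging Face transformers (the Mistral checkpoint here is
| just one example of a base, non-instruct model):
|
|     from transformers import AutoModelForCausalLM, AutoTokenizer
|
|     # Example base (non-instruct) checkpoint; any completion model works.
|     model_id = "mistralai/Mistral-7B-v0.1"
|     tok = AutoTokenizer.from_pretrained(model_id)
|     model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
|
|     # No chat template: feed raw text and the model simply continues it.
|     prompt = "The rain had stopped by the time she reached the harbor,"
|     inputs = tok(prompt, return_tensors="pt").to(model.device)
|     out = model.generate(**inputs, max_new_tokens=120,
|                          do_sample=True, temperature=0.9)
|     print(tok.decode(out[0], skip_special_tokens=True))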
| MintsJohn wrote:
| This is what finetuning has been all about since stable diffusion
| 1.5 and especially SDXL. And even something StabilityAI base
| models excelled at in the open weights category. (Midjourney has
| always been the champion, but proprietary)
|
| Sadly with SAI going effectively bankrupt things changed, their
| rushed 3.0 model was broken beyond repair and the later 3.5 just
| unfinished or something (the api version is remarkably better),
| gens full of errors and artifacts even though the good ones
| looked great. It turned out hard to finetune as well.
|
| In the mean time flux got released, but that model can be fried
| (as in one concept trained in) but not finetuned (this krea flux
| is not based on the open weights flux). Add to that that as
| models got bigger training/finetuning now costs an arm and a leg,
| so here we are, a year after flux got released a good finetune is
| celebrated as the next new thing :)
| vunderba wrote:
| Agreed. From the article:
|
| _> Model builders have been mostly focused on correctness, not
| aesthetics. Researchers have been overly focused on the extra
| fingers problem._
|
| While that _might_ be true for the foundational models, the
| author seems to be neglecting the tens of thousands of custom
| LoRAs available to customize the look of an image.
|
| _> Users fight the "AI Look" with heavy prompting and even
| fine-tuning_
|
| IMHO it is significantly easier to fix an _aesthetic_ issue than
| an _adherence_ issue. You can take a poor-quality image, run it
| through ESRGAN upscalers, do img2img with it as a ControlNet
| input, run it through a different model, add LoRAs, etc.
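|
| A minimal sketch of just the img2img + LoRA leg of that rescue
| pass, using diffusers (the checkpoint is real; the LoRA path and
| file names are placeholders):
|
|     import torch
|     from diffusers import StableDiffusionXLImg2ImgPipeline
|     from diffusers.utils import load_image
|
|     pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
|         "stabilityai/stable-diffusion-xl-base-1.0",
|         torch_dtype=torch.float16).to("cuda")
|     pipe.load_lora_weights("path/to/style_lora.safetensors")  # placeholder
|
|     init = load_image("poor_quality_gen.png")  # placeholder
|     # Low strength keeps the composition, restyles the surface.
|     fixed = pipe(prompt="candid 35mm photo, natural light",
|                  image=init, strength=0.45).images[0]
|     fixed.save("restyled.png")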
|
| I have done some nominal tests with Krea but mostly around
| adherence. I'd be curious to know if they've reduced the
| omnipresent bokeh / shallow depth of field given that it is
| Flux based.
| dragonwriter wrote:
| > Model builders have been mostly focused on correctness, not
| aesthetics. Researchers have been overly focused on the extra
| fingers problem.
|
| > While that might be true for the foundational models
|
| It's possibly true [0] of the models from the big public
| general-AI vendors (OpenAI, Google); it's definitely not true of
| MJ. If MJ has an aesthetic bias toward what the article describes
| as "the AI look", it's largely because that was a popular,
| actively sought and prompted-for look in early AI image gen (a
| way around the flatness bias of early models), and MJ leaned very
| hard into biasing toward whatever was popular aesthetically, in
| that and other areas, as it developed. Heck, lots of SD finetunes
| _actively sought_ to reproduce MJ aesthetics for a while.
|
| [0] But I doubt it; I think they have also been actively
| targeting aesthetics as well as correctness, and the post even
| hints at part of how that reinforced the "AI look": the focus on
| aesthetics meant more reliance on the LAION-Aesthetics dataset to
| tune the models' understanding of what looked good, transferring
| that dataset's biases into models that were trying to focus on
| aesthetics.
| vunderba wrote:
| Definitely. It's been a while since I used Midjourney, but I
| imagine that style (and sheer speed) are probably its last
| remaining use cases today.
| dvrp wrote:
| It is not just a fine-tune.
| joshdavham wrote:
| > Researchers have been overly focused on the extra fingers
| problem
|
| A funny consequence of this is that now it's really hard to get
| models to intentionally generate disfigured hands (six fingers,
| missing middle finger).
| washadjeffmad wrote:
| A casualty of how underbaked data labelling and training
| are/were. The blind spots are glaring when you're looking for
| them, but the decreased overhead of training LoRAs now means we
| can locally supplement a good base model on commodity hardware
| in a matter of hours.
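|
| For a sense of scale, here's a minimal sketch, assuming the
| diffusers + peft stack, of attaching trainable LoRA adapters to
| an SDXL UNet (target module names vary by architecture; these
| are typical for SD-style attention blocks):
|
|     from diffusers import UNet2DConditionModel
|     from peft import LoraConfig
|
|     unet = UNet2DConditionModel.from_pretrained(
|         "stabilityai/stable-diffusion-xl-base-1.0", subfolder="unet")
|     unet.requires_grad_(False)  # freeze the base weights
|
|     # Typical attention projections for SD-style UNets.
|     config = LoraConfig(r=16, lora_alpha=16,
|                         target_modules=["to_q", "to_k", "to_v", "to_out.0"])
|     unet.add_adapter(config)  # only the LoRA weights are trainable now
|
|     trainable = sum(p.numel() for p in unet.parameters() if p.requires_grad)
|     print(trainable)  # a tiny fraction of the full UNet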
|
| Also, there's a lot of "samehand" and hand-hiding in BFL and
| other models. Part of the reason I don't use any MaaS is how hard
| they focus on manufacturing superficial impressions over
| increasing fundamental understanding and direction-following.
| Kontext is a nice deviation, but it was already achievable
| through captioning and model merges.
| jrm4 wrote:
| So, a question: does the author know that this post is merely
| about "what is widely known" vs. "what is actually possible"?
|
| Which is to say: if one is in the business or activity of "making
| AI images go a certain way", a quick perusal of e.g. Civitai
| turns up about a million solutions to the "problem" that "all the
| AI art looks the same".
| dbreunig wrote:
| I'm aware of LoRA, Civitai, etc. I don't think they are "widely
| known" beyond AI imagery enthusiasts.
|
| Krea wrote a great post, trained the opinions in during post-
| training (not via LoRAs), and I've been noticing larger labs
| doing similar things without discussing it (the default ChatGPT
| comic-strip style is one example). So I figured I'd write it up
| for a more general audience and ask whether this is the direction
| we'll go for qualitative tasks beyond imagery.
|
| Plus, fine-tuning is called out in the post.
| zamadatix wrote:
| I don't think there is such a thing as a general audience for
| AI imagery discussion yet, only enthusiasts. The closest thing
| might be the subset of folks who saw that ChatGPT can make an
| anime version of their photo and tried it out, or the large
| number of folks who have heard the artists' pushback about the
| tools in general but haven't actually used them. They have no
| clue about any of the nuances discussed in the article, though.
| petralithic wrote:
| AI imagery users are all enthusiasts; there aren't yet casual
| users in any "wide", general capacity.
| pwillia7 wrote:
| Wan 2.2 is a video model people have recently been using for
| text-to-image, and I think its base model solves this problem way
| better than Krea does. --
| https://www.reddit.com/r/comfyui/comments/1mf521w/wan_22_tex...
|
| As others have said, you can fine-tune any model with a pretty
| small dataset of images and captions to make your generations
| stop looking like 'AI' or all looking the same.
|
| Here's one I made a while back, trained on Sony HDVS HD video
| demos from the '80s/'90s --
| https://civitai.com/models/896279/1990s-analog-hd-or-4k-sony...
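|
| For reference, the "small dataset of images and captions" is
| usually just paired files on disk; a sketch of the common
| same-name caption convention (paths are illustrative):
|
|     from pathlib import Path
|
|     # Convention: each image sits next to a same-named .txt caption,
|     #   dataset/0001.png + dataset/0001.txt, and so on.
|     pairs = []
|     for img in sorted(Path("dataset").glob("*.png")):
|         caption = img.with_suffix(".txt").read_text().strip()
|         pairs.append({"image": str(img), "caption": caption})
|     print(f"{len(pairs)} image/caption pairs ready for the trainer")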
| mh- wrote:
| o/t: your astrophotography LoRA is very cool, I came across it
| before. thanks for making it!
|
| (for others: https://civitai.com/models/890536/nasa-
| astrophotography-or-f...)
| dvrp wrote:
| We've noticed that Wan 2.2 (available on Krea) + Krea 1
| refinement yields _beautiful_ results. Check this from our
| designer, for instance:
| https://x.com/TitusTeatus/status/1952645026636554446
|
| (Disclaimer: I am the Krea cofounder and this is based on a
| small sample size of results I've seen).
| mh- wrote:
| _> prompts in alt_
|
| First pic (blonde woman with eyes closed) has alt text that
| begins:
|
| _> Extreme close-up portrait of a black man's face with his
| eyes closed_
|
| copypasta mistake or bad prompt adherence? haha.
| petralithic wrote:
| I don't know, those all still look like AI, as in, too clean.
| dragonwriter wrote:
| So, the one thing I notice: in every trio of original image,
| GPT-4.1 image, and Krea image where the author says GPT-4.1
| exhibits the AI look and Krea avoids it (except the first, with
| the cat), comparing the original image to the Krea image shows
| that Krea retains all the described hallmarks of the AI look
| present in the GPT image, just toned down a little. (In the
| first, it lacks the obvious bokeh _because_ it avoids showing
| anything at a much different distance than the main subject --
| which is, for that aesthetic issue, what avoiding hands is for
| the correctness issue of bad hands.)
| demarq wrote:
| > retains all the described hallmarks of the AI look that are
| present in the GPT image, but just toned down a little bit
|
| Not sure what you were expecting. That sounds like the model is
| avoiding what it was built to avoid?
|
| This model is not new tech, just a change in bias.
|
| It's doing what it says on the can.
| cirrus3 wrote:
| I did a lot of testing with Krea. The results were certainly very
| different from flux-dev: less "AI-like" in some ways, and the
| details were way better, but also very soft, a bit washed out,
| and more AI-like in other ways.
|
| I did a 50% mix of flux-dev-krea and flux-dev, and it's my new
| favorite base model.
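|
| A 50% mix like that is typically just a linear average of the two
| checkpoints' weights; a minimal sketch with safetensors (file
| names are placeholders, and it assumes both models share the same
| architecture and key set):
|
|     import torch
|     from safetensors.torch import load_file, save_file
|
|     a = load_file("flux-dev.safetensors")       # placeholder path
|     b = load_file("flux-dev-krea.safetensors")  # placeholder path
|
|     # Average in float32, then cast back to the original dtype.
|     merged = {k: (a[k].float() * 0.5 + b[k].float() * 0.5).to(a[k].dtype)
|               for k in a}
|     save_file(merged, "flux-dev-krea-50.safetensors")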
| dvrp wrote:
| Hi there! Thank you for the glowing review! I'm the cofounder of
| Krea and I'm glad you liked Sangwu's blog post. The team is
| reading it.
|
| You'll probably get a lot of replies about how this model is just
| a fine-tune, and a potential disregard for LoRAs, as if we didn't
| know about them, while in reality we have thousands of them
| running on our platform. Sadly, there's only so much a LoRA or a
| fine-tune can do before you run into issues that can't be solved
| until you apply more advanced techniques, such as curated post-
| training runs (including reinforcement-learning-based techniques
| like Diffusion-PPO [1]) or even large-scale pre-training.
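|
| Very schematically, those RL-style post-training runs score the
| model's own samples with a reward and push the weights toward
| high-scoring ones. A toy reward-weighted sketch, not the actual
| Diffusion-PPO algorithm (which adds clipped importance ratios
| over the denoising steps):
|
|     import torch
|
|     def reward_weighted_step(model, sample_fn, reward_fn, opt, n=8):
|         # sample_fn returns the samples plus a differentiable
|         # per-sample training loss (e.g. a denoising loss).
|         samples, losses = sample_fn(model, n)
|         with torch.no_grad():
|             r = reward_fn(samples)                 # e.g. aesthetic scorer
|             w = (r - r.mean()) / (r.std() + 1e-6)  # advantage-style norm
|         loss = (w * losses).mean()  # upweight high-reward samples
|         opt.zero_grad()
|         loss.backward()
|         opt.step()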
|
| -
|
| [1]: https://diffusion-ppo.github.io
| dang wrote:
| Recent and related:
|
| _Releasing weights for FLUX.1 Krea_ -
| https://news.ycombinator.com/item?id=44745555 - July 2025 (107
| comments)
___________________________________________________________________
(page generated 2025-08-08 23:01 UTC)