[HN Gopher] FLUX.1-Krea and the Rise of Opinionated Models
       ___________________________________________________________________
        
       FLUX.1-Krea and the Rise of Opinionated Models
        
       Author : dbreunig
       Score  : 73 points
       Date   : 2025-08-04 22:14 UTC (4 days ago)
        
 (HTM) web link (www.dbreunig.com)
 (TXT) w3m dump (www.dbreunig.com)
        
       | TheSilva wrote:
        | All but the last example look better (to me) on Krea than on
        | ChatGPT-4.1.
       | 
       | The problem with AI images, in my opinion, is not the generated
       | image (that can be better or worse) but the prompt and
       | instructions given to the AI and their "defaults".
       | 
        | So many blog posts and social media updates have that horrible
        | (again, to me) overly plastic look and feel, like a cartoon
        | that has been burned in... just like "needs more JPEG" but
        | "needs more AI-vibe".
        
         | gchadwick wrote:
         | I'd argue the last one looks better as well, at least if you're
          | considering what looks more 'real'. The ChatGPT one looks like
          | it could have been a shot from a film; the Krea one looks like
          | a photo someone took on their phone of a person heading into a
          | car park on their way back from a party dressed as a superhero
          | (which I think far better fits the vibe of the original image).
        
           | TheSilva wrote:
            | My problem with the last one is that the person is not
            | walking directly toward the door, giving it an unrealistic
            | vibe that the ChatGPT one does not have.
        
             | horsawlarway wrote:
             | Sure, it looks like he's walking toward the control panel
             | on the right of the door.
             | 
             | Personally - I think it looks considerably better than the
             | GPT image.
        
         | vunderba wrote:
          | Yeah, I see that a lot. Blog usage of AI pics seems to fall
          | into two camps:
         | 
         | 1. The image just seems to be completely unrelated to the
         | actual content of the article
         | 
         | 2. The image looks like it came out of SD 1.5 with smeared
         | text, blur, etc.
        
       | resiros wrote:
        | I look forward to the day someone trains a model that can do
        | good writing without em-dashes, "it's not X, it's Y"
        | constructions, and all the rest of the AI slop.
        
         | astrange wrote:
         | You want a base model like text-davinci-001. Instruct models
         | have most of their creativity destroyed.
        
           | Gracana wrote:
           | How do you use the base model?
        
         | 1gn15 wrote:
         | Try one of the fine-tunes from https://allura.moe/. Or use an
         | autocomplete model. Mistral and Qwen have them.
        
       | MintsJohn wrote:
       | This is what finetuning has been all about since stable diffusion
       | 1.5 and especially SDXL. And even something StabilityAI base
       | models excelled at in the open weights category. (Midjourney has
       | always been the champion, but proprietary)
       | 
        | Sadly, with SAI going effectively bankrupt, things changed:
        | their rushed 3.0 model was broken beyond repair, and the later
        | 3.5 felt unfinished (the API version is remarkably better),
        | with gens full of errors and artifacts even though the good
        | ones looked great. It turned out hard to finetune as well.
       | 
        | In the meantime Flux got released, but that model can be fried
        | (as in, one concept trained in) but not finetuned (this Krea
        | Flux is not based on the open-weights Flux). Add to that that
        | as models got bigger, training/finetuning now costs an arm and
        | a leg. So here we are: a year after Flux got released, a good
        | finetune is celebrated as the next new thing :)
        
         | vunderba wrote:
         | Agreed. From the article:
         | 
         |  _> Model builders have been mostly focused on correctness, not
         | aesthetics. Researchers have been overly focused on the extra
         | fingers problem._
         | 
         | While that _might_ be true for the foundational models - the
         | author seems to be neglecting the tens of thousands of custom
         | LoRAs to customize the look of an image.
         | 
         |  _> Users fight the "AI Look" with heavy prompting and even
         | fine-tuning_
         | 
         | IMHO it is significantly easier to fix an _aesthetic_ issue
         | than an _adherence_ issue. You can take a poor quality image,
         | use ESRGAN upscalers, img2img using it as a ControlNet, run it
         | through a different model, add LoRAs, etc.
         | 
         | I have done some nominal tests with Krea but mostly around
         | adherence. I'd be curious to know if they've reduced the
         | omnipresent bokeh / shallow depth of field given that it is
         | Flux based.
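
        For readers unfamiliar with the LoRAs mentioned above: the
        technique adds a small low-rank update to a frozen weight matrix
        instead of retraining the whole model. A minimal sketch, with
        purely illustrative shapes (none of these numbers come from a
        real diffusion model):

        ```python
        import numpy as np

        # Illustrative dimensions; real diffusion weights are far larger.
        d_out, d_in, rank = 8, 8, 2
        rng = np.random.default_rng(0)

        W = rng.standard_normal((d_out, d_in))  # frozen base weight
        A = rng.standard_normal((rank, d_in))   # trainable "down" projection
        B = np.zeros((d_out, rank))             # trainable "up" projection, zero-init
        alpha = 0.8                             # adapter strength at inference

        # Applying the adapter: W' = W + alpha * (B @ A).
        # Only A and B (rank * (d_in + d_out) values) are trained, which
        # is why a style LoRA can be trained on commodity hardware.
        W_adapted = W + alpha * (B @ A)

        # With B zero-initialized, the adapter starts as a no-op.
        assert np.allclose(W_adapted, W)
        ```

        Training then nudges only A and B toward the target style, while
        W stays untouched and shareable.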
        
           | dragonwriter wrote:
           | > Model builders have been mostly focused on correctness, not
           | aesthetics. Researchers have been overly focused on the extra
           | fingers problem.
           | 
           | > While that might be true for the foundational models
           | 
            | It's possibly true [0] of the models from the big public
            | general AI vendors (OpenAI, Google); it's definitely not
            | true of MJ. If MJ has an aesthetic bias toward what the
            | article describes as "the AI look", it is largely because
            | that was a popular, actively sought and prompted-for look
            | in early AI image gen (to avoid the flatness bias of early
            | models), and MJ leaned very hard into biasing toward what
            | was popular aesthetically in that and other areas as it
            | developed. Heck, lots of SD finetunes _actively sought_ to
            | reproduce MJ aesthetics for a while.
           | 
            | [0] but I doubt it; I think they have also been actively
            | targeting aesthetics as well as correctness, and the post
            | even hints at at least part of how that reinforced the "AI
            | look" -- the focus on aesthetics meant more reliance on the
            | LAION Aesthetics dataset to tune the models' understanding
            | of what looked good, transferring the biases of that
            | dataset into models that were trying to focus on
            | aesthetics.
        
             | vunderba wrote:
             | Definitely. It's been a while since I used midjourney, but
             | I imagine that style (and sheer speed) are probably the
             | last remaining use cases of MJ today.
        
         | dvrp wrote:
         | It is not just a fine-tune.
        
       | joshdavham wrote:
       | > Researchers have been overly focused on the extra fingers
       | problem
       | 
       | A funny consequence of this is that now it's really hard to get
       | models to intentionally generate disfigured hands (six fingers,
       | missing middle finger).
        
         | washadjeffmad wrote:
         | A casualty of how underbaked data labelling and training
         | are/were. The blindspots are glaring when you're looking for
         | them, but the decreased overhead of training LoRA now means we
         | can locally supplement a good base model on commodity hardware
         | in a matter of hours.
         | 
         | Also, there's a lot of "samehand" and hand hiding in BFL and
         | other models. Part of the reason I don't use any MaaS is how
         | hard they were focusing on manufacturing superficial
         | impressions over increasing fundamental understanding and
         | direction following. Kontext is a nice deviation, but it was
         | already achievable through captioning and model merges.
        
       | jrm4 wrote:
        | So, question -- does the author know that this post is merely
        | about "what is widely known" vs. "what is actually possible"?
        | 
        | Which is to say: if one is in the business or activity of
        | "making AI images go a certain way", a quick perusal of e.g.
        | Civitai turns up about a million solutions to the "problem" of
        | "all the AI art looks the same".
        
         | dbreunig wrote:
         | I'm aware of LoRA, Civitai, etc. I don't think they are "widely
         | known" beyond AI imagery enthusiasts.
         | 
         | Krea wrote a great post, trained the opinions in during post-
         | training (not during LoRA), and I've been noticing larger labs
         | doing similar things without discussing it (the default ChatGPT
         | comic strip is one example). So I figured I'd write it up for a
         | more general audience and ask if this is the direction we'll go
         | for qualitative tasks beyond imagery.
         | 
         | Plus, fine-tuning is called out in the post.
        
           | zamadatix wrote:
           | I don't think there is such a thing as a general audience for
            | AI imagery discussion yet, only enthusiasts. The closest
            | thing might be the subset of folks who saw that ChatGPT can
            | make an anime version of their photo and tried it out, or
            | the large number of folks who have heard the artists'
            | pushback about the tools in general but haven't actually
            | used them. They have no clue about any of the nuances
            | discussed in the article, though.
        
           | petralithic wrote:
           | AI imagery users are all enthusiasts, there aren't yet casual
           | users in a "wide" general capacity.
        
       | pwillia7 wrote:
        | Wan 2.2 is a video model people have recently been using for
        | text-to-image, and I think its base model solves this problem
        | way better than Krea does. --
       | https://www.reddit.com/r/comfyui/comments/1mf521w/wan_22_tex...
       | 
       | As others have said, you can fine-tune any model with a pretty
       | small data set of images and captions and make your generations
       | not look like 'AI' or all look the same.
       | 
       | Here's one I made a while back trained on Sony HVS HD video demos
       | from the 80s/90s --
       | https://civitai.com/models/896279/1990s-analog-hd-or-4k-sony...
        
         | mh- wrote:
         | o/t: your astrophotography LoRA is very cool, I came across it
         | before. thanks for making it!
         | 
         | (for others: https://civitai.com/models/890536/nasa-
         | astrophotography-or-f...)
        
         | dvrp wrote:
         | We've noticed that Wan 2.2 (available on Krea) + Krea 1
         | refinement yields _beautiful_ results. Check this from our
         | designer, for instance:
         | https://x.com/TitusTeatus/status/1952645026636554446
         | 
         | (Disclaimer: I am the Krea cofounder and this is based on a
         | small sample size of results I've seen).
        
           | mh- wrote:
           | _> prompts in alt_
           | 
           | First pic (blonde woman with eyes closed) has alt text that
           | begins:
           | 
           |  _> Extreme close-up portrait of a black man's face with his
           | eyes closed_
           | 
           | copypasta mistake or bad prompt adherence? haha.
        
         | petralithic wrote:
         | I don't know, those all still look like AI, as in, too clean.
        
       | dragonwriter wrote:
        | So, the one thing I notice is that in every trio of original
        | image, GPT-4.1 image, and Krea image where the author says
        | GPT-4.1 exhibits the AI look and Krea avoids it (except the
        | first, with the cat), comparing the original image to the Krea
        | image shows Krea retains all the described hallmarks of the AI
        | look present in the GPT image, just toned down a little bit.
        | (In the first, it lacks the obvious bokeh _because_ it avoids
        | showing anything at a much different distance than the main
        | subject, which is, for that aesthetic issue, what avoiding
        | showing hands is for the correctness issue of bad hands.)
        
         | demarq wrote:
         | > retains all the described hallmarks of the AI look that are
         | present in the GPT image, but just toned down a little bit
         | 
         | Not sure what you were expecting. That sounds like the model is
         | avoiding what it was built to avoid?
         | 
          | This model is not new tech, just a change in bias.
         | 
         | It's doing what it says on the can.
        
       | cirrus3 wrote:
        | I did a lot of testing with Krea. The results were certainly
        | very different from flux-dev: less "AI-like" in some ways, and
        | the details were way better, but very soft, a bit washed out,
        | and more AI-like in other ways.
       | 
       | I did a 50% mix of flux-dev-krea and flux-dev and it is my new
       | favorite base model.
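
        The 50% mix described above amounts to a linear interpolation of
        the two checkpoints' weights, tensor by tensor. A minimal sketch
        (the key names and arrays are toy stand-ins, not real Flux
        tensors):

        ```python
        import numpy as np

        def merge_state_dicts(sd_a, sd_b, t=0.5):
            """Linearly interpolate two matching state dicts: (1-t)*A + t*B."""
            assert sd_a.keys() == sd_b.keys(), "checkpoints must share layout"
            return {k: (1.0 - t) * sd_a[k] + t * sd_b[k] for k in sd_a}

        # Toy stand-ins for flux-dev and flux-dev-krea weights.
        sd_dev  = {"layer.weight": np.array([1.0, 2.0])}
        sd_krea = {"layer.weight": np.array([3.0, 6.0])}

        merged = merge_state_dicts(sd_dev, sd_krea, t=0.5)
        # → {"layer.weight": array([2., 4.])}
        ```

        Merge tooling in UIs like ComfyUI exposes essentially this
        operation per tensor, with t as the mix slider.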
        
       | dvrp wrote:
       | Hi there! Thank you for the glowing review! I'm the cofounder of
       | Krea and I'm glad you liked Sangwu's blog post. The team is
       | reading it.
       | 
        | You'll probably get a lot of replies about how this model is
        | just a fine-tune, and a potential disregard for LoRAs, as if
        | we didn't know about them. The reality is that we have
        | thousands of them running on our platform. Sadly, there's only
        | so much a LoRA or a fine-tune can do before you run into
        | issues that can't be solved until you apply more advanced
        | techniques such as curated post-training runs (including
        | reinforcement learning-based techniques such as Diffusion-
        | PPO[1]), or even large-scale pre-training.
       | 
       | -
       | 
       | [1]: https://diffusion-ppo.github.io
        
       | dang wrote:
       | Recent and related:
       | 
       |  _Releasing weights for FLUX.1 Krea_ -
       | https://news.ycombinator.com/item?id=44745555 - July 2025 (107
       | comments)
        
       ___________________________________________________________________
       (page generated 2025-08-08 23:01 UTC)