[HN Gopher] Tuning-Free Personalized Image Generation
       ___________________________________________________________________
        
       Tuning-Free Personalized Image Generation
        
       Author : LarsDu88
       Score  : 62 points
       Date   : 2024-07-25 15:34 UTC (7 hours ago)
        
 (HTM) web link (ai.meta.com)
 (TXT) w3m dump (ai.meta.com)
        
       | LarsDu88 wrote:
       | Up until recently, to insert yourself into an image generation
       | algorithm, you had to use a technique like Dreambooth, which
       | involves finetuning the model itself with a new mapping of the
       | subject to a rare token.
       | 
       | Meta just released and productionized a new technique that
       | doesn't require finetuning at all.
       | 
       | This enables a whole host of new possibilities... People can now
       | be inserted into scenes or outfits at will without any sort of
       | time consuming model training.
        
         | phkahler wrote:
         | This will be great for people on Instagram.
        
           | kkielhofner wrote:
           | Given how absurd Instagram/social media already is (entire
           | cottage industries of "private jet" stages in warehouses,
           | etc) it will arguably be a benefit for society when it
           | completely jumps the shark and anyone can generate over the
           | top ridiculousness in seconds.
        
         | GaggiX wrote:
         | >Up until recently, to insert yourself into an image generation
         | algorithm, you had to use a technique like Dreambooth
         | 
         | I mean, not really, you could just train a LoRA for example (it
         | doesn't require training with Dreambooth).
        
           | LarsDu88 wrote:
           | Well the point is, both LoRA and Dreambooth require fine
           | tuning the model (i.e. training)
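
The distinction in this exchange can be made concrete. LoRA still trains, but only two small low-rank matrices per layer while the base weight stays frozen. Below is a minimal sketch of the standard LoRA update W + (alpha / r) * B @ A in plain Python. The function names and the toy 2x2 numbers are assumptions for illustration, not code from the paper or from any library.

```python
# Illustrative LoRA-style low-rank update in plain Python (no ML library).
# Function names and the toy matrices are hypothetical, chosen for illustration.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_effective_weight(w, b, a, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the adapter rank."""
    r = len(a)  # A has shape (r, d_in); B has shape (d_out, r)
    delta = matmul(b, a)
    scale = alpha / r
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Frozen 2x2 base weight plus a rank-1 adapter (the only trained parameters).
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]   # shape (2, 1)
A = [[0.5, -0.5]]    # shape (1, 2)
W_eff = lora_effective_weight(W, B, A, alpha=1.0)
```

Either way a training loop has to run to find B and A for each new subject, which is the step a tuning-free method avoids.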
        
       | fxtentacle wrote:
       | I know exactly why Facebook / Meta are researching this.
       | 
       | Just imagine the possibilities for advertisers: Instead of
       | telling someone how happy they would be if only they bought your
       | expensive car, let's just spam them with AI pictures of
       | themselves sitting in said expensive car, ideally next to some
       | very attractive other people that match their dating preferences.
       | 
       | Facebook has all the data they need to create very pleasant dream
       | scenarios for you. And they have the connections to monetize
       | those dreams. Didn't the Expanse have a scene with someone
       | addicted to living in a fantasy world? I thought it was meant as
       | a warning, but this wouldn't be the first time that an elaborate
       | warning would be misunderstood as an instruction manual.
        
         | LarsDu88 wrote:
         | This is a much bigger thing than the llama3.1 release. Llama
         | 3.1 doesn't really help Meta's bottom line.
         | 
         | But content creation and ads are Meta's killer app. By having a
         | model that doesn't require finetuning, they just changed the
         | whole game.
        
           | baq wrote:
           | Where do I sign up for my personalized AI content filter bot
           | which can reliably detect ads and remove them from my
           | browser?
        
             | fxtentacle wrote:
             | I think the future will be a web browser running inside a
             | VM and then the final DOM including all referenced
             | resources goes through a filter before being rendered. That
             | way, it's impossible for the website to detect if you
             | display the ads or if you just load all necessary
             | resources for rendering but mask them out.
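
The "load everything, mask at render time" idea above can be sketched as follows. This is a rough illustration only: the blocklist hosts, the function names, and the dictionary layout are all assumptions, and real filter lists use a much richer rule syntax than a set of hostnames.

```python
from urllib.parse import urlparse

# Hypothetical blocklist; real filter lists are far larger and more expressive.
BLOCKED_HOSTS = {"ads.example.com", "tracker.example.net"}

def is_blocked(url, blocked_hosts=BLOCKED_HOSTS):
    """True if the URL's host is a blocked host or a subdomain of one."""
    host = urlparse(url).hostname or ""
    return any(host == b or host.endswith("." + b) for b in blocked_hosts)

def plan_render(resource_urls):
    """Every resource is still loaded; blocked ones are only masked at render time."""
    return [{"url": u, "load": True, "render": not is_blocked(u)}
            for u in resource_urls]

plan = plan_render([
    "https://news.example.org/article.html",
    "https://ads.example.com/banner.png",
    "https://cdn.ads.example.com/pixel.gif",
])
```

Because every request is still made, the server sees ordinary traffic; only the local render step differs.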
        
               | acchow wrote:
               | The future will be "AI PCs" with a powerful on-device
               | chip that can filter out on-screen ads, but enabled only
               | by subscription.
        
             | LarsDu88 wrote:
             | Hmmm, that'd be an interesting startup idea!
             | 
             | How do ad blockers work exactly?
        
         | strongpigeon wrote:
         | This is quite thought provoking. I can totally see ads for, say,
         | Disney World, where they put you in the picture instead of an
         | actor. I mean, the whole goal of these ads is already to have
         | you imagine yourself there. Putting you in the picture makes it
         | that much easier.
        
           | MasterScrat wrote:
           | We sell text-to-image model finetuning (aka "Dreambooth") as
           | a service and yes, this is one of the use cases.
           | 
           | Recently a travel agency used our platform to generate images
           | of people in the destinations they were advertising.
        
           | yazaddaruvala wrote:
           | lol, if it is a good enough Ad
           | 
           | Just add it to my Instagram timeline and I can skip the trip
           | and the cost. Everyone else (including me in 30 years) thinks
           | I went.
        
       | educasean wrote:
       | The future of Netflix isn't going to feature DiCaprio or Zendaya.
       | It will be you, your wife, and your friends on the screen as
       | hobbits adventuring to Mordor.
        
         | robterrell wrote:
         | Is this a common desire? I have absolutely no interest in
         | watching myself inserted into a film or TV show.
        
           | bugglebeetle wrote:
           | No, it's why we invented the phrase "main character
           | syndrome" for people who exhibit this behavior.
        
           | add-sub-mul-div wrote:
           | It _sounds_ like it would be a common desire, like when you
           | see the futuristic computer interface in Minority Report. It
           | seems cool on the surface but falls apart the minute you
           | imagine the reality of using it in practice. Your arms would
           | get tired very quickly trying to control an interface in 3D
           | space.
           | 
           | The idea that we'd someday have no more shared experience
           | around media is harrowing and thankfully the public isn't
           | actually calling for it.
        
         | jsheard wrote:
         | This hypothetical future gets brought up a lot, but would the
         | novelty of something like that really hold up for more than one
         | or two viewings? There's nothing stopping you from replacing
         | the names in an eBook with the names of people you know
         | personally, but beyond young children I can't see anyone
         | actually being enamored by that.
        
         | buffington wrote:
         | While this may have some appeal, I think it'll be similar to
         | the fake Time magazine covers that made it look like someone
         | you knew was named Time's person of the year. Good for a
         | chuckle, but not much more.
         | 
         | I think applying the same idea to video games makes more sense,
         | especially given the autonomy you have in a video game, but
         | even then, the appeal wears off pretty quickly.
         | 
         | Games have had features that allow you to put your likeness in
         | the game before, and that feature probably isn't what people
         | were buying the game for. Tony Hawk's Pro Skater 2 for
         | Dreamcast allowed you to map a photograph of your face to the
         | in game player. Odd example, but I actually just dusted off my
         | old Dreamcast and remembered this feature the other day as the
         | 20 year old game save had my 20 years younger face on the main
         | character. What I recall about that experience was that for
         | about 2 minutes it felt special, and then I never thought about
         | it again until feeling confused about why I was in the game
         | before remembering.
        
         | hhh wrote:
         | I do not agree, and think that most people don't want this.
        
         | paxys wrote:
         | Why on earth would I want to go to the movies and watch myself?
        
         | taco_emoji wrote:
         | nobody wants that
        
         | coldfoundry wrote:
         | This is actually the premise of an episode of Black Mirror in
         | Season 6 called "Joan is Awful". Shows an interesting dark take
         | on the negatives that could potentially arise from this -
         | https://en.m.wikipedia.org/wiki/Joan_Is_Awful
        
       | tmsh wrote:
       | To be clear for folks this is "fine-tuning" ;) DreamBooth from
       | 2022: https://dreambooth.github.io.
       | 
       | Might want to update the HN title to reflect the paper title.
       | It's really just applying multiple techniques that have existed.
       | Paper's title is "Imagine yourself: Tuning-Free Personalized
       | Image Generation." Nice paper though!
        
       | ChrisArchitect wrote:
       | Cleaner link:
       | https://ai.meta.com/research/publications/imagine-yourself-t...
        
         | miyuru wrote:
         | Thanks. The original link will expire after some time. This will
         | be really helpful when that happens.
        
         | dang wrote:
         | Thanks! We changed to that from
         | https://scontent-sjc3-1.xx.fbcdn.net/v/t39.2365-6/452604312_... above.
        
           | LarsDu88 wrote:
           | OP here: Thanks for changing the title as well!
        
       | smokel wrote:
       | Photographic images generated by these systems tend to look like
       | the graffiti portraits you see on fairground attractions.
       | 
       | I've done a lot of photorealistic drawings, and the trick to making
       | something look real is to get the tones exactly right. Misjudge
       | a tone a bit, and the result looks like a mediocre drawing or a
       | painting. In other words, the gradient of skin tones is off,
       | which is ironic, I guess.
       | 
       | I assume that there is a systemic error in (linearly?)
       | interpolating colors (in the wrong color space?) somewhere, which
       | potentially could be easy to fix and lead to improved
       | photorealism. On the other hand, it might be a horrible problem
       | to fix, because it would require accurate radiosity and
       | raytracing to get right.
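
The kind of error hypothesized above, interpolating in the wrong color space, can be shown in a few lines. The sketch below uses the standard sRGB transfer functions to compare a naive blend of gamma-encoded values with a blend done in linear light. The helper names are assumptions for illustration, and nothing in the thread establishes that any image generator actually makes this mistake.

```python
# Compare blending in gamma-encoded sRGB with blending in linear light.
# Channel values are floats in [0, 1]; helper names are illustrative.

def srgb_to_linear(c):
    """Decode one sRGB channel value to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4

def linear_to_srgb(l):
    """Encode one linear-light channel value back to sRGB."""
    if l <= 0.0031308:
        return 12.92 * l
    return 1.055 * l ** (1 / 2.4) - 0.055

def lerp(a, b, t):
    """Plain linear interpolation between a and b."""
    return a + (b - a) * t

def mix_naive(a, b, t):
    """Interpolate directly on the encoded values (the suspected mistake)."""
    return lerp(a, b, t)

def mix_linear_light(a, b, t):
    """Decode, interpolate in linear light, then re-encode."""
    return linear_to_srgb(lerp(srgb_to_linear(a), srgb_to_linear(b), t))

# Midpoint between black (0.0) and white (1.0) on a single channel.
naive_mid = mix_naive(0.0, 1.0, 0.5)
linear_mid = mix_linear_light(0.0, 1.0, 0.5)
```

The naive midpoint is 0.5, while the linear-light midpoint re-encodes to roughly 0.74, so the two approaches disagree noticeably on mid-tones.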
        
         | jsheard wrote:
         | I know what you mean, my theory is that it's an emergent
         | property from RLHF tuning penalizing examples of bad/incoherent
         | lighting, which pushes the model towards that kind of vague
         | "lit from everywhere" style which is relatively easy to sell as
         | correct without a proper understanding of light transport. It
         | looks amateurish because that's the same trick an amateur human
         | might use to try to sell photorealism without good lighting
         | fundamentals.
        
         | GaggiX wrote:
         | The fact that these models rely heavily on classifier-free
         | guidance has a strong impact on the tones of the image.
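
For context, classifier-free guidance combines an unconditional and a conditional noise prediction as eps_uncond + w * (eps_cond - eps_uncond). The sketch below applies that formula to toy two-element vectors. The function name and the numbers are assumptions, and how a given scale affects tones in a particular model is not something this snippet demonstrates.

```python
# Classifier-free guidance on toy vectors (plain Python lists).
# cfg_combine and the sample values are illustrative, not from any library.

def cfg_combine(eps_uncond, eps_cond, guidance_scale):
    """Return eps_uncond + guidance_scale * (eps_cond - eps_uncond), elementwise."""
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]

uncond = [0.0, 1.0]
cond = [1.0, 3.0]

same = cfg_combine(uncond, cond, 1.0)    # scale 1 reproduces the conditional prediction
pushed = cfg_combine(uncond, cond, 3.0)  # scale > 1 extrapolates past it
```

With a scale above 1 the result lies outside the range spanned by the two predictions, which is one plausible route by which strong guidance could shift contrast and saturation.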
        
         | TylerE wrote:
         | It doesn't help that RGB is very badly tuned for many skin
         | tones.
        
           | jsheard wrote:
           | Do these models really operate in RGB space? I would have
           | thought that using a perceptual color space to generate
           | images meant to be perceived by humans would be low hanging
           | fruit.
        
             | TylerE wrote:
             | At the very least they ultimately output to RGB. The
             | fleshtone part of the spectrum is quite small.
        
             | smokel wrote:
             | As a total cluebie on generative art, I would assume that
             | the neural networks involved use linear weights and ReLU
             | only. If the training data and the output are in RGB
             | pixels, then it would be reasonable to suppose that this
             | introduces some bias.
             | 
             | It may not be enough to use a perceptual color space only.
             | The gradients in skin tones, or any other complex texture,
             | are non-linear due to lighting and curvature.
             | 
             | Is there someone in the room who _does_ know how things
             | work, and whether this hypothesis is wrong or not?
        
       | paxys wrote:
       | They didn't "release" anything, it's a paper.
        
       | megaman821 wrote:
       | The last example in the paper with the boy and girl definitely
       | has faking-a-girlfriend vibes.
        
       ___________________________________________________________________
       (page generated 2024-07-25 23:08 UTC)