[HN Gopher] Playing with DALL-E 2
       ___________________________________________________________________
        
       Playing with DALL-E 2
        
       Author : nahuel0x
       Score  : 46 points
       Date   : 2022-04-11 14:52 UTC (1 day ago)
        
 (HTM) web link (www.lesswrong.com)
 (TXT) w3m dump (www.lesswrong.com)
        
       | mensetmanusman wrote:
       | Shutterstock will be dead, crazy.
        
       | bobbyi wrote:
       | What's the reason for the restriction on sharing images with
       | faces?
        
         | BlueTemplar wrote:
         | I assume it's the risk of enabling the generation of "person
         | of group X doing something bad"?
         | 
         | Or more prosaically, we're MUCH better at noticing something
         | wrong in humans (especially faces!) than in, say, puppies, so
         | that would make DALL-E 2 look bad to an uninformed observer.
        
           | gwern wrote:
           | It's the PR risk.
           | 
           | You can easily pass the 'look good to an uninformed
           | observer' test with human faces. Remember, GANs were doing
           | faces near-flawlessly as far back as ProGAN, in the dark
           | ages of late 2017. (Then StyleGAN did non-photograph faces,
           | like anime - see my ThisWaifuDoesNotExist for a demo of
           | that.) Doing them as part of a larger composition, where the
           | face is a small part of the image and may vary much more
           | than in closeup centered portraits, is
           | harder, but take a look at how Facebook's DALL-E rival Make-
           | A-Scene does it: https://arxiv.org/abs/2203.13131#facebook
           | They specially target faces as part of the training process,
           | with face-specific detectors/losses, and so the faces come
           | out great.
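The face-specific loss gwern describes can be sketched as a reconstruction objective that up-weights the error inside detector-reported face regions. This is a minimal illustration of the idea, not the actual Make-A-Scene objective; the mask format, the weight value, and the function name are all assumptions.

```python
import numpy as np

def face_weighted_loss(pred, target, face_mask, face_weight=5.0):
    """Mean squared error with face pixels counted more heavily.

    pred, target: float arrays of shape (H, W, C)
    face_mask:    (H, W) array, 1.0 where a face detector fired, else 0.0
    face_weight:  multiplier applied to errors inside face regions
    """
    err = (pred - target) ** 2                        # per-pixel error
    weights = 1.0 + (face_weight - 1.0) * face_mask   # 1 outside faces
    return float((err * weights[..., None]).mean())

# Toy example: a 4x4 "image" whose top-left 2x2 block is a face.
rng = np.random.default_rng(0)
target = rng.random((4, 4, 3))
pred = target.copy()
pred[0, 0] += 0.5            # introduce an error inside the face region
mask = np.zeros((4, 4))
mask[:2, :2] = 1.0

plain = face_weighted_loss(pred, target, mask, face_weight=1.0)
weighted = face_weighted_loss(pred, target, mask, face_weight=5.0)
# The same mistake costs more when it lands on a face.
```

Since the only error here sits inside the mask, the weighted loss comes out face_weight times the unweighted one; during training, gradient descent therefore works proportionally harder on getting face pixels right.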
        
             | BlueTemplar wrote:
             | Fair enough, but it could be... both!
             | 
             | Since you know that training for photo-realistic faces is
             | going to take extra effort, _and_ that you're not going to
             | allow them - you could just spare that effort! (Or maybe
             | leave it for later, if the network is flexible enough?)
        
       | globuous wrote:
       | This is just insane.
       | 
       | The image for the prompt "A dog looking curiously in the
       | mirror, in digital style" shows a cat looking into a mirror and
       | seeing a dog as its reflection! Although very creative and
       | "objectively funny" (my cat is like a dog!!!), I think the AI
       | understood "A dog looking curiously (is) in the mirror".
       | 
       | These poems are absolutely hilarious too: they're
       | photo-realistic and the fonts look perfect, but the messages
       | make no sense. That's where you know it's clearly AI generated
       | ^^ For now.... Sometime soon, it looks like realistic pictures
       | of protests with arbitrary text are going to be easy to
       | generate.
        
         | BlueTemplar wrote:
         | Your comment is going to be misinterpreted without the
         | context:
         | 
         | DALL-E 2 failed all the "X looking curiously in the mirror,
         | but the reflection is Y" tests - showing X as the reflection
         | instead. Dave Orr had to do some "hinting/editing" to achieve
         | this:
         | 
         | > _Here's one where I edited out the cat in the mirror and
         | changed the prompt to be about a dog, and it did something
         | sensible._
        
       | slig wrote:
       | Any ideas on how this is going to be priced? Will this model be
       | open just like GPT-3?
        
       | cromwellian wrote:
       | "Attention is all you need" was published only 5 years ago in
       | 2017. BERT in 2018, GPT-3 in 2020. Now DALL-E, PaLM, LaMDA, etc.
       | The pace of AI progress is frighteningly fast. I really think it's
       | plausible that by 2030, AI is writing and debugging most code,
       | and engineers basically become spec writers who manipulate
       | prompts, generate code, and verify. Essentially, project managers
       | whose "team" is a bunch of LLMs.
       | 
       | It really doesn't seem we've hit diminishing returns on LLM
       | model size yet, and that's not accounting for multimodal models
       | where video, text, audio, and more are being fused together.
        
       | neoneye2 wrote:
       | Incredibly good. Wow
        
       | icey wrote:
       | The text this generates reminds me of the kind of text you'd
       | see in a dream. The images look real enough, and the text looks
       | like words, but it's totally unreadable. Kind of an
       | uncomfortable feeling.
        
         | junga wrote:
         | I don't think I've ever read text while dreaming.
        
           | nikonyrh wrote:
           | That, and looking at a clock is a great way to check whether
           | you are in a dream or not.
        
       | e2le wrote:
       | Very impressive. I'm left wondering whether DALL-E 2 could be
       | used to generate furry erotica, among other things.
        
       | 015UUZn8aEvW wrote:
       | I wonder how it would do with more technical engineering
       | drawings, like shop drawings for machinists or blueprints for a
       | building.
        
       ___________________________________________________________________
       (page generated 2022-04-12 23:00 UTC)