[HN Gopher] Spent $15 in DALL*E 2 credits creating this AI image
___________________________________________________________________
Spent $15 in DALL*E 2 credits creating this AI image
Author : pat-jay
Score : 283 points
Date : 2022-08-11 16:53 UTC (6 hours ago)
(HTM) web link (pub.towardsai.net)
(TXT) w3m dump (pub.towardsai.net)
| fnordpiglet wrote:
| If you think it's hard to get an AI to render what's in your
| mind, try another human artist. Specifying something visually
| complex with an assumption that it'll be precisely what you're
| imagining is shockingly hard. I'm not surprised prompt creation
| is so complex. At least with the AI bots the turnaround time for
| iteration is tight. That said, humans likely iterate fewer times,
| but each iteration takes a long time.
| anigbrowl wrote:
| Can't wait for 'Tell HN: how I make mid six figures as a prompt
| engineer'.
| Nition wrote:
| Absolutely. See also: https://promptbase.com
|
| And we're still in the early days.
| anigbrowl wrote:
| WTAF
|
| Unwillingly considering whether the easy bucks are worth the
| greasy feeling.
| Workaccount2 wrote:
| "We let our graphic designer go so we could onboard an AI Prompt
| Engineer"
|
| "How much are we paying him?"
|
| "About $225k plus bonus and equity"
|
| "And how much was the graphic designer paid?"
|
| "$55k"
|
| "..."
| rfrey wrote:
| It's the graphic design industry's own fault for not
| gradually renaming themselves as Pixel Intensity Engineers.
| _pastel wrote:
| If you're interested in browsing creative prompts, I highly
| recommend the reddit community at r/dalle2.
|
| Some are impressive:
| - www.reddit.com/r/dalle2/comments/uzosy1/the_rest_of_mona_lisa
| - www.reddit.com/r/dalle2/comments/vstuns/super_mario_getting_his_citizenship_at_ellis
|
| And others are hilarious:
| - www.reddit.com/r/dalle2/comments/v0pjfr/a_photograph_of_a_street_sign_that_warns_drivers
| - www.reddit.com/r/dalle2/comments/wbbkbb/healthy_food_at_mcdonalds
| - www.reddit.com/r/dalle2/comments/wlfpax/the_elements_of_fire_water_earth_and_air_digital
| Nition wrote:
| Clickable links for the lazy (it seems that the http:// is
| required to make it work):
|
| http://www.reddit.com/r/dalle2/comments/uzosy1/the_rest_of_m...
|
| http://www.reddit.com/r/dalle2/comments/vstuns/super_mario_g...
|
| http://www.reddit.com/r/dalle2/comments/v0pjfr/a_photograph_...
|
| http://www.reddit.com/r/dalle2/comments/wbbkbb/healthy_food_...
|
| http://www.reddit.com/r/dalle2/comments/wlfpax/the_elements_...
| jeffchien wrote:
| /r/weirddalle is also great for some inspiration, though most
| of the entries are memes generated by Dall-e Mini/Craiyon. I
| often find art styles and modifiers that I never considered,
| like "Byzantine mosaic" or "Kurzgesagt video thumbnail".
|
| https://www.reddit.com/r/weirddalle/top/?sort=top&t=all
| mFixman wrote:
| My favourite one is Kermit the Frog in the style of different
| movies.
|
| https://www.reddit.com/r/dalle2/comments/v1sc2z/kermit_the_f...
| hombre_fatal wrote:
| Love the stylistic ones. Amazing how it generates such good anime
| and vaporwave variants, like the neon vaporwave backboard.
|
| I ran out of credits way too fast, so I like to see other people
| playing with it and their iterative process.
| pigtailgirl wrote:
| -- spent a day with DALL-E - here are some of my favorites:
| https://imgur.com/a/uD5yjV3 --
| planetsprite wrote:
| Were a lot of your prompts just "attractive girl hat and
| sunglasses high quality photography"
| pigtailgirl wrote:
| -- hat pic are playing with "variations" mode - the prompt
| was: "portrait photo, california beach with female model
| wearing hat and sunglasses, studio, lens flare, colourful,
| 4k, high definition, 35mm, HD" --
| prashp wrote:
| You like your lobsters
| pigtailgirl wrote:
| -- they're the little lobsters we have over here (akazae)! -
| quite expensive - _very_ good =) -
| https://en.wikipedia.org/wiki/Metanephrops_japonicus --
| tough wrote:
| They reminded me of this little guys we have in the med
| https://en.wikipedia.org/wiki/Nephrops_norvegicus
| krisoft wrote:
| > it was difficult to find images where the entire llama fit
| within the frame
|
| I had the same trouble. In my experiment I wanted to generate a
| Porco Rosso-style seaplane illustration. Sadly none of the
| generated pictures had the whole of the airplane in them. The
| wingtips or the tail always got left off.
|
| I found this method to be a reliable workaround: I downloaded
| the image I liked the most, used image-editing software to
| extend it in the direction I wanted, and filled the new area
| with a solid colour. Then I cropped a 1024x1024 rectangle
| containing about 40% generated image and 60% solid colour,
| uploaded it, and asked DALL-E to infill the solid area while
| leaving the previously generated area unchanged. I selected the
| extension I liked best, downloaded it, and merged it with the
| rest of the picture. Repeated the process as required.
|
| You need a generous amount of overlap so the network can figure
| out which parts are already there and how best to fit the rest.
| It's a good idea to look at the image segment that needs to be
| infilled. If you as a human can't figure out what it is you are
| seeing, then the machine won't be able to figure it out either.
| It will generate something, but it will look out of context once
| merged.
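The crop-and-infill preparation can be sketched with Pillow. This is a hypothetical helper, not the commenter's actual code; the 40% overlap, the grey fill colour, and extending to the right are all assumptions for illustration:

```python
from PIL import Image

TILE = 1024  # DALL-E's edit/infill canvas size


def make_outpaint_tile(src, overlap_frac=0.4, fill=(127, 127, 127)):
    """Build a 1024x1024 tile whose left ~40% is copied from the
    right edge of `src`, with the rest left as solid colour for
    DALL-E to infill (extending the picture to the right)."""
    canvas = Image.new("RGB", (TILE, TILE), fill)
    overlap_px = int(TILE * overlap_frac)
    # Copy the rightmost strip of the source so the network has
    # enough context to see what is already there.
    strip = src.crop((src.width - overlap_px, 0,
                      src.width, min(src.height, TILE)))
    canvas.paste(strip, (0, 0))
    return canvas
```

Once DALL-E has filled the solid region, the result would be pasted back over the original at `src.width - overlap_px`, and the process repeated until the whole subject fits in frame.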
|
| The other trick I found: I wanted to make my picture a canvas
| print, and thus needed a higher-resolution image, higher even
| than what I could reasonably hope for with the extension trick
| above. So I upscaled the image (I used bigjpg.com, but there
| might be better solutions out there). That gave me a big image,
| but of course it didn't have many small-scale details. So I
| sliced it into 1024x1024 rectangles, uploaded the rectangles to
| DALL-E, and asked it to keep the borders intact but redraw their
| interiors. This second trick worked particularly well on an area
| of the picture which showed a city under the airplane. It added
| nice small details like windows and doors and roofs with texture
| without disturbing the overall composition.
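The slicing into 1024x1024 rectangles is mechanical; a minimal Pillow sketch (again hypothetical, assuming the upscaled image's sides are multiples of 1024):

```python
from PIL import Image

TILE = 1024


def tiles(img):
    """Yield (x, y, tile) 1024x1024 crops covering the upscaled
    image. Each tile would be uploaded to DALL-E with a request to
    keep the borders intact but redraw the interior with fresh
    small-scale detail, then pasted back at (x, y)."""
    for y in range(0, img.height, TILE):
        for x in range(0, img.width, TILE):
            yield x, y, img.crop((x, y, x + TILE, y + TILE))
```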
|
| devin wrote:
| MidJourney allows you to specify other aspect ratios. DALL-E's
| square constraint makes a lot of things more difficult than
| they need to be IMO.
| GaggiX wrote:
| Also with Stable Diffusion. It's a really cool feature to
| have and to play around with.
| bredren wrote:
| I had similar problems trying to get the whole of a police car
| overgrown with weeds.
|
| https://imgur.com/a/U5Hl2gO
|
| I was testing to see how close I could get to replicating a
| t-shirt graphic concept I saw.
|
| I had been using ~"A telephoto shot of A neglected police car
| from the 1980s Viewed from a 3/4 angle sits in the distance.
| The entire vehicle is visible but it is overgrown with grass
| and flowery vines"
|
| This process sounds great, though it seems like DALLE needs to
| offer tools to do this automagically.
| Miraste wrote:
| What prompts did you use for the infill and detail generation?
| krisoft wrote:
| Good question! All of them had the same postfix ", studio
| ghibli, Hayao Miyazaki, in the style of Porco Rosso,
| steampunk". I used this for all the generations in the hopes
| of anchoring the style.
|
| With the prefix of the prompt I described the image. I
| started the extension operations with "red seaplane over
| fantasy mediterranean city" but then I quickly realised that
| this was making the network generate floating cities in the
| sky for me. :D So then I varied the prompt. "red seaplane on
| blue sky" in the upper regions and "fantasy mediterranean
| city" in the lower ones.
|
| I went even more specific and used "mediterranean sea port,
| stone bridge with arches" prefix for a particular detail
| where I wanted to retain the bridge (which I liked) but
| improve on the arches (which looked quite dingy).
|
| (I have just counted and it seems I have used 27 generations
| for this one project.)
| fragmede wrote:
| > I quickly realised that this was making the network
| generate floating cities in the sky for me
|
| Maybe Dalle-2 is just secretly a studio Ghibli/Miyazaki
| movie fan.
| andreyk wrote:
| Wow, I've had the same trouble and these are some great tips!
| Thanks for sharing
| krisoft wrote:
| Anytime! I have uploaded the image in question: the initial
| prompt with first generated images, the extended raw image,
| and then the one with the added details on the city.
|
| https://imgur.com/a/QEU7EJ2
| mdorazio wrote:
| This is a fantastic end result. Thanks for sharing your
| process to get there.
| [deleted]
| keepquestioning wrote:
| DALL-E is truly magic. It got me believing we are close to AGI.
|
| I wonder what Gary Marcus or Filip Piekniewski think about it.
| Surely they must be eating crow.
| outworlder wrote:
| > It got me believing we are close to AGI.
|
| We are not. But maybe we are closer to replicating some of our
| internal brain workings.
| dougmwne wrote:
| Yesterday I saw one of Gandalf eating samples at Costco. I was
| laughing hysterically for a minute. AI is not supposed to have
| a sense of humor. That was supposed to be the last province of
| the human, but it's been quite a while since a human made me
| laugh like that.
| outworlder wrote:
| I don't think intelligence requires humor. It could be just a
| quirk of our brains.
| WoodenChair wrote:
| > AI is not supposed to have a sense of humor.
|
| And this AI doesn't. Your anecdote is totally unrelated to
| the idea of AGI in the gp post. The fact that it made you
| laugh is a happenstance. It was not "trying" to make you
| laugh.
| dougmwne wrote:
| It's only unrelated if there's no proto-AGI going on. Many
| images give me a moment of doubt, even though I absolutely
| know that I'm looking at nothing more than the output of a
| pile of model weights, says I the pile of neurons.
| Comevius wrote:
| If I write a Python script that cuts together a bunch of
| pictures and the output makes you laugh, the script hardly
| deserves all the credit. It's us humans that create meaning.
| kube-system wrote:
| It's funny in the way that mad libs are funny. It's
| unexpected. The _reason_ it is unexpected is because the
| computer is dumb, not because it is smart.
| dougmwne wrote:
| I think the humor came from the vibe, humiliation,
| dejection. Like seeing a beloved math teacher caught in an
| adult video store.
|
| I also saw this one recently from Midjourney. Would not
| call the humor random.
|
| https://www.reddit.com/r/midjourney/comments/w73rhv/prompt_t...
| NateEag wrote:
| What was the prompt for that image?
|
| What wrote the prompt?
| dougmwne wrote:
| But the prompt was not funny, only the image.
| LegitShady wrote:
| I saw that on reddit. The face was horrific and not at all
| human like. It didn't have a sense of humour - it just took a
| prompt and mashed some things together, but the prompt was
| funny and the image was horrifying. Not even uncanny valley
| shit, but "Gandalf was in a bad motorcycle accident and will
| never look like a human again" bad.
|
| It's still up on the dalle2 subreddit.
| jmfldn wrote:
| This tells us little about AGI. It might seem like it does but
| this is an incredibly narrow specific set of technologies. They
| work together to produce some startling results (with many
| limitations) but this is just another narrow application.
|
| I suspect AGI, depending on how it's defined, will be with us in
| some form in the next few decades at most. Just a hunch. This
| has nothing to do with that mission though, imho. Maybe you can
| read into it something like, "we are solving lots of discrete
| problems like this, maybe we can somehow glue them together
| into a higher level program"? That might give you something AI-
| esque? My guess is that 'true' AGI will have an elegant
| solution rather than a big bag of stuff glued together.
| thfuran wrote:
| We're pretty much just a big bag of stuff glued together.
| croes wrote:
| When I see some of the bad pictures it produces I think we are
| nowhere near AGI
| outworlder wrote:
| Most people would draw even worse pictures given the same
| prompts.
| donkarma wrote:
| most neural networks would draw even worse pictures given
| the same prompts
| Comevius wrote:
| Machine learning just glues together existing things, which is
| how art is created. As amusing as these pictures are, it's us
| humans who bring meaning to them, both when producing what
| these algorithms use as input and when consuming their output.
| We are the actual magic behind DALL-E.
|
| An AGI wouldn't need us to this extent, or at all. An AGI would
| also be able to come up with new ways to represent ideas, even
| ways that are foreign to us.
| sebringj wrote:
| The images remind me of one of my dreams where logic and
| reasoning are thrown out and the pure gist of the thing is taken.
| I wonder if it's because it's built on vector operations and
| calculus, finding the closest or fuzzy matches for essentially
| everything it determines, sans cognition, that things tend to
| come out fuzzy or quasi-close but not quite there. Very
| entertaining post.
|
| I have my own api key as well but not with DALL-E 2 access just
| yet but seems similar in terms of prompting text in stages to get
| what you want. It feels kind of like negotiating with it in some
| way.
| outworlder wrote:
| > The images remind me of one of my dreams (...)
|
| A lot of dream scenery seems to throw logic and reasoning out
| of the window. Even small sensory inputs can make a huge
| difference to a dream sequence. And in many cases they don't
| make sense even in the context of the dream.
|
| I haven't personally experienced any hallucinations myself, but
| some DALL-E images seem awfully similar to what some people
| describe.
|
| I know that comparisons between brains and machine learning
| (including neural networks) are superficial at best, but I
| still wonder if DALL-E is mimicking, in its own way, a portion
| of our larger brain processing 'pipeline'.
| sebringj wrote:
| Spot on, like the more basic part of a raw dream feed without
| rhyme or reason. Maybe even laying the groundwork for an
| experience architecture's input when that day finally comes,
| who knows.
| antoniuschan99 wrote:
| The first thing I noticed was that it had no distinct features
| of a basketball. It looks more like a bowling ball with the
| swirly things on it. Kind of adds to your dream thought.
| outworlder wrote:
| Human dream sequences often have problems with faces, text
| and mirrors. You can train yourself to try to focus on these
| features when dreaming.
|
| Most people in our dreams don't even have faces that we would
| recognize. When they do have faces, sometimes it is not even
| the right face.
| humbleferret wrote:
| "In working with DALL*E 2, it's important to be specific about
| what you want without over-stuffing or adding redundant words."
|
| I found this to be the most important point from this piece.
| Often people don't know what they really want when it comes to
| creative work, let alone how to specify it for some omniscient
| algorithm. In spite of that, it's a delight to see something
| you love come from an unspecific prompt - something you won't
| find in anything you receive from a human.
|
| Dall.E 2 never ceases to amaze me.
|
| For anyone interested in learning about what Dall.E 2 can do, the
| author also links to the Dall.E 2 prompt book (discussed in this
| post https://news.ycombinator.com/item?id=32322329).
| JadoJodo wrote:
| I tried a number of these generators a week ago (or so), all with
| the same prompt: "A child looking longingly at a lollipop on the
| top shelf" with pretty abysmal (and sometimes horrifying)
| results. I'm not sure if my expectations are too high, but maybe
| I was doing it wrong?
| Marazan wrote:
| Dalle(and others) are great, almost magical, at specific types
| of images and abysmal at others.
| foobarbecue wrote:
| It's fascinating to me that in the first image, the llama's
| jersey has a drawing of a llama on it. I wonder if that was in
| the prompt?
| conception wrote:
| https://pitch.com/v/DALL-E-prompt-book-v1-tmd33y
|
| The DALL-E 2 prompt book. If nothing else, it's a pretty neat
| look at how the various prompts come out and some of the art
| created with it.
| Vox_Leone wrote:
| Can I use NLP to generate input for DALL-E 2? That would be cool.
| MonkeyMalarky wrote:
| I want to see a few iterations of describing an image with AI,
| generating it, describing it again, generating it... Like when
| passing a piece of text through Google translate back and
| forth.
| pamelafox wrote:
| I tried that! Results were mixed:
| https://twitter.com/pamelafox/status/1542593090472386561
|
| It needs a better text to image model, I think. Maybe you can
| fork it and improve?
| MonkeyMalarky wrote:
| Interesting! I really like the flute > cup > bathtub
| sequence. It has a real dreamlike disjointedness to it.
| turdnagel wrote:
| There was a tool that could find the "equilibrium" called
| Translation Party. I don't think it works anymore. I'd love
| to see one that goes back and forth between DALL-E and an
| image description algorithm.
| rmbyrro wrote:
| According to internet popular belief, you'd end up with a
| picture of a certain ignominious dictator that unfortunately
| destroyed Europe in the 1940's. [1]
|
| [1] https://en.wikipedia.org/wiki/Godwin%27s_law
| minimaxir wrote:
| You can, in fact, use GPT-3 to engineer prompts for DALL-E 2 in
| a sense.
|
| https://twitter.com/simonw/status/1555626060384911360
| jcims wrote:
| I used GPT-3 to 'write' a children's book and asked it to
| include descriptions of the illustrations.
|
| https://docs.google.com/presentation/d/1y8EE_p8bw9dIEDguT1bT...
|
| The fact that it's a derivative of an existing work is
| noteworthy, but I gave it absolutely no guidance on the topic.
| If I suggest something it will give it a go with similar
| fervor. eg https://imgur.com/a/N1qWaSV
| jfk13 wrote:
| Your link doesn't seem to be publicly accessible.
| falcor84 wrote:
| >the ball is positioned in such a way that the llama has no real
| hope of making the shot
|
| I love that we're at the level where the physical "realism" of
| correctly representing quadrupeds playing basketball is a thing
| now. I suppose the next level of AI will be expected to model a full
| 3d environment with physical assumptions based on the prompt and
| then run the simulation
| TheOtherHobbes wrote:
| That's the only way to get reliably usable output.
|
| There's a lot of "80% there but not quite" in the current
| version, which makes it more of a novelty than a useful content
| generator.
|
| The problem with moving to 3D is there are almost no 3D data
| sources that combine textures, poses (where relevant),
| lighting, 3D geometry and (ideally) physics.
|
| They can be inferred to some extent from 2D sources. But not
| reliably.
|
| Humans operate effortlessly in 3D and creative humans have no
| issues with using 3D perceptions creatively.
|
| But as far as most content is concerned it's a 2D world. Which
| is why AI art bots know the texture of everything and the
| geometry of nothing.
|
| AI generation is going to be stuck at nearly-but-not-quite
| until that changes.
| namrog84 wrote:
| While not fully, there are a lot of freely available 3D models
| that can be used as a starting point. I'd love a DALL-E 2 for
| 3D model generation, even with no textures, lighting, or
| physics.
| Karawebnetwork wrote:
| I was curious to compare results with Craiyon.ai
|
| Here is "llama in a jersey dunking a basketball like Michael
| Jordan, shot from below, tilted frame, 35deg, Dutch angle,
| extreme long shot, high detail, dramatic backlighting, epic,
| digital art": https://imgur.com/a/7LoAtRx
|
| Here is "Llama in a jersey dunking a basketball like Michael
| Jordan, screenshots from the Miyazaki anime movie", much worse:
| https://imgur.com/a/g99G7Bn
| speedgoose wrote:
| Craiyon did step up a lot in its understanding recently. The
| image quality is still not the best, but if you ignore the
| blurriness, the scary faces, and the weird shapes, it can
| sometimes be better than dall.e.
| samspenc wrote:
| Fascinating, are there any other similar products in this same
| category as DALL.E and Craiyon?
| peab wrote:
| wombo.ai and midjourney
| jiggywiggy wrote:
| Wow the blogs posted here are awesome, the octopus and this lama
| are awesome.
|
| I can't seem to get it to work myself. I think it's not very
| good at real things. Tried fitness-related images; all came out
| weird. It's probably better with fantasy kinda stuff since that
| has to be less accurate.
| EMIRELADERO wrote:
| I wonder how this would play out with the new Stable Diffusion
| vanadium1st wrote:
| I've tried out a couple of prompts from the post in Stable
| Diffusion and as expected the results were much weaker. It has
| drawn some alpacas and basketballs with little relation between
| the objects.
|
| I've been playing with Stable Diffusion a lot, and in my
| experience its results are much weaker than what's shown in
| this post. The artistic pictures that it generates are
| beautiful, often more beautiful than the Dalle-2 ones. But it
| has a real problem understanding the basic concepts of anything
| that is not the simplest task like "draw a character in this or
| that style". And explaining the situation in detail doesn't
| help - the AI just stumbles on basic requests.
|
| Stable Diffusion seems to have a much shallower understanding
| of what it draws and can only produce good results for things
| very similar to the images it learned from. For example, it
| could generate really good Dutch still life paintings for me -
| with fruits, bottles, and all the objects expected in this
| genre of painting. But when I asked it to add some unusual
| objects to the painting (like a Nintendo Switch, or a laptop) -
| it couldn't grasp the concept and just added more warbled
| fruit. Even though the system definitely knows what a Switch
| looks like.
|
| The results in the post are much more impressive. I doubt that
| Dalle-2 saw a lot of similar images in training, but in all of
| the styles and examples it definitely understood how a llama
| would interact with a basketball, what are their relative sizes
| and stuff like that. On surface results from different engines
| might look similar, but to me this is an enormous difference in
| quality and sophistication.
| GaggiX wrote:
| Stable Diffusion has a smaller text encoder than Dalle 2 and
| other models (Imagen, Parti, Craiyon) so that it can fit into
| consumer GPUs. I believe StabilityAI will train models based
| on a larger text encoder, the text encoder is frozen and does
| not require training, so scaling the text encoder is quite
| free. For now this is the biggest bottleneck with Stable
| Diffusion, the generator is really good and the image quality
| alone is incredible (managing to outperform Dalle 2 most of
| the time).
| netfortius wrote:
| How could all this play into "flooding" the NFT markets?
| dymk wrote:
| It's hard to flood the NFT market any further. It was almost
| all autogenerated art before DALL-E was publicly available.
| pwython wrote:
| They're already using DALL-E for that 2021 fad.
|
| I'm more curious about how this will affect stock photography.
| Soon anyone can generate the exact image they're looking for,
| no matter how obscure.
| LegitShady wrote:
| NFTs are just numbers on a blockchain. The picture is a canard.
| In the US I don't think you can copyright DALL-E images as they
| aren't created by a human, so you spend money to make them and
| anyone else can use them.
| renewiltord wrote:
| This is really good fun, actually. Spent some time fucking around
| with it and it can make some impressive photorealistic stuff like
| "hoverbus in san francisco by the ferry building, digital photo".
|
| I mostly use it and Midjourney for material for my DnD campaign,
| but I'm going to need to do a little more work to make the whole
| thing coherent. Only tried it once and it was okay.
|
| The interesting part is that it can do things like "female ice
| giant" reasonably whereas google will just give you sexy bikini
| ice giant for stuff like that which is not the vibe of my
| campaign!
| BashiBazouk wrote:
| Is there randomization or will the same prompts produce the same
| image sets?
| minimaxir wrote:
| Always random. (in theory a seed is possible but not offered)
| croes wrote:
| So the services that sell Dall-E 2 prompts are useless
| minimaxir wrote:
| There's _some_ stability offered by specific prompts
| though.
| Taylor_OD wrote:
| I love this.
| f0e4c2f7 wrote:
| I recently made PromptWiki[0] to try to document useful prompts
| and examples.
|
| I think we're at the beginning of exploring what these image
| models can do and what the best ways to work with them are.
|
| [0] https://promptwiki.com
| aj7 wrote:
| I tried "machining a Siamese cat on the lathe" but with
| disappointing results.
| kayfhf wrote:
| simias wrote:
| I'm usually very much a skeptic when it comes to "revolutionary"
| tech. I think the blockchain is crap. I think fully self-driving
| cars are still a long way away. I think that VR and the metaverse
| are going to remain gimmicks in the foreseeable future.
|
| But this DALL-E thing, it's really blowing my mind. That and deep
| fakes, now that's sci-fi tech. It's both exciting and a bit
| scary.
|
| The idea that in the not so far future one will be able to create
| images (and I presume later, audio and video) of basically
| anything with just a simple text prompt is rife with potential
| (both good and bad). It's going to change the way we look at art,
| it's also going to give incredibly powerful creative tools to the
| masses.
|
| For me the endgame would be an AI sufficiently advanced that one
| could prompt "make an episode of Seinfeld that centers around
| deep fakes" and you'd get an episode virtually indistinguishable
| from a real one. Home-made, tailor-made entertainment.
| Terrifyingly amazing. See you in a few decades...
| obloid wrote:
| "Image intentionally modified to blur and hide faces"
|
| I thought this was strange. Why hide an AI generated face?
| ticviking wrote:
| They're being used to create fake profile pictures.
| kube-system wrote:
| I'm not sure why anyone bothers. StyleGAN2 profile photos are
| literally all over social media and they're good enough to
| fool the human reviewers every time I report them.
| vbezhenar wrote:
| Is it hard to reimplement that algorithm? I want to see what
| people would do with a porn-enabled image generator. Hopefully
| pornhub is already hiring data scientists.
| kristiandupont wrote:
| I picture in a few years we will be playing around with a code
| generation tool, and people will be drawing similar conclusions.
| "You have to be really specific about what you like. If you just
| say 'chat tool', it will allow you to chat to one other person
| only."
| tambourine_man wrote:
| > It's important to tell DALL*E 2 exactly what you want
|
| That's not as easy as it sounds, especially for the surreal
| scenes that DALL-E is usually asked for.
|
| Sometimes you don't know what you want until you see it. Other
| times you do, but are not able to express it in ways that the
| computer can understand.
|
| I see being able to communicate efficiently with the machine as
| a future in-demand skill.
| upupandup wrote:
| I asked DALL-E for 'bottomless naked women' and I was banned.
| bpye wrote:
| I suspect this is a joke, but I did find that it was a little
| overzealous with the filtering. I was trying to get someone
| (not a specific person) shouting or with an angry expression,
| and a few prompts I came up with were blocked. Not banned
| though.
| astrange wrote:
| I kept getting a scene with "two people holding hands"
| blocked, it allowed "two people kissing" and then when I
| tried "and wife" instead of "two people" it banned me.
| (They unbanned me when I emailed them though.)
|
| Oddly, the ones it blocked were more sfw than several
| others it allowed, but of course I don't know what the
| outputs would've been...
| mattwad wrote:
| At least 10% of web dev today is being good at search prompts
| for Google. (And that's not necessarily a bad thing, it's just
| about finding the right tool or pattern for your specific
| problem)
| tambourine_man wrote:
| Oh yeah. Knowing the keywords is what makes you an expert
| neonate wrote:
| https://archive.ph/RwY42
| sgtFloyd wrote:
| My two cents: the techniques OP uses are absolutely valid, but
| I've found much more success "sampling" styles and poses from
| existing works.
|
| Rather than trying to perfectly describe my image, I like to use
| references where the source material has what I want. With
| minimal direction these prompts get impressively close:
|
| "larry bird as a llama, dramatic basketball dunk in a bright
| arena, low angle action shot, from the movie Madagascar (2005)"
| https://labs.openai.com/s/wxbIbXa0HRwwGUqQaKSLtzmR
|
| "Michael Jordan as a llama dunking a basketball, Space Jam
| (1996)" https://labs.openai.com/s/mX4T5Iak8CMO1rPAmjRb7oyH
|
| At this point I'd experiment with more stylized/recognizable
| references or add a couple "effects" to polish up the results.
| turdnagel wrote:
| My current move is creating initial versions of images with
| Midjourney, which seems to be a bit more "free-spirited" (read:
| less _literal_, more flexible) and then using DALL-E's replace
| tool to fill in the weird-looking bits. It works pretty well,
| but it's a multi-step process and requires you to pay for both
| Midjourney and DALL-E.
| karaterobot wrote:
| I ran into this too. When I got my invite, I told a friend I
| would learn how to talk to DALL-E by having it make some concept
| art for the game he was designing. I ran through all of my free
| credits, and most of the first $15 bucket and never really got
| anything usable.
|
| Even when I re-used the _exact prompts_ from the DALL-E Prompt
| Book, I didn't get anything near the level of quality and
| fidelity to the prompt that their examples did.
|
| I know it's not a scam, because it's clearly doing amazing stuff
| under the hood, but I went away thinking that it wasn't as
| miraculous as it was claimed to be.
| jfk13 wrote:
| I suspect that many of the "impressive" examples that we see
| from tools like this have been carefully selected by human
| curators. I'm sure it's not at the level of "monkeys +
| typewriters = Shakespeare [if you're sufficiently selective]",
| but the general idea is still applicable.
| grumbel wrote:
| Most DALL-E2 output is great out of the box; the selection
| process is just fine-tuning the results toward something the
| human in front of the computer likes. DALL-E2 can't read minds,
| so the image produced might not match what the human had in
| mind.
|
| There is, however, one thing to be aware of: the titles posted
| on /r/dalle2/ and other places are often not the prompts that
| DALL-E2 got. Instead they are a fun description of the image,
| written by a human after the fact. Random example:
|
| "Chased by an amongus segway"
|
| * https://www.reddit.com/r/dalle2/comments/wkv7za/chased_by_an...
|
| But the actual prompt was:
|
| "Award winning photo of a mole driving a red off road car
| through a field"
|
| * https://labs.openai.com/s/xnaoxiWeSjiQX1QyVUCHGkl1
|
| Which is quite a bit less impressive, as the actual prompt
| doesn't really match the image very well. And if you put
| "Chased by an amongus segway" into DALL-E2, you won't get an
| image of that quality either.
| coldcode wrote:
| It's fun to play around with it, but like the author found, what
| you get is often strange or useless. I also find 1k images too
| small to do much with but I realize making 4k images would be
| cost prohibitive. I also wish it could generate vector images as
| well as pixel images. That would be fun to use.
| jordanmorgan10 wrote:
| A lot of these posts are showing up on HN. I wonder - is it because
| it is so new, or is it because the ways in which we are to use
| this technology are so nascent that we are discovering how to use
| it more precisely daily?
| dougmwne wrote:
| I believe it's for a few reasons. First, it is jaw-droppingly
| incredible to most people in tech who have at least a hint of
| how most ML works. Second, the AI image-generation field is
| racing ahead, in academia and in newly trained models, so
| there's lots of news. Third, some really great models like
| Dall-e have been opened for wider access, and lots of everyday
| users are discovering their capabilities and writing blog
| posts, which are not news but are surely interesting to most.
| pleasantpeasant wrote:
| There was a thread on r/DigitalArt debating whether you're
| really an artist if you're using these AI creator websites.
|
| Some guy spent hours feeding the AI pictures he liked to get an
| end result he was happy with.
___________________________________________________________________
(page generated 2022-08-11 23:00 UTC)