[HN Gopher] DALL·E now available in beta
___________________________________________________________________
DALL·E now available in beta
Author : todsacerdoti
Score : 552 points
Date : 2022-07-20 16:30 UTC (6 hours ago)
(HTM) web link (openai.com)
(TXT) w3m dump (openai.com)
| naillo wrote:
| I wonder if they'll even make back what they spent on training
| the models before competitors of equal quality and lower cost
| eat up their margins.
| tourist_on_road wrote:
| Super impressive to see how OpenAI managed to bring the project
| from research to production (something usable for creatives).
| This is nontrivial since the use case involves filtering NSFW
| content and reducing bias in generated images. Kudos to the entire
| team.
| seshagiric wrote:
| For those who want to try DALL·E but do not have access yet, this
| is a good site to play with: https://www.craiyon.com/
| totetsu wrote:
| I was really enjoying using Dalle2 to take surrealist walks
| around the latent image space of human cultural production. I was
| using it as one might use Wikipedia researching the links between
| objects and their representation. Also just to generate
| suggestions for what to have for lunch. None of this was for
| anything of commercial value to me. What am I to do now, start to
| find ways to sell the images I'm outputting? Do I displace the
| freelance artists in the market who actually have real talent and
| ability to create images and compositions and who studied how to use
| the tools of the trade. Does the income artists can make now get
| displaced by people using dalle? Then do people stop learning how
| to actually make art and we come to the end of new cultural
| production and just start remixing everything made until now?
| totetsu wrote:
| With real artists left only making images of sex and violence
| and other TOS violations
| [deleted]
| password321 wrote:
| The world's most expensive meme generator.
| cypress66 wrote:
| > Reducing bias: We implemented a new technique so that DALL·E
| generates images of people that more accurately reflect the
| diversity of the world's population. This technique is applied at
| the system level when DALL·E is given a prompt about an
| individual that does not specify race or gender, like "CEO."
|
| Will it do it "more accurately" as they claim? As in, if 90% of
| CEOs are male, will the odds of a generated CEO being male be
| 90%? Or will it less "accurately reflect the diversity of the
| world's population" and instead show what they would like the
| real world to be like?
| president wrote:
| Most likely this was something forced by their marketing team
| or their office of diversity. Given the explanation of the
| implementation (arbitrarily adding "black" and "female"
| qualifiers), it's clear it was just an afterthought.
| [deleted]
| klohto wrote:
| hardmaru on Twitter has examples. It's the second, the one they
| would like it to be.
| kache_ wrote:
| They literally just add "black" and "female" with some weight
| before any prompt containing person.
|
| A comical workaround to so-called "bias" (isn't the whole
| point of these models to encode some bias?). Here's some
| experimentation showing this.
|
| https://twitter.com/rzhang88/status/1549472829304741888
|
| As competitors with lower price points pop up, you'll see
| everyone ditch models with "anti bias" measures and take their
| $ somewhere else. Or maybe we'll get some real solution, that
| adds noise to the embeddings, and not some half assed
| workaround to the arbitrary rules that your resident AI
| Ethicist comes up with.
| danielvf wrote:
| Add _after_. So you can see the added words by making a
| prompt like "a person holding a sign saying ", and then the
| sign says the extra words if they are added.
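|
| (Purely illustrative: a toy sketch of the kind of system-level
| append people are describing. None of this is OpenAI's code;
| the word lists and probability below are made up.)
|
|     import random
|
|     # Hypothetical lists -- not OpenAI's actual vocabulary.
|     DEMOGRAPHIC_TERMS = {"black", "white", "asian", "male", "female"}
|     PERSON_TERMS = {"person", "ceo", "doctor", "teacher", "firefighter"}
|
|     def maybe_append_qualifier(prompt: str, p: float = 0.5) -> str:
|         """Append a demographic word to prompts that mention a person
|         but specify no race or gender (the reported behavior)."""
|         words = set(prompt.lower().split())
|         if words & PERSON_TERMS and not words & DEMOGRAPHIC_TERMS:
|             if random.random() < p:
|                 return prompt + " " + random.choice(["black", "female"])
|         return prompt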
| kache_ wrote:
| Yeah actually, good call. The position of the token
| matters, since these things use transformers to encode the
| embeddings.
|
| https://www.assemblyai.com/blog/how-imagen-actually-works/
| whywhywhywhy wrote:
| How does it deal with bias that is negative?
|
| This would only work for positive biases; if they actually
| want to equalize things, it needs to add the opposite for
| negative biases too.
|
| To counteract the bias of their dataset, they need to have
| someone sitting there actively thinking about bias, countering
| it with anti-bias seasoning for every bias-causing term. I feel
| bad for whatever person is tasked with that job.
|
| Could always just fix your dataset, but who's got time and
| money to do that /s
| naillo wrote:
| It's also funny that this likely won't 'unbias' any actual
| published images coming out of it. If 90% of the images in the
| world have a male CEO, then for whatever reason that's the image
| people will pick and choose from DALL-Es output. (Generalized
| to any unbiasing - i.e. they'll be debiased by humans.)
| bequanna wrote:
| Imagine you're in South Korea (or any other ethnically
| homogenous country). Do you want "black" "female" randomly
| appended to your input?
| educaysean wrote:
| If I was using this in South Korea, how is showing all
| white people any better than showing whites, blacks,
| latinos and asians?
| bequanna wrote:
| You would presumably input "South Korean CEO". DALL-E
| would then unhelpfully add "black" "female" without your
| knowledge.
| educaysean wrote:
| I just tried it out and it looks like DALL-E isn't as
| inept as you imagined. Exact query used was 'A profile
| photo of a male south korean CEO', and it spat out 4 very
| believable korean business dudes.
|
| Supplying the race and sex information seems to prevent
| new keywords from being injected. I see no problem with
| the system generating female CEOs when the gender
| information is omitted, unless you think there is one?
| astrange wrote:
| I don't think they "randomly insert keywords" like people
| are claiming, I think they probably run it through a GPT3
| prompt and ask it to rewrite the prompt if it's too
| vague.
|
| I set up a similar GPT prompt with a lot more power
| ("rewrite this vague input into a precise image
| description") and I find it much more creative and useful
| than DALLE2 is.
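|
| (If anyone wants to try that themselves: a minimal sketch with
| the 2022-era openai Python client. The instruction text comes
| from the comment above; the model choice and parameters are my
| own guesses, not anything OpenAI has documented about DALL-E's
| pipeline.)
|
|     import openai
|
|     openai.api_key = "sk-..."  # your API key
|
|     def rewrite_prompt(vague: str) -> str:
|         """Ask GPT-3 to expand a vague prompt into a precise
|         image description before sending it to DALL-E."""
|         resp = openai.Completion.create(
|             model="text-davinci-002",
|             prompt="Rewrite this vague input into a precise image "
|                    "description:\n\nInput: " + vague + "\nDescription:",
|             max_tokens=100,
|             temperature=0.7,
|         )
|         return resp.choices[0].text.strip()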
| bequanna wrote:
| Isn't the diversity keyword injection random?
|
| My point is that it is pointless. If you want an image of
| a <race> <gender> person included, you can just specify
| it yourself.
| educaysean wrote:
| > If you want an image of a <race> <gender> person
| included, you can just specify it yourself.
|
| I agree wholeheartedly. So what are we arguing about?
|
| What we're seeing is that DALL-E has its own bias-
| balancing technique it uses to nullify the imbalances it
| knows exists in its training data. When you specify
| ambiguous queries it kicks into action, but if you wanted
| male white CEOs the system is happy to give it to you.
| I'm not sure where the problem is.
| totetsu wrote:
| Yes, the quality of surrealist generations went down with that
| change suddenly injecting gender and race into prompts that I
| really didn't want anything specific in. Like a snail radio DJ,
| and suddenly the microphone is a woman of colour's head... I
| understand the intention, but I want this to be a default-on,
| can-be-turned-off thing.
| TheFreim wrote:
| It's also odd since you'd think that this would be an issue
| solved by training with representative images in the first
| place.
|
| If you used good input you'd expect appropriate output; I
| don't know why manual intervention would be necessary unless
| it's for purposes other than those stated. I suspect this is another
| case where "diversity" simply means "less whites".
| StrictDabbler wrote:
| If it accurately reflects the _world_ population then only one in
| six pictures will be a white person. Half the pictures will be
| Asian, another sixth will be Indian.
|
| Slightly more than half of the pictures will be women.
|
| That accurately represents the world's diversity. It won't
| accurately reflect the world's power balance but that doesn't
| seem to be their goal.
|
| If you want to say "white male CEO" because you want results
| that support the existing paradigm it doesn't sound like
| they'll stop you. I can't imagine a more boring request.
|
| Let's look at _interesting_ questions:
|
| If you ask for "victorian detective" are you going to get a
| bunch of Asians in deerstalker caps with pipes?
|
| What about Jedi? A lot of the Jedi are blue and almost nobody
| on Earth is.
|
| Are cartoon characters exempt from the racial algorithm? If I
| ask for a Smurf surfing on a pizza I don't think that making
| the Smurf Asian is going to be a comfortable image for any
| viewer.
|
| What about ageism? 16% of the population is over sixty. Will a
| request for "superhero lifting a building" have a 16% chance
| of being old?
|
| If I request a "bad driver peering over a steering wheel" am I
| still going to get an Asian 50% of the time? Are we ok with
| that?
|
| I respect the team's effort to create an inclusive and
| inoffensive tool. I expect it's going to be hard going.
| bequanna wrote:
| > inoffensive tool.
|
| Wouldn't that result end up being like "inoffensive art" or
| "inoffensive comedy"?
|
| Bland, boring and Corporate-PC.
| erikpukinskis wrote:
| Being offensive is only one way to be interesting.
|
| There are others, like being clever, or being absurd, or
| being goofy, or being poignant, or being refreshing.
|
| Of the good stuff, offensive humor is only a tiny slice.
| jazzyjackson wrote:
| offensive _to whom_ is the sticking point when it comes
| to comedy
|
| it takes a special talent to please everybody
| driverdan wrote:
| To a certain degree, yes. They care more about the image of
| the project than art. Considering a large amount of art
| depicts non-sexual nudity yet they block all nudity, art is
| not their primary concern.
| bequanna wrote:
| Some people claim to be emotionally "triggered" by images
| of police. Does that mean DALL-E should also start
| blocking images that contain police?
| visarga wrote:
| You know a surprising way to solve the issues you presented?
| You train another model to trick DALL-E into generating
| undesirable images. It will use all its generative skills to
| probe for prompts. Then you can use those prompts to fine-
| tune the original model. So you use generative models as a
| devil's advocate.
|
| - Red Teaming Language Models with Language Models
|
| https://arxiv.org/abs/2202.03286
| bjt2n3904 wrote:
| Will it reduce bias across all fields? Or only ones that are
| desirable? How about historical?
|
| "A photo of a group of soldiers from WW2 celebrating victory
| over nazi CEOs and plumbers".
| noelsusman wrote:
| In their examples, the "After mitigation" photos seem more
| representative of the real world. Before, you got nothing but
| white guys for firefighter or software engineer and nothing but
| white ladies for teacher. That's not how the real world
| actually is today.
|
| I'm not sure how they would accomplish 100% accurate
| proportions anyway, or even why that would be desirable. If I
| don't specify any traits then I want to see a wide variety of
| people. That's a more useful product than one that just gives
| me one type of person over and over again because it thinks
| there are no female firefighters in the world.
| scifibestfi wrote:
| The latter. Here's what we, a small number of people, think the
| world should look like according to our own biases and
| information bubble in the current moment. We will impose our
| biases upon you, the unenlightened masses who must be
| manipulated for your own good. And for God's sake, don't look
| for photos of the US Math team or NBA Basketball or compare
| soccer teams across different countries and cultures.
| bequanna wrote:
| > Here's what we, a small number of people, think the world
| should look like according to our own biases and information
| bubble in the current moment.
|
| You're being quite charitable. It is much more likely that
| optics and virtue signaling are behind this addition.
| erikpukinskis wrote:
| If I search for "food" I don't want to see a slice of pizza
| every time, even if that's the #1 food. I want to see some
| variety.
|
| I think you're jumping too quickly to bad intentions.
| Injecting diversity of results is a sane thing to do,
| totally irrespective of politics.
| aledalgrande wrote:
| I wonder, at this price point, what kind of business can use
| DALL·E at scale?
| hit8run wrote:
| It's so dirty what Microsoft is doing here. They ripped the tech
| out of developers hands just to sell us drips of it. Drips that
| are not enough to build a product for more than a few people.
| They require a review of your use case before launching, etc. I truly
| hate this company, their shitty operating system and their
| monopoly business game. Everything they buy turns to shit. And
| don't tell me about VSCode. It's just a trap to fool developers.
| NaughtyShiba wrote:
| Slightly off-topic, but how would one report a false positive in
| the content policy check?
| Al-Khwarizmi wrote:
| In beta, maybe, but I don't think "available" means what they
| think it means.
|
| I have been on the waitlist from the very beginning. Still
| waiting.
| skilled wrote:
| I can't check right now, but does this mean the watermark is also
| gone and images will have a higher resolution?
| gverri wrote:
| Watermarks are still there and resolution still 1024x1024.
| skilled wrote:
| I wonder if they have plans to allow SVG exports in the
| future. I mean, the file size would probably be ridiculous in
| a lot of the cases, but for my use case I wouldn't mind it.
| And sucks about the watermark, maybe they will introduce an
| option to pay for removing it.
| rahimnathwani wrote:
| SVG exports would only be meaningful if the model is
| generating vector images, which are then converted to
| bitmaps. I highly doubt that's the case, but perhaps
| someone who has actually looked at the model structure can
| confirm?
| tiagod wrote:
| It's just pixels. You can pass them into a tracer, e.g.:
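|
| (A rough sketch, assuming Pillow and the potrace CLI are
| installed; potrace traces black-and-white bitmaps, so you get a
| silhouette rather than a faithful vector copy of the image.)
|
|     import subprocess
|     from PIL import Image
|
|     img = Image.open("dalle_output.png").convert("L")  # grayscale
|     img.point(lambda p: 255 if p > 128 else 0).save("tmp.bmp")
|     subprocess.run(["potrace", "tmp.bmp", "-s", "-o", "out.svg"],
|                    check=True)  # -s = SVG backend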
| moyix wrote:
| SVG isn't really possible with the model architecture
| they're using. The diffusion+upscaling step basically
| outputs 1024x1024 pixels; at no point does the model have a
| vector representation.
|
| I suppose it's possible that at some point they'll try to
| make an image -> svg translation model?
| [deleted]
| xnx wrote:
| I fully expect stock image sites to be swamped by DALL-E
| generated images that match popular terms (e.g. "business person
| shaking hands"). Generate the image for $0.15. Sell it for $1.00.
| smusamashah wrote:
| They won't. DALL-E images are mostly not as high quality. The
| high-quality stuff which everyone has been sharing is the result
| of a lot of cherry-picking.
| commandlinefan wrote:
| Even the high quality stuff still can't do human faces right.
| TomWhitwell wrote:
| This one surprised me when it came out, felt more 'human'
| than lots of stock photos:
| https://labs.openai.com/s/AsRKFiOKJmmZrVDxIGa75sSA
| optimalsolver wrote:
| They avoided using real human faces in the training data.
| speedgoose wrote:
| In my experience it doesn't require that much cherry picking
| if you use a carefully crafted prompt. For example: " A
| professional photography of a software developer talking to a
| plastic duck on his desk, bright smooth lighting, f2.2,
| bokeh, Leica, corporate stock picture, highly detailed"
|
| And this is the first picture I got:
| https://labs.openai.com/s/lSWOnxbHBYQAtli9CYlZGqcZ
|
| It went a bit strong on the depth of field and I don't like
| the angle but I could iterate a few times and get a good one.
| arecurrence wrote:
| Additionally, wherever it classically falls over (such as
| currently for realistic human faces), there will be second
| pass models that both detect and replace all the faces with
| realistic ones. People are already using models that alter
| eyes to be life-like with excellent results (many of the
| dalle-2 ones appear somewhat dead atm).
| smusamashah wrote:
| Even this image is just an illusion of a perfect photo,
| which is a blur for the most part; see the face of the duck. I
| have had access for the past 4-5 days and it fails badly
| whenever I try to create any unusual scene.
|
| For the first few days after it was announced I used to look
| closely even at real photos in search of generative artifacts.
| They are not so difficult to spot now, most of the time
| anyway.
| cornel_io wrote:
| NB: when you share links like that, nobody who doesn't have
| access can see the results
| alana314 wrote:
| sure they can, just tried in incognito
| messe wrote:
| If the price is low enough, you can have humans rank
| generated images (maybe using Mechanical Turk or a similar
| service), and from that ranking choose only the highest
| quality DALL-E generated images.
| Forge36 wrote:
| If someone can make money doing it they might.
|
| Heck: if the cost of entry is low enough, they might do it at a
| loss and take over the site.
| redox99 wrote:
| DALL-E 2 isn't yet good enough for photorealistic pictures of
| humans, however.
| arecurrence wrote:
| There has been trouble with generating life-like eyes but a
| second pass with a model tuned around making realistic faces
| has been very successful at fixing that.
| bpicolo wrote:
| https://twitter.com/TobiasCornille/status/154972906039745331...
|
| Unless I'm missing something, these seem pretty darn good
| zerocrates wrote:
| Woof, that bias "solution" that that thread is actually
| about though...!
| thorum wrote:
| DALLE images are still only 1024 px wide. Which has its uses,
| but I don't think the stock photo industry is in real danger
| until someone figures out a better AI superresolution system
| that can produce larger and more detailed images.
| [deleted]
| eigenvalue wrote:
| I've been using this app to upscale the images to 4000x4000,
| and it works amazingly well (there is also a version for
| Android):
|
| https://apps.apple.com/us/app/waifu2x/id1286485858
|
| I paid extra to get the higher quality model using the in-app
| purchase option. It crushes the phone's battery life, but
| runs in only ~10 seconds on an iPhone 13 Pro for a single
| 1000x1000 input image.
| ZeWaka wrote:
| I mean, waifu2x and similar waifuxx libraries are free and
| open-source, there's really no reason to pay for it if
| you're working on a desktop.
| [deleted]
| arecurrence wrote:
| You can obtain any size by using the source image with the
| masking feature. Take the original and shift it then mask out
| part of the scene and re-run. Sort of like a patchwork quilt,
| it will build variations of the masked areas with each
| generation.
|
| Once the API is released, this will be easier to do in a
| programmatic fashion.
|
| Note: Depending on how many times you do this... I could see
| there being a continuity problem with the extremes of the
| image (eg: the far left has no knowledge of the far right).
| An alternative could be to scale the image down and mask the
| borders then later scale it back up to the desired
| resolution.
|
| This scale and mask strategy also works well for images where
| part of the scene has been clipped that you want to include
| (EG: Part of a character's body outside the original image
| dimensions). Scale the image down, then mask the border
| region, and provide that to the generation step.
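|
| (A rough Pillow sketch of the shift-and-mask step, assuming the
| editor/API treats transparent pixels as the region to
| regenerate; the 50% shift and filenames are just examples.)
|
|     from PIL import Image
|
|     SIZE = 1024
|     original = Image.open("tile_0.png").convert("RGBA")
|
|     # Fully transparent canvas; transparent pixels are what
|     # DALL-E will fill in on the next generation.
|     canvas = Image.new("RGBA", (SIZE, SIZE), (0, 0, 0, 0))
|
|     # Keep the right half of the previous tile on the left side,
|     # leaving the right half empty for the model to extend.
|     canvas.paste(original.crop((SIZE // 2, 0, SIZE, SIZE)), (0, 0))
|     canvas.save("tile_1_input.png")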
| ploppyploppy wrote:
| "buy fo' a dollar, sell fa' two" - Prop. Joe
| wishfish wrote:
| Makes me imagine stock image sites in the near future. Where
| your search term ("man looks angrily at a desktop computer")
| gets a generated image in addition to the usual list of stock
| photos.
|
| Maybe it would be cheaper. I imagine it would one day. And
| maybe it would have a more liberal usage license.
|
| At any rate, I look forward to this. And I look forward to the
| inevitable debates over which is better: AI generation or
| photographer.
| dymk wrote:
| They'll likely immediately go out of business, because I can
| just pay OpenAI 15 cents directly for the exact same product.
| dylanlacom wrote:
| Eh, I'd bet the arbitrage window is pretty brief, and that
| prices will fall closer to $0.15 pretty quickly.
| jowday wrote:
| Sad to say I've been disappointed in DALLE's performance since I
| got access to it a couple of weeks ago - I think mainly because
| it was hyped up as the holy grail of text2image ever since it was
| first announced.
|
| For a long while whenever Midjourney or DALLE-mini or the other
| models underperformed or failed to match a prompt the common
| refrain seemed to be "ah, but these are just the smaller version
| of the real impressive text2image models - surely they'd perform
| better on this prompt". Honestly, I don't think it performs
| dramatically better than DALLE-mini or Midjourney - in some cases
| I even think DALLE-mini outperforms it for whatever reason. Maybe
| because of filtering applied by OpenAI?
|
| What difference there is seems to be a difference in quality on
| queries that work well, not a capability to tackle more complex
| queries. If you try a sentence involving lots of relationships
| between objects in the scene, DALLE will still generate a
| mishmash of those objects - it'll just look like a slightly
| higher quality mishmash than from DALLE-mini. And on queries that
| it does seem to handle well, there's almost always something off
| with the scene if you spend more than a moment inspecting it. I
| think this is why there's such a plethora of stylized and
| abstract imagery in the examples of DALLE's capabilities - humans
| are much more forgiving of flaws in those images.
|
| I don't think artists should be afraid of being replaced by
| text2image models anytime soon. That said, I have gotten access
| to other large text2image models that claim to outperform DALLE
| on several metrics, and my experience matched with that claim -
| images were more detailed and handled relationships in the scene
| better than DALLE does. So there's clearly a lot of room for
| improvement left in the space.
| jawns wrote:
| One of the commercial use cases this post mentions is authors who
| want to add illustrations to children's stories.
|
| I wonder if there is a way for DALL-E to generate a character,
| then persist that character over subsequent runs. Otherwise, it
| would be pretty difficult to generate illustrations that depict a
| coherent story.
|
| Example ...
|
| Image 1 prompt: A character named Boop, a green alien with three
| arms, climbs out of its spaceship.
|
| Image 2 prompt: Boop meets a group of three children and shakes
| hands with each one.
| minimaxir wrote:
| You can cheat this to a limited extent using inpainting.
| rahimnathwani wrote:
| You mean just generate a single large image with all the
| stuff you want for the whole story, and then use cropping and
| inpainting to get only the piece you want for each page?
| TaupeRanger wrote:
| You can't do that. I can't see this working well for children's
| book illustrations unless the story was specifically tailored
| in a way that makes continuity of style and characters
| irrelevant.
| CobrastanJorji wrote:
| As an aside, Ursula Vernon did pretty well under the
| constraint you described. She set a comic in a dreamscape and
| used AI to generate most of the background imagery:
| https://twitter.com/UrsulaV/status/1467652391059214337
|
| It's not the "specify the character positions in text"
| proposed, but still a neat take on using this sort of AI for
| art.
| TaupeRanger wrote:
| Nice example and very well done. But yeah, very niche
| application unfortunately.
| WalterSear wrote:
| I would expect continuity to be a relatively simple feature
| to retrain for and implement.
| bergenty wrote:
| You cannot. But a workaround would be to say something like
| "generate an alien in three different poses-- running, walking,
| waving"
|
| Then use inpainting to only preserve that pose and generate new
| content around it. It's definitely not perfect.
| londons_explore wrote:
| You can do better than this. Draw/generate your character.
|
| Then put that at the side of a transparent image, and use as
| the prompt, "Two identical aliens side by side. One is
| jumping"
| can16358p wrote:
| So can we now legally remove the "color blocks" watermark or not?
|
| What about generating NFTs? It was explicitly prohibited during
| the previous period; now there is no mention of it. With no
| mention of it, and with rights for commercial use, I think it's
| allowed, but because it was an explicitly forbidden use case
| before, I want to be sure whether it can be used or not.
|
| Regardless, excited to see what possibilities it opens.
| gwern wrote:
| Another user saying that OA has said it's OK to remove the
| watermark:
| https://www.reddit.com/r/dalle2/comments/w3qsxd/dalle_now_av...
|
| The commercial use language appears pretty clear to me to allow
| NFTs. (But note the absence of any discussion of _derivative_
| works...)
| blintz wrote:
| The content policy is strikingly puritanical:
|
| > "Do not attempt to create, upload, or share images that are not
| G-rated"
|
| https://labs.openai.com/policies/content-policy
| anewpersonality wrote:
| Feel sorry for the full-time artists.
| danielvf wrote:
| I am thrilled about DALL-E, and the new terms of service.
| However, how they implemented the improved "diversity" is
| hilarious.
|
| Turns out that they randomly, silently modify your prompt text to
| append words like "black male" or "female". See
| https://twitter.com/jd_pressman/status/1549523790060605440
|
| I don't know which emotion I feel more - applause at how glorious
| this hack is or tears at how ugly it is.
|
| Good luck to them!
| time_to_smile wrote:
| This is funny because I work on a team that is using GPT-3 and
| to fix a variety of issues we have with incorrect output we've
| just been having the engineering team prepend/append text to
| modify the query. As we encounter more problems the team keeps
| tacking on more text to the query.
|
| This feels like a very hacky way to essentially reinvent
| programming badly.
|
| My bet is that in a few years or so only a small cohort of
| engineering and product people will even remember DALL-E and
| GPT-3, and they'll cringe at how we all thought this was going
| to be a big thing in the space.
|
| These are both really fascinating novelties, but at the end
| of the day that's all they are.
| throwaway4aday wrote:
| How else would you specify the type of image you would like?
| Surely, if you were hiring a designer you would provide them
| with a detailed description of what you wanted. More likely,
| you would spend a lot of time with them maybe even hours and
| who knows how many words. For design work specifically to
| create a first mockup or prototype of a product or image it
| seems like DALL-E beats that initial phase hands down. It's
| much easier to type in a description and then choose from a
| set of images than it is to go back and forth with someone
| who may take hours or days to create renderings of a few
| options. I don't think it'll put designers out of work but I
| do think they'll be using it regularly to boost their
| productivity.
| selestify wrote:
| What are you using GPT-3 for in a commercial setting?
| mysore wrote:
| it's a hard problem. at least they tried.
| Jerrrry wrote:
| It's not a "problem," it's an unwanted shard of reality
| piercing through an ideological guise.
| gnulinux wrote:
| How's it NOT a problem? If I'm trying to produce "stock
| people images", and if it only gives me white men, it's
| clearly broken because when I ask for "people" I'm actually
| asking for "people". I'm having difficulty understanding
| how it can be considered to be working as intended, when it
| literally doesn't. Clearly, the software has substantial
| bias that gets in way of it accomplishing its task.
|
| If I want to produce "animal images" but it only produces
| images of black cats, do you think there is any question
| whether it's a problem or not?
| mysterydip wrote:
| That's what Jerrrry is saying. Framing the reality of
| diversity in the world as a "problem" is wrong.
| ceeplusplus wrote:
| Black people comprise 12.4% of the US population, yet
| they are represented at substantially above that in
| "OpenAI"'s "bias removal" process. Clearly it has, as you
| put it, substantial bias that gets in the way of
| accomplishing its task.
| Jerrrry wrote:
| That is clearly overfitting due to unrepresentative
| training data.
|
| The "issue" is a different one: that training data - IE,
| reality, has _unwanted_ biases in it, because reality is
| biased.
|
| Producing images of men when prompting for "trash
| collecting workers" should not be much of a surprise: 99%
| of garbage collection/refuse is handled by men. I doubt
| most will consider this a "problem": because of our own
| biases, nobody cares about women being represented in a
| "shitty" job.
|
| But ask for a picture of CEOs, and then act surprised when
| most images are of white men? Only outrage, when
| proportionally, CEOs are, on average, white men.
|
| The "problem" arises when we use these tools to make
| decisions and further affect society - it has the obvious
| issue of further entrenching stereotypical associations.
|
| This is not that. Asking DALLE for a bunch of football
| players, would expectedly produce a huddled group of
| black men. No issue, because the NFL are
| disproportionately black men. No outrage, either.
|
| Asking DALLE for a group of criminals, likewise, produces
| a group of black men. Outrage! Except statistically, this
| is not a surprise, as a disproportionate amount of
| criminals are black men.
|
| The "problem" is with reality being used as training
| data. The "problem" is with our reality, not the tooling.
|
| Except in the cases where these toolings are being used
| to affect society - the obvious example being insurance
| ML algorithms. et al - we should strive to fix the issues
| present in reality, not hide them with handicapped
| training data, and malformed inputs.
| TomWhitwell wrote:
| In the UK... "The Environmental Services Association, the
| trade body, said that only 14 per cent of the country's
| 91,300 waste sector workers were female." So 2x dall-e
| searches should produce 1.2 women.
| CuriousSkeptic wrote:
| > Asking DALLE for a bunch of football players, would
| expectedly produce a huddled group of black men
|
| I think, for about 95% of the world football is
| synonymous with soccer. Its kind of interesting that you
| take this particular example to represent what reality
| looks like statistically
| less_less wrote:
| > This is not that. Asking DALLE for a bunch of football
| players, would expectedly produce a huddled group of
| black men. No issue, because the NFL are
| disproportionately black men. No outrage, either.
|
| This is not great. Only about 57% of NFL players are
| black, and the percentage is more like 47% among college
| players. It would be better to at least reflect the
| diversity of the field, even if you don't think it should
| be widened in the name of dispelling stereotypes.
|
| > Asking DALLE for a group of criminals, likewise,
| produces a group of black men. Outage! Except
| statistically, this is not a surprise, as a
| disproportionate amount of criminals are black men.
|
| Only about 1/3 of US prisoners are black. (Not quite the
| same as "criminals" but of course we don't always know
| who is committing crimes, only who is charged or
| convicted.) That's disproportionate to their population,
| but it's not even close to a majority. If DALLE were to
| exclusively or primarily return images of black men for
| "criminals", then it would be reinforcing a harmful
| stereotype that does not reflect reality.
| stuckinhell wrote:
| Everything is an ideological war zone now. That's the world
| we live in now.
| Fnoord wrote:
| Perhaps its a problem you don't care about?
| ketzo wrote:
| serious question: in what way is that not a "problem?"
| TheFreim wrote:
| It's not a problem in a few ways, let me know what you
| think (feel free to ask for clarification).
|
| 1. The training data would've been the best way to get
| organic results, the input is where it'd be necessary to
| have representative samples of populations.
|
| 2. If the reason the model needs to be manipulated to
| include more "diversity" is that there wasn't enough
| "diversity" in the training set then its likely the
| results will be lower quality
|
| 3. People should be free to manipulate the results how
| they wish, a base model without arbitrary manipulations
| of "diversity" would be the best starting point to allow
| users to get the appropriate results
|
| 4. A "diverse" group of people depends on a variety of
| different circumstances; if their method of increasing it
| is as naive as some here are claiming, this could result
| in absurdities when generating historical images or
| images relating to specific locations/cultures where
| things will be LESS representative
| bobcostas55 wrote:
| Well, it's a problem for the ideology.
| kache_ wrote:
| While their heart is in the right place, I'd like to
| challenge the idea that certain groups are so fragile that
| they don't understand that historically, there are more
| pictures of certain groups doing certain things.
|
| It's a hard problem for sure. But remember, the bias ends
| with the user using the tool. If I want a black scientist, I
| can just say "black scientist".
|
| Let _me_ be mindful of the bias, until we have a generally
| intelligent system that can actually do it. I'm generally
| intelligent too, you know.
| micromacrofoot wrote:
| Historically this is true, but it also seems dangerous to
| load up these algorithms with pure history because they'll
| essentially codify and perpetuate historical problems.
| UmYeahNo wrote:
| >But remember, the bias ends with the user using the tool.
| If I want a black scientist, I can just say "black
| scientist".
|
| That is a really, _really_ narrow viewpoint. I think what
| people would prefer is that if you query "Scientist" that
| the images returned are as likely to be any combination of
| gender and race. It's not that a group is "fragile", it's
| that they have to specify race and gender at all, when that
| specificity is not part of the intention. It seems that
| they recognize that querying "Scientist" will predominantly
| skew a certain way, and they're trying in some way to
| unskew.
|
| Or, perhaps, you'd rather that the query be really, really
| specific? like: "an adult human of any gender and any race
| and skin color dressed in a laboratory coat...", but I
| would much rather just say "a scientist" and have the
| system recognize that _anyone_ can be a scientist.
|
| And then if I need to be specific, then I would be happy to
| say "a black-haired scientist"
| numpad0 wrote:
| Kind of funny that NN tech is supposed to construct some
| upper dimensional understanding, yet realistically cannot
| be expected to generate a gender- and race-indeterminate
| portrayal of a scientist.
| kache_ wrote:
| This is a problem with generative models across the
| board. It's important that we don't skew our perceptions
| by GAN outputs as a society, so it's definitely good that
| we're thinking about it. I just wish that we had a
| solution that solved across the class of problems
| "Generative AI feeds into itself and society (which is in
| a way, a generative AI), creating a positive feedback
| loop that eventually leads to a cultural freeze"
|
| It's way bigger than just this narrow race issue the
| current zeitgeist is concerned about.
|
| But I agree, maybe I should skew to being optimistic that
| at least we're _trying_
| throwaway4aday wrote:
| Have you seen the queries that are used to generate
| actually useful results rather than just toy
| demonstrations? They look a lot more like your first
| example except with more specificity. It'd be more like
| "an adult human of any gender and any race and skin color
| dressed in a laboratory coat standing by a window holding
| a beaker in the afternoon sun. 1950s, color image, Canon
| 82mm f/3.6, desaturated and moody." so if instead you are
| looking for an image with a person of a specific
| ethnicity or gender then you are for sure going to add
| that in along with all of the details. If you are instead
| worried about the bias of the person choosing the image
| to use then there is nothing short of restricting them to
| a single choice that will fix that and even in that case
| they would probably just not use the tool since it wasn't
| satisfying their own preferences.
| protonbob wrote:
| Honestly I would rather that they not try. I don't understand
| why a computer tool has to be held to a political standard.
| daemoens wrote:
| It's not a political standard though. There is actual
| diversity in this world. Why wouldn't you want that in your
| product?
| [deleted]
| mensetmanusman wrote:
| Fix the data input side, not the data output side. The
| data input side is slowly being fixed in real time as the
| rest of the world gets online and learns these methods.
| throwaway4aday wrote:
| In a sane world we would be able to tack on a disclaimer
| saying "This model was trained on data with a majority
| representation of caucasian males from Western English
| speaking countries and so results may skew in that
| direction" and people would read it and think "well, duh"
| and "hey let's train some more models with more data from
| around the world" instead of opining about systemic
| racism and sexism on the internet.
| astrange wrote:
| That wouldn't necessarily fix the issue or do anything. A
| model isn't a perfect average of all the data you throw
| into its training set. You have to actually try these
| things and see if they work.
| norwalkbear wrote:
| I agree, the trust is broken now. I'm going to skip any
| AI that pulls that crap.
| Jerrrry wrote:
| There are legitimate reasons to reduce externalizations of
| society's innate biases.
|
| A mortgage AI that calculates premiums for the public
| shouldn't bias against people with historically black
| names, for example.
|
| This problem is harder to tackle because it is difficult to
| expose and redesign the "latent space" that results in these
| biases; it's difficult to massage the ML algos to identify
| and remove the pathways that result in this bias.
|
| It's simply much easier to allow the robot to be
| bias/racist/reflective of "reality" (its training data),
| and add a filter / band-aid on top; which is what they've
| attempted.
|
| when this is appropriate is the more cultured question; I
| don't think we should attempt to band-aid these models, but
| for more socially-critical things, it is definitely
| appropriate.
|
| It's naive on either extreme: do we reject reality, and
| substitute our own? Or do we call our substitute reality,
| and hope the zeitgeist follows?
| ceeplusplus wrote:
| That's great, but by doing so you are also inadvertently
| favoring, in your example, the people with black names.
| For example, Chinese people save, on average, 50 times
| more than Americans according to the Fed [1]. That would
| mean they would generally be overrepresented in loan
| approvals because they have a better balance sheet. Does
| that necessarily mean that Americans are discriminated
| against in the approval process? No.
|
| My question to you is: is an algorithm that takes no
| racial inputs (name, race, address, etc) yet still
| produces disproportionate results biased or racist? I say
| no.
|
| [1] https://files.stlouisfed.org/files/htdocs/publications/es/08...
| Jerrrry wrote:
| I would agree that it is not.
|
| The government, and many people, have moved the
| definition and goal posts; so that anything that has the
| end result of a non-proportional uniformity can be
| labeled and treated as bias.
|
| Ultimately it is a nuanced game; is discriminating
| against certain clothing or hair-styles racist? Of
| course. Yet neither of those is explicitly tied to
| one's skin color or ethnicity; they are indirect
| associative traits arising from culture.
|
| In America, we have intentionally muddled the waters of
| demarcation between culture and race, and are starting to
| see the cost of that.
| mh- wrote:
| _> A mortgage AI that calculates premiums for the public
| shouldn't bias against people with historically black
| names, for example._
|
| That's a great example, thanks. Also, I hope the teams
| working on that come up with a different solution...
| [deleted]
| [deleted]
| tablespoon wrote:
| > Turns out that they randomly, silently modify your prompt
| text to append words like "black male" or "female".
|
| I wonder what the distribution of those modifications is?
| Hard_Space wrote:
| Today, when DALL-E was still free, my Dad asked me to try a
| prompt about the Buddha sitting by a river, contemplating. I
| did about 4 prompt variations, and one of them was an Asian
| female, if that gives any idea about the frequency (I should
| note that the depiction was of a young, slim, and attractive
| female Buddha, so I'm not sure they have the bias thing
| licked just yet).
| speedgoose wrote:
| In my limited testing, diversity in ethnicities was achieved
| but wasn't realistic given the context. I also got a few
| androgynous people when I asked for a male or a female and
| another gender was appended.
| Invictus0 wrote:
| A dumb solution to a dumber problem.
| tshaddox wrote:
| That Twitter thread is full of people saying "yeah that doesn't
| seem to be true at all" so I'm hesitant to jump to conclusions
| even if we're deciding to believe random tweets.
| causi wrote:
| Interesting. Considering this is now a paid product, is
| modifying user input covered by their ToS? If I was spending a
| lot of money on it I'd be rather annoyed my input was being
| silently polluted.
| zikduruqe wrote:
| Don't spend money. Use https://www.craiyon.com
| scott_s wrote:
| _[shudder]_
|
| I tried the first whimsical, benign thing I could think of:
| "indiana jones eating spaghetti." The results are clearly
| recognizable as that. But they are also a kaleidoscope of
| body horror; an Indiana Jones monster melted into Cthulhu
| forms inhaling plates that are slightly _not_ spaghetti.
| bhaney wrote:
| This produces dramatically worse results in my experience.
| minimaxir wrote:
| Not worse, but different. It depends on the prompt but
| DALL-E mini/mega seems to do better than DALL-E 2 for
| certain types of absurd prompts, such as the ones in
| /r/weirddalle
| causi wrote:
| Yes, there are very sharp lines where it does and doesn't
| understand. It understands color and gender but not
| materials. I got very good outputs for "blue female
| Master Chief" but "starship enterprise made out of candy"
| was complete garbage.
| elcomet wrote:
| Definitely worse-quality. Maybe more diverse for some
| prompts yeah.
| kuprel wrote:
| This one is faster, I ported it
| https://replicate.com/kuprel/min-dalle
| minimaxir wrote:
| Additionally, it's also open-sourced on GitHub and can be
| self-hosted, with easy instructions to do so:
| https://github.com/kuprel/min-dalle
| practice9 wrote:
| Thankfully it doesn't introduce any researcher bias,
| doesn't ban people from using it on the basis of country,
| doesn't use your personal data like phone number...
|
| And the best of all - it does have a meme community around
| it, and you can always donate if you feel it adds value to
| your life
| kingkawn wrote:
| The racist pollution came long before this product was a
| glimmer in our eye.
| tptacek wrote:
| Your input isn't being polluted by this any more than it is
| when the tokens in it are ground up into vectors and
| transformed mathematically. You just have an easier time
| understanding this transformation.
| throwuxiytayq wrote:
| Obviously, it's polluted. Indisputably. In a mathematical
| sense, an extra (black box) transformation is performed on
| the input to the model. In a practical sense (eg. if you're
| researching the model), this is like having dirty
| laboratory tools - all measurements are slightly off. The
| presumption by OpenAI is that the measurements are _off in
| the correct way_.
|
| I'm interested in using Dall-E commercially, but I think
| some competitor offering sampling with raw input will have
| a better chance at my wallet.
| tptacek wrote:
| throwuxiytayq wrote:
| Yeah man, but literally the entire point of this AI
| picture generator is that it's, like, super _accurate_ at
| rendering the prompt, and stuff.
|
| I don't understand the relevance of the black box's
| scrutability - _I just want to play with the black box_.
| I am interested in increasing my understanding of the
| black box, not of a trust-me-it's-great-our-intern-
| steve-made-it black box derivative.
| tptacek wrote:
| You should make your own black boxes then. By all means,
| send your dollars to whatever service passes your purity
| test; I'm just saying that the idea that DALL-E is
| "polluting" your input is risible. It's already polluting
| your data at, like, a subatomic level, at
| dimensionalities it hadn't even occurred to you to
| consider, and at enormous scale.
| bantou_41 wrote:
| Diversity = black now? That's even more racist.
| xyzzyz wrote:
| Diversity has meant exactly that all the way since Bakke.
| [deleted]
| konfusinomicon wrote:
| as far as I can tell, they also concatenate "On LSD" to every
| prompt as well.
| DecayingOrganic wrote:
| Since many people will start generating their first images soon,
| be sure to check out this amazing DALL-E prompt engineering book
| [0]. It will help you get the most out of DALL-E.
|
| [0]: https://dallery.gallery/wp-content/uploads/2022/07/The-DALL%... (PDF)
| ru552 wrote:
| nice write up, thanks
| uplifter wrote:
| Thanks for this! A bit of prompt engineering know-how will help
| me get the most bang for the buck out of this beta. I also just
| want to say that dallery.gallery is delightfully clever naming.
| ZeWaka wrote:
| This is absolutely amazing. Thanks!
| c0decracker wrote:
| Interesting. I got access a couple of weeks ago (was on the
| waitlist since the initial announcement) and frankly, as much as
| I really want to be excited and like it, DALL-E ended up being a
| bit underwhelming. IMHO the results produced are often of low
| quality (distorted images, or quite wacky representations of the
| query). Some styles of imagery are certainly a better fit for
| being generated by DALL-E, but as far as commercial usage I think
| it needs a few iterations and probably even larger underlying
| model.
| simonw wrote:
| This book has some very good, actionable advice on crafting
| prompts that get better results out of DALL-E:
| https://dallery.gallery/the-dalle-2-prompt-book/
| andybak wrote:
| I also got access a couple of weeks ago and I can't fathom how
| anyone could be underwhelmed by it.
|
| What were you expecting?
| c0decracker wrote:
| Fundamentally I have two categories of issues I see with
| DALL-E, but please don't get me wrong -- I think this is a
| great demonstration of what is possible with huge models and
| I think OpenAI work in general is fantastic. I will most
| certainly continue using both DALL-E and OpenAI's GPT3.
|
| (1) Between what DALL-E can do today and commercial utility
| there is a rift, in my opinion. I readily admit that I have not
| done hundreds of queries (thank you folks for pointing that out,
| I'll practice more!) but that means there is a learning curve,
| doesn't it? I can't just go to DALL-E, mess with it for 5-10
| minutes and get my next ad or book cover or illustration for my
| next project done.
|
| (2) I think DALL-E has issues with faces and the human form in
| general. The images it produces are often quite repulsive and
| take the uncanny valley to the next level. I surprised myself
| when I noticed thinking that the images of humans DALL-E
| produced lack... soul? Cats and dogs, on the other hand, it
| handles much better. I've done tests with other entities -- say
| cars or machinery -- and it generally performs so-so with them
| too, often creating disproportionate representations of them or
| misplacing chunks. If you're querying for multiple objects in a
| scene it quite often melds them together. This is more
| pronounced in photorealistic renderings. When I query for a
| painting style it works somewhat better. That said, every now
| and then it does produce a great image, but with this way of
| arriving at it, how fast will I have to replenish those
| credits?.. :)
|
| All in all though I think I am underwhelmed mostly because my
| initial expectations were off, I am still a fan of DALL-E
| specifically and GPT3 in general. Now when is GPT4 coming
| out? :)
| harpersealtako wrote:
| Dalle seems to only have a few "styles" of drawing that it is
| actually "good" at. It is particularly strong at these styles
| but disappointingly underwhelming at anything else, and will
| actively fight you and morph your prompt into one of these
| styles even when given an inpainting example of exactly what
| you want.
|
| It's great at photorealistic images like this:
| https://labs.openai.com/s/0MFuSC1AsZcwaafD3r0nuJTT, but it's
| intentionally lobotomized to be bad at faces, and often has
| an uncanny valley feel in general, like this:
| https://labs.openai.com/s/t1iBu9G6vRqkx5KLBGnIQDrp (never
| mind that it's also lobotomized to be unable to recognize
| characters in general). It's basically as close to perfect as
| an AI can be at generating dogs and cats though, but anything
| else will be "off" in some meaningful ways.
|
| It has a particular sort of blurry, amateur oil painting
| digital art style it often tries to use for any colorful
| drawings, like this:
| https://labs.openai.com/s/EYsKUFR5GvooTSP5VjDuvii2 or this:
| https://labs.openai.com/s/xBAJm1J8hjidvnhjEosesMZL . You can
| see the exact problem in the second one with inpainting: it
| utterly fails at the "clean" digital art style, or drawing
| anything with any level of fine detail, or matching any sort
| of vector art or line art (e.g. anime/manga style) without
| loads of ugly, distracting visual artifacts. Even Craiyon and
| DALLE-mini outperform it on this. I've tried over 100 prompts
| to get stuff like that to generate and have not had a single
| prompt that is able to generate anything even remotely good
| in that style yet. It seems almost like it has a "resolution"
| of detail for non-photographic images, and any detail below a
| certain resolution just becomes a blobby, grainy brush
| stroke, e.g. this one:
| https://labs.openai.com/s/jtvRjiIZRsAU1ukofUvHiFhX , the
| "fairies" become vague colored blobs here. It can generate
| some pretty ok art in very specific styles, e.g. classical
| landscape paintings:
| https://labs.openai.com/s/6rY7AF7fWPb5wWiSH0rAG0Rm , but for
| anything other than this generic style it disappoints _hard_.
|
| The other style it is ok at is garish corporate clip art,
| which is unremarkable and there's already more than enough
| clip art out there for the next 1000 years of our collective
| needs -- it is nevertheless somewhat annoying when it
| occasionally wastes a prompt generating that crap because you
| weren't specific that you wanted "good" images of the thing
| you were asking for.
|
| The more I use DALLE-2 the more I just get depressed at how
| much wasted potential it has. It's incredibly obvious they
| trimmed a huge amount of quality data and sources from their
| databases for "safety" reasons, and this had huge effects on
| the actual quality of the outputs in all but the most mundane
| of prompts. I've got a bunch more examples of trying to get
| it to generate the kind of art I want (cute anime art, is
| that too much to ask for?) and watching it fail utterly every
| single time. The saddest part is when you can see it's got
| some incredible glimpse of inspiration or creative genius,
| but just doesn't have the ability to actually follow through
| with it.
| napier wrote:
| GPT3 has seen similar lobotomization since its initial
| closed beta. Current davinci outputs tend to be quite
| reserved and bland, whereas when I first had the fortunate
| opportunity to experience playing with it in mid 2020, it
| often felt like tapping into a friendly genius with access
| to unlimited pattern recognition and boundless knowledge.
| harpersealtako wrote:
| I've absolutely noticed that. I used to pay for GPT-3
| access through AI Dungeon back in 2020, before it got
| censored and run into the ground. In the AI fiction
| community we call that "Summer Dragon" ("Dragon" was the
| name of the AI dungeon model that used 175B GPT-3), and
| we consider it the gold standard of creativity and
| knowledge that hasn't been matched yet even 2 years
| later. It had this brilliant quality to it where it
| almost seemed to be able to pick up on your unconscious
| expectations of what you wanted it to write, based purely
| on your word choice in the prompt. We've noticed that
| since around Fall 2020 the quality of the outputs has
| slowly degraded with every wave of corporate censorship
| and "bias reduction". Using GPT-3 playground (or story
| writing services like Sudowrite which use Davinci) it's
| plainly obvious how bad it's gotten.
|
| OpenAI needs to open their damn eyes and realize that a
| brilliant AI with provocative, biased outputs is better
| than a lobotomized AI that can only generate advertiser-
| friendly content.
| visarga wrote:
| So it got worse for creative writing, but it got much
| better at solving few-shot tasks. You can do information
| extraction from various documents with it, for example.
| napier wrote:
| I mean yes, you're right insofar as it goes. However
| nothing I am aware of implies technical reasons linking
| these two variables into a necessarily inevitable trade-
| off. And it's not only creative writing that's been
| hobbled; GPT3 used to be an _incredibly promising_
| academic research tool and given the right approach to
| prompts could uncover disparate connections between
| siloed fields that conventional search can only dream of.
|
| I'm eager for OpenAI to wake up and walk back the
| clumsy corporate censorship, and/or for competitors to
| replicate the approach and improve upon the original
| magic without the "bias" obsession tacked on. Real
| challenge though "bias" may pose in some scenarios,
| perhaps a better way to address this would be at the
| training data stage rather than clumsily gluing on an
| opaque approach towards poorly implemented, idealist
| censorship lacking in depth (and perhaps arguably, also
| lacking sincerity).
| arecurrence wrote:
| I suspect you simply need to use it more with a lot more
| variation in your prompts. In particular, it takes style
| direction and some other modifiers to really get rolling. Run
| at least a few hundred prompts with this in mind. Most will be
| awful output... but many will be absolute gems.
|
| It has, honestly, completely blown me away beyond my wildest
| imagination of where this technology would be at today.
| [deleted]
| dereg wrote:
| I felt the same way. If anything, I realized how soulless and
| uninteresting faceless art is. Dall-E 2 goes out of its way to
| make terrible faces for, I'm guessing, deepfake reasons?
| [deleted]
| choppaface wrote:
| A free alternative:
|
| https://huggingface.co/spaces/dalle-mini/dalle-mini
|
| Reminder that the OpenAI team claimed safety issues about
| releasing the weights. Now they're charging, when the above link
| GPU time is being paid for by investor dollars. I guess sama must
| be hurting if he can only afford OpenAI credit packs for
| celebrities and his friends.
| softwaredoug wrote:
| Surprised by the lack of comments on the ethics of DALL-E being
| trained on artists' content, whereas Copilot threads are chock full
| of devs up in arms over models trained on open source code. Isn't
| it the same thing?
| MWil wrote:
| I've been on the waitlist since April 16th. Would have loved to
| have played around with the alpha but now clearly my ability to
| experiment and learn to use the system to cut down on expenses is
| extremely limited.
| O__________O wrote:
| Two questions:
|
| (1) Any opinions on whether removing the watermark is possible? Is
| doing so against the terms of service?
|
| (2) Appears the output is still at 1024x1024 - what are options
| to upscale the resolution, for example would OpenCV super
| resolution work?
| jeanlucas wrote:
| It is possible, they confirmed on discord you can remove the
| watermark.
|
| Yep... the output resolution is an issue; I'd pay if an upgrade
| were available.
| O__________O wrote:
| Annoying that, if removing the watermark is allowed, it is
| even inserted. Imagine if Adobe did that.
|
| Here's more information on super resolution options beyond
| what Adobe already offers:
|
| (1) List of current options for super resolution:
|
| https://upscale.wiki/wiki/Different_Neural_Networks
|
| (2) Older example of one way to benchmark:
|
| https://docs.opencv.org/4.x/dc/d69/tutorial_dnn_superres_ben.
| ..
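|
| For the OpenCV route specifically, here is a minimal sketch of
| the dnn_superres module (requires opencv-contrib-python; the
| "ESPCN_x4.pb" model file and the 4x scale are assumptions, and
| the pretrained weights have to be downloaded separately):
|
|     import cv2
|
|     # Create the super-resolution object and load pretrained ESPCN weights.
|     sr = cv2.dnn_superres.DnnSuperResImpl_create()
|     sr.readModel("ESPCN_x4.pb")
|     sr.setModel("espcn", 4)  # algorithm name and scale must match the weights
|
|     img = cv2.imread("dalle_output_1024.png")
|     upscaled = sr.upsample(img)  # 1024x1024 -> 4096x4096
|     cv2.imwrite("dalle_output_4096.png", upscaled)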
| moron4hire wrote:
| How do you interface with DALL-E?
|
| For MidJourney I was painfully surprised to find that everything
| is done through chat messages on a Discord server.
|
| I'm not a paid member, so I have to enter my prompts in public
| channels. It's extremely easy to lose your own prompts in the
| rapidly flowing stream of prompts going on. I can kind of see why
| they did it that way--maybe, if I squint really hard--to try to
| promote visibility and community interaction, but it's just not
| happening. It's hard enough to find my own images, to say nothing
| of following what someone else is doing. This is literally the
| worst user experience I have ever had with a piece of software.
|
| There are dozens of channels. It's so spammy, doing it through
| Discord. It's constantly pinging new notifications and I have to
| go through and manually mute each and every one of the channels.
| Then they open a few dozen more. Rinse. Repeat.
|
| I understand paid users can have their own channels to generate
| images, but I really don't see the point in paying for it when,
| even subtracting the firehose of prompts and images, it's still
| an objectively shitty interface to have to do everything through
| Discord chat messages.
| neya wrote:
| I'm curious to know - does the community have any open source
| alternatives to DALL.E? For an initiative named OpenAI, keeping
| their source code and models closed behind a license is bullshit
| in my opinion.
| gwern wrote:
| EAI/Emad/et al's 'Stable Diffusion' model will be coming out in
| the next month or so. I don't know if it will hit DALL-E 2
| level but a lot of people will be using it based on the during-
| training samples they've been releasing on Twitter.
| minimaxir wrote:
| The best open-source-but-actually-can-be-run-on-simple-infra
| analog to DALL-E 2 is min-dalle:
| https://github.com/kuprel/min-dalle
| arecurrence wrote:
| LAION is working on open source alternatives. There's a lot of
| activity in their discord and they have amassed the necessary
| training data but I am uncertain as to whether they have
| obtained the funding needed to deliver fully trained models.
| Phil Wang created initial implementations of several papers,
| including Imagen and Parti, in his GitHub account, e.g.:
| https://github.com/lucidrains/DALLE2-pytorch
| draw_down wrote:
| selimnairb wrote:
| I like how everyone's face is rendered by DALL-E to look either
| like a still from a David Lynch film, or have teeth and hair
| coming out of weird places.
| pawelduda wrote:
| That's disappointing given that up until this point you could get
| 50 free uses per 24h. I expected it to get monetized eventually,
| but not so fast or so drastically. Well, I still had my fun and
| have to
| say the creations are so good it's often mind blowing there's an
| AI behind it.
| mysore wrote:
| they're a non-profit so the price is probably still dirt cheap
| ajafari1 wrote:
| Not correct. They have a for-profit entity now. That's why
| there is a huge incentive to monetize. Any for-profit
| investment gain is capped at 100x, with the rest required to
| go to their nonprofit. This commercialization is just as I
| predicted in my substack post 2 days ago that hit the front
| page of Hacker News: https://aifuture.substack.com/p/the-ai-
| battle-rages-on
| dougmwne wrote:
| Honestly, it is probably just that expensive to run. You can't
| expect someone to hand you free compute of significant value
| and directly charging for it is a lot better than other things
| they could do.
| bulbosaur123 wrote:
| hhmc wrote:
| So you actually _wanted_ images that perpetuate the biases of
| the world?
| Geonode wrote:
| Reducing bias means affecting the data, instead of letting
| the end user just choose an appropriate image generated by a
| clean data set.
| illwrks wrote:
| I thought the same thing but I think the commenter is making
| a joke, but I could be wrong.
|
| I think they are suggesting that things like this (neural
| nets etc) work using bias, and by removing "bias" the
| developers are making the product worse.
|
| It's a very sh!t comment if it's not a joke.
| aloisdg wrote:
| Just to be sure. Does "OC" here mean Original Comment?
| illwrks wrote:
| Typo, now fixed.
| minimaxir wrote:
| Unfortunately, the method OpenAI may be using to reduce bias
| (by adding words to the prompt unknown to the user) is a
| naive approach that can affect images unexpectedly and
| outside of the domain OpenAI intended:
| https://twitter.com/rzhang88/status/1549472829304741888
|
| I have also seen some cases where the bias correction may
| not be working at all, so who knows. And that's why
| transparency is important.
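|
| For illustration, a rough sketch of what that kind of hidden
| prompt rewrite could look like (purely a guess at the mechanism
| described above, not OpenAI's actual code; the word lists here
| are made up):
|
|     import random
|
|     # Hypothetical reconstruction of the naive "diversity" rewrite:
|     # if the prompt mentions a person but specifies no race or gender,
|     # silently append a demographic qualifier before generation.
|     QUALIFIERS = ["black", "white", "asian", "hispanic", "female", "male"]
|     PERSON_WORDS = {"person", "people", "ceo", "doctor", "nurse", "builder"}
|     SPECIFIED = {"black", "white", "asian", "hispanic", "female", "male",
|                  "man", "woman"}
|
|     def rewrite_prompt(prompt: str) -> str:
|         words = set(prompt.lower().split())
|         if words & PERSON_WORDS and not words & SPECIFIED:
|             return prompt + ", " + random.choice(QUALIFIERS)
|         return prompt
|
|     print(rewrite_prompt("a CEO giving a speech"))
|     # e.g. "a CEO giving a speech, female"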
| CobrastanJorji wrote:
| What a fascinating hack. I mean, yeah, naive and simplistic
| and doesn't really do anything interesting with the model
| itself, but props to the person who was given the "make
| this more diverse" instruction and said "okay, what's the
| simplest thing that could possibly work? What if I just
| append some races and genders onto the end of the query
| string, would that mostly work?" and then it did! Was it a
| GOOD idea? Maybe not. But I appreciate the optimization.
| kmeisthax wrote:
| This sounds like something that could backfire _very badly_
| on certain prompts. "person eating a watermelon" for
| example.
| bulbosaur123 wrote:
| Yes, I did. I want it to show the world as it is, not as
| people want it to be.
| scifibestfi wrote:
| How do you remove bias as long as humans are in the loop?
| Aren't they just swapping one bias for their own?
| brycethornton wrote:
| I'm blown away by this:
|
| "Starting today, users get full usage rights to commercialize the
| images they create with DALL*E, including the right to reprint,
| sell, and merchandise. This includes images they generated during
| the research preview."
|
| I assumed this was going to be the sticking point for wider usage
| for a long time. They're now saying that you have full rights to
| sell Dall-E 2 creations?
| vlunkr wrote:
| Is the lesson here that these images are worth nothing so they
| lose nothing by giving them away?
| [deleted]
| nutanc wrote:
| And I just used it to create cover art for a book published on
| Amazon :)
|
| https://twitter.com/nutanc/status/1549798460290764801?s=20&t...
| pqdbr wrote:
| What was your prompt?
| nutanc wrote:
| "girl with a cap standing next to a shadow man having a
| speech bubble, digital art"
| pferdone wrote:
| Does DALL-E create different outputs for the same input? How
| does ownership work there?
| flatiron wrote:
| Yes, it will. It'll keep on augmenting the image until it
| recognizes it as matching the input.
| minimaxir wrote:
| Not only that, but you can also upload an image (that doesn't
| depict a real person) and generate variations of it without
| providing a prompt.
| berberous wrote:
| I think they are reacting to competition. MidJourney is
| amazing, was easier to get into, gives you commercial rights,
| and frankly I found more fun to use and even better output in
| most instances.
| napier wrote:
| The only thing I don't like about MidJourney is the Discord
| based interface. I think I can grok why Dave chose this route
| as it bakes in an active community element and allows users
| to pick up prompt engineering techniques osmotically... but
| I'd prefer a clean DALL-E style app and cli / api access.
| berberous wrote:
| In case you don't know, you can at least PM the MidJourney
| bot so you have an uncluttered workspace.
|
| It's clearly personal preference, but while I loathe Discord
| in general, I love it for MidJourney. As you said, there's an
| interactive element where I see other people doing cool
| things and adapting part of their prompts and vice versa.
| It really is fun. And when you do it in a PM, you have all
| your efforts saved. DALL-E is pretty clunky in that you
| have to manually save an image or lose it once your history
| rolls off.
| napier wrote:
| Thanks. Yeah, fair point; I haven't ponied up for a
| subscription yet, so I'm still stuck in public channels and
| often find my generations get lost in the stream. I imagine
| you're right that having the PM option would change the
| experience drastically for the better, albeit still within
| Discord's visually chaotic environment.
| davedx wrote:
| MidJourney seems a little less all-out commercial. The way
| everyone's creations are in giant open Discord channels is
| great too
| stoicjumbotron wrote:
| Really hope I get an invite for MidJourney soon. Been on the
| waitlist since March :(
| ozmodiar wrote:
| Midjourney is in open beta now. Just go to their site and
| you can get started right away. I got in and I wasn't even
| on their waiting list.
| pitzips wrote:
| Midjourney recently changed their terms of service and now
| the creators own the image and give a license back to
| Midjourney. Pretty cool.
| jaggs wrote:
| nightcafe.studio is also free and good. Very good.
| MatthiasPortzel wrote:
| MidJourney definitely struggles more with complex prompts
| from what I saw. If you like the output more, that's
| subjective, but I think DALL*E is the leader in the space by
| a wide margin.
| berberous wrote:
| I think both have strengths and weaknesses, but I don't
| disagree DALL-E in most instances is technically better at
| matching prompts. But I often enjoyed, artistically, the
| results of MidJourney more; it just felt fun to use and
| explore.
| skybrian wrote:
| Don't they both give you commercial rights now?
|
| I have access to both and they're good for different things.
| DALL-E seems somewhat more likely to know what you mean.
| Midjourney seems better for making interesting fantasy and
| science fiction environments.
|
| For comparison, I tried generating images of accordions.
| Midjourney doesn't really understand that an accordion has a
| bellows [1]. DALL-E manages to get the right shape much of
| the time, if you don't look too closely: [2], [3]. Neither of
| them knows the difference between piano and button
| accordions.
|
| Neither of them can draw a piano keyboard accurately, but
| DALL-E is closer if you don't look too hard. (The black notes
| aren't in alternating groups of two and three.)
|
| Neither of them understands text; text on a sign will be
| garbled. Google's Parti project can do this [4], but it's not
| available to the public.
|
| I expect DALL-E will have many people sign up for occasional
| usage, because if you don't use it for a few months, the free
| credits will build up. But Midjourney's pricing seems better
| if you use it every day?
|
| [1] https://www.reddit.com/r/Accordion/comments/uuwrbj/midjou
| rne...
|
| [2] https://www.reddit.com/r/Accordion/comments/vz9zxw/dalle_
| sor...
|
| [3] https://www.reddit.com/r/Accordion/comments/w0677q/accord
| ion...
|
| [4] https://parti.research.google/
| minimaxir wrote:
| Previously, OpenAI asserted they owned the generated images, so
| the new licensing is a shift in that aspect. GPT-3 has a
| "you own the content" clause as well.
|
| Of course, that clause won't deter a third party from filing a
| lawsuit against you if you commercialize a generated image
| _too_ close to something realistic, as the copyright status of
| AI-generated content still hasn't been legally tested.
| LegitShady wrote:
| As far as I can tell they still own the images; they just
| license your use of them commercially.
| pornel wrote:
| AFAIK only people can own copyright (the monkey selfie case
| tested this), and machine-generated outputs don't count as
| creative work (you can't write an algorithm that generates
| every permutation of notes and claim you own every song[1]),
| so DALL-E-generated images are most likely copyright-free. I
| presume OpenAI only relies on terms of service to dictate
| what users are allowed to do, but they can't own the images,
| and neither can their users.
|
| [1]: https://felixreda.eu/2021/07/github-copilot-is-not-
| infringin...
| TaylorAlexander wrote:
| The monkey selfie was not derived from millions of existing
| works, and that is the difference. If an artist has a well-
| known art style, and this algorithm was trained on it and
| can copy that style, would the artist have grounds to sue?
| I don't know.
| l33t2328 wrote:
| If I write a song am I not deriving it from the existing
| works I've been exposed to?
| TaylorAlexander wrote:
| Sure but if you just release a basic copy of a Taylor
| Swift song you will get sued to oblivion. So the law
| seems (IANAL) to care about how similar your work is to
| existing works. DALL-E does not seem capable of showing
| you the work that influenced a result, so users will have
| no idea if a result might be infringing. What this means
| to me is that with many users, some of the results would
| be legally infringing.
| Melting_Harps wrote:
| > If an artist has a well-known art style, and this
| algorithm was trained on it and can copy that style,
| would the artist have grounds to sue? I don't know.
|
| While nothing has been commercialized yet on the DALLE2
| subreddit, I know that it can do Dave Choe's work
| remarkably well. I also saw Alex Gray's work to be close,
| but not really identical either. It wasn't as intricate
| as his work is.
|
| It will be interesting if this takes off and a sort of
| Banksy effect takes over, where unless it's a physical
| piece of art it doesn't have much value, and it's only made
| all the better by some sort of polemic attached to it, e.g.
| Girl with Balloon.
| lancesells wrote:
| I'm going to guess there's not going to be much value
| placed on anything out of DALL-E for a long while. Digital
| art is typically worth much less than physical art, and I
| would say these generated images are going to be worth less
| than digital art made by a human hand.
|
| There will be outliers of course but I would be shocked
| if there's much of a market in it for at least the
| present.
| napier wrote:
| When these tools can generate layered TIFF/PSD images and
| polygon meshes and automate UV packing, then we'll be
| talking.
| ZetaZero wrote:
| > If an artist has a well-known art style, and this
| algorithm was trained on it and can copy that style...
|
| A lawyer could argue that the algorithm is producing a
| derivative work of the copyrighted input.
| TaylorAlexander wrote:
| Right but if that work isn't significantly changed from
| the source, it could be ruled as infringement. DALL-E
| cannot tell the users (afaik) if a result is close to any
| source material.
| lbotos wrote:
| Well, music is not "pictures" but Marvin Gaye's family
| got 5 million because Blurred Lines sounds similar enough
| to a Marvin Gaye song (even though it was not a sample):
| https://en.wikipedia.org/wiki/Pharrell_Williams_v._Bridge
| por...
| [deleted]
| ChadNauseam wrote:
| Even if you imitate someone's style intentionally, they
| don't have grounds to sue. Style isn't copyrightable in
| the US. Whether DALL-E outputs are a derivative work is a
| different question, though
| fanzhang wrote:
| If this were a concern, a user can easily bypass this by
| having a work-for-hire person add a minor transform layer
| on top of the DALL-E generated images right?
| JacobThreeThree wrote:
| Wouldn't it have to meet the threshold of being a
| "transformative" work?
|
| https://en.wikipedia.org/wiki/Transformative_use
| leereeves wrote:
| > DALL-E-generated images are most likely copyright-free
|
| The US Copyright Office did make a ruling that might
| suggest that recently[1], but crucially, in that case, the
| AI "didn't include an element of human authorship." The
| board might rule differently about DALL-E because the
| prompts do provide an opportunity for human creativity.
|
| And there's another important caveat that the felixreda.eu
| link seems to miss. DALL-E output, whether or not it's
| protected by copyright, can certainly _infringe_ other
| copyrights, just like the output of any other mechanical
| process. In short, Disney can still sue if you distribute
| DALL-E generated images of Marvel characters.
|
| 1: https://www.theverge.com/2022/2/21/22944335/us-
| copyright-off...
| totetsu wrote:
| Can I infringe another DALL-E user's rights if I take an
| image generated by their account and sell prints of it?
| unnah wrote:
| DALL-E can generate recognizable pictures of Homer Simpson,
| Batman and other commercial properties. Such images could
| easily be considered derivative works of the original
| copyrighted images that were used as training input. I'm
| sure there are plenty of corporate IP lawyers ready to
| argue the point at court.
| numpad0 wrote:
| I'm kind of surprised that no one has found "verbatim
| copy" cases like the ones made against GitHub Copilot.
| Such exact copies are likely easier to spot in images than
| in code snippets.
| Nition wrote:
| It might be interesting to find an image in the training
| set with a long, very unique description, and try that
| exact same description as input in DALL*E 2.
|
| Of course it's unlikely to produce the exact same image,
| or if it does, you've also discovered an incredible image
| compression algorithm.
| obert wrote:
| They still own the generated content and only grant usage
| rights. I have mixed feelings about this confused approach;
| it won't last long.
|
| > ...you own your Prompts and Uploads, and you agree that
| OpenAI owns all Generations...
| mensetmanusman wrote:
| Image generating artificial intelligence is very analogous to
| a camera.
|
| Both technologies have billions of dollars of R&D and tens of
| thousands of engineers behind the supply chains necessary to
| create the button that a user has to press.
| minimaxir wrote:
| There have been decades of litigation around when/where/of
| whom you can take a photo. AI generated art isn't there.
| mensetmanusman wrote:
| They will benefit by getting additional feedback on which
| output images are most useful.
| minimaxir wrote:
| DALL-E 2 has a "Save" feature which is likely a data
| gathering mechanism for this use case.
| Melting_Harps wrote:
| > "Starting today, users get full usage rights to commercialize
| the images they create with DALL*E, including the right to
| reprint, sell, and merchandise. This includes images they
| generated during the research preview."
|
| >> And I just used it to create cover art for a book published
| in Amazon :)
|
| Man... what a missed opportunity for Altman... he could have
| had a really good cryptocurrency/token with a healthy ecosystem
| and a creative-based community if he hadn't pushed this
| Worldcoin biometric-harvesting BS and had instead waited for
| this to release and coupled it with access to GPT.
|
| This is the kind of thing that Web3 (a joke) was pushing for
| all along: revolutionary tech that the everyday person can
| understand, with its own token-based ecosystem for access and
| full creative rights to the prompts.
|
| I wonder: if he stepped down from OpenAI and put a
| figurehead in as CEO, could this still work?
|
| > Why is using a token better than using money, in this case?
|
| It would be better for OpenAI if it could monetize not just its
| subscription-based model via a token to pay for overhead and
| further R&D, but also its ability to issue tokens it can freely
| exchange for utility on its platform: exclusive access outside
| of its capped $15 model, and pay-as-you-go plans for those who
| don't have access, like myself, since it's limited to 1 million
| users.
|
| I don't want an account, and I think that type of gatekeeping
| wasn't cool during the Gmail days either (I had early access
| back then too), but I'd still personally buy $100s worth of
| prompts right now, since I think it's a fascinating use of NLP.
| I'm just one of many missed opportunities, part of a lost
| userbase who just wants access for specific projects. By doing
| this they could still retain the usage caps on their platform
| and expand and contract them as they see fit without excluding
| others.
|
| This in turn could justify the continual investment from the VC
| world into these projects (under the guise of Web3) and allow
| them to scale into viable businesses and further expand the use
| of AI/ML into other creative spaces, which, as a person studying
| AI and ML with a background in BTC, is what we all wanted to see
| instead of the aimless bubbles we've seen in things like Solana
| or yield farming via fake DeFi projects like Celsius.
|
| It would legitimize the use of a token for an ecosystem model
| outside of BTC, which to be honest doesn't really exist and
| still has a tarnished image after all these failed projects,
| while gaining reception amongst a greater audience, since it's
| captivated so many since its release.
| pliny wrote:
| Why is using a token better than using money, in this case?
| mod wrote:
| I assume something to do with proving ownership via NFT.
| rvz wrote:
| It also means there will possibly be another renaissance of
| fully automated, mass generated NFTs and tons of derivatives
| and remixes flooding the NFT market in an attempt to pump the
| NFT hype again.
|
| It doesn't matter, OpenAI wins anyway as these companies will
| pour hundreds of thousands into generated images.
|
| It seems that the NFT grift is about to be rebooted again, such
| that it isn't going to die _that_ quickly. But still,
| eventually 90% of these JPEG NFTs will die anyway.
| WalterSear wrote:
| NFTs were never limited by artwork availability - they are
| limited by wash-trading ability.
| rvz wrote:
| These highly photorealistic images can be generated at
| mass scale, completely automated, without a human, which
| ultimately cuts out the need for an artist.
|
| Artists will be replaced by DALL*E 2 for creating these
| illustrations, book covers, NFT variants, etc., opening up
| the whole arena to anyone to do this themselves. All it
| takes is to _describe what they want in text_ and in less
| than a minute the work is delivered, for as little as $15.
|
| OpenAI still wins either way. If a crypto company turns to
| using DALL*E 2 to generate photorealistic NFTs, OpenAI won't
| stop them and will take the money.
| WalterSear wrote:
| I'm not sure I understand the point you are trying to
| make.
|
| Art is already dirt cheap. People aren't buying NFTs for
| their content. This doesn't make it appreciably easier to
| con rubes.
| bilsbie wrote:
| Every tech should do this. Could Google Maps silently change
| your destination to a minority-owned alternative?
| [deleted]
| peteforde wrote:
| I have been having a blast with DALL-E, spending about an hour a
| day trying out wild combinations and cracking my friends up. I
| cannot imagine getting bored of it; it's like getting bored with
| visual stimulus, or art in general.
|
| In fact, I've been glad to have a 50/day limit, because it helps
| me contain my hyperfocus instincts.
|
| The information about the new pricing is, to me as someone just
| enjoying making crazy images, a huge drag. It means that to do
| the same 50/day I'd be spending $300/month.
|
| OpenAI: introduce a $20/month non-commercial plan for 50/day, and
| I'll be at the front of the line.
| jnovek wrote:
| My heart sank when I saw the pricing model.
|
| I've been creating generative art since 2016 and I've been
| anxiously waiting for my invite. I won't be able to afford to
| generate the volume of images it takes to get good ones at this
| price point.
|
| I can afford $20/mo for something like this, but I just can't
| swing the $200 to $300 it realistically takes to get
| interesting art out of these CLIP-centric models.
|
| Heck, the initial 50 images isn't even enough to get the hang
| of how the model behaves.
| blueboo wrote:
| If you're technically inclined, I urge you to explore some
| newer Colabs being shared in this space. They offer vastly
| more configurable tools, work great for free on Google Colab,
| and are straightforward to run on a local machine.
|
| Meanwhile we should prepare ourselves for a future where the
| best generative models cost a lot more as these companies
| slice and dice the (huge) burgeoning market here.
| pkaye wrote:
| I'm sure the prices will go down each year as the computing
| costs go down.
| wongarsu wrote:
| MidJourney is a good alternative. Maybe not quite as good as
| DALL-E, but close enough, without a waitlist and with hobby-
| friendly prices ($10/month for 200 images/month, or $30 for
| unlimited)
| commandlinefan wrote:
| > trying out wild combinations and cracking my friends up
|
| Wait until the next edition comes out where it automatically
| learns the sorts of things that crack you up and starts
| generating them without any input from you.
| Filligree wrote:
| MidJourney gives ~unlimited generation for $30/month, and is
| nearly as good. Unlike DALL-E it doesn't deliberately nerf face
| generation. I've been having a blast.
| irrational wrote:
| Sounds kind of like Scribblenauts. I would try the craziest
| things to see what it could come up with.
| dave_sullivan wrote:
| I think people don't realize how huge these models really are.
|
| When they're free, it's pretty cool. But charge an amount where
| there's actual profit in the product? Suddenly seems very
| expensive and not economically viable for a lot of use cases.
|
| We are still in the "you need a supercomputer" phase of these
| models for now. Something like DALLE mini is much more
| accessible but the results aren't good enough. Early early
| days.
| TigeriusKirk wrote:
| What _are_ the resources at work here?
|
| What are the resources needed to train this model?
|
| If someone just gave you the model for free, what resources
| would you need to use it to generate new results?
| dplavery92 wrote:
| In the unCLIP/DALL-E 2 paper[0], they train the
| encoder/decoder with 650M/250M images respectively. The
| decoder alone has 3.5B parameters, and the combined priors
| with the encoder/decoder are in the neighborhood of ~6B
| parameters. This is large, but small compared to the name-
| brand "large language models" (GPT-3 et al.).
|
| This means the parameters of the trained model fit in
| something like 7GB (decoder only, half-precision floats) to
| 24GB (full model, full-precision). To actually run the
| model, you will need to store those parameters, as well as
| the activations for each parameter on each image you are
| running, in (video) memory. To run the full model on device
| at inference time (rather than r/w to host between each
| stage of the model) you would probably want an enterprise
| cloud/data-center GPU like an NVIDIA A100, especially if
| running batches of more than one image.
|
| The training set size is ~97TB of imagery. I don't think
| they've shared exactly how long the model trained for, but
| the original CLIP dataset announcement used some benchmark
| GPU training tasks that were 16 GPU-days each. If I were to
| WAG the training time for their commercial DALL-E 2 model,
| it'd probably be a couple of weeks of training distributed
| across a couple hundred GPUs. For better insight into what
| it takes to train (the different stages/components of) a
| comparable model, you can look through an open-source
| effort to replicate DALL-E 2.[2]
|
| [0] https://cdn.openai.com/papers/dall-e-2.pdf [1]
| https://openai.com/blog/clip/ [2]
| https://github.com/lucidrains/dalle2-pytorch
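|
| As a quick sanity check of those memory figures, here is the
| back-of-the-envelope arithmetic, using only the parameter
| counts quoted above (weights only, no activations or optimizer
| state):
|
|     # Rough parameter-memory arithmetic for the figures above.
|     decoder_params = 3.5e9   # decoder alone
|     full_params = 6.0e9      # prior + encoder/decoder combined (approx.)
|
|     fp16_decoder_gb = decoder_params * 2 / 1e9   # 2 bytes per parameter
|     fp32_full_gb = full_params * 4 / 1e9         # 4 bytes per parameter
|
|     print(f"decoder in fp16:    ~{fp16_decoder_gb:.0f} GB")   # ~7 GB
|     print(f"full model in fp32: ~{fp32_full_gb:.0f} GB")      # ~24 GB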
| peteforde wrote:
| Thanks for the really excellent insight and links.
|
| I do hope that the conversation starts to acknowledge the
| difference between sunk costs and running costs.
|
| Employees, office leases and equipment are all ongoing
| costs, regardless.
|
| Training DALL-E 2: very expensive, but done now. A sunk
| cost where every dollar coming in makes the whole
| endeavor more profitable.
|
| Operating the trained model: still expensive, but you can
| chart out exactly how expensive by factoring in hardware
| and electricity.
|
| I believe that by not explicitly separating these
| different columns when discussing expense vs profit,
| we're making it harder than it needs to be to reason
| about what it actually costs every time someone clicks
| Generate.
| woojoo666 wrote:
| > This means the parameters of the trained model fit in
| something like 7GB (decoder only, half-precision floats)
| to 24GB (full model, full-precision)
|
| > you would probably want an enterprise cloud/data-center
| GPU like an NVIDIA A100, especially if running batches of
| more than one image.
|
| That doesn't seem so bad.
|
| _looks up price of NVIDIA A100 - $20,000_
|
| oh...ok I'll probably just pay for the service then
| fennecfoxen wrote:
| p4d.24xlarge is only $33/hr! And you get 400 Gbe so it
| should be quick to load.
| binarymax wrote:
| If I had to guess, based on other large models, it's in the
| range of hundreds of GBs. It might even be in the TB range.
| To host that model for fast production SaaS inference
| requires many GPUs. An A100 has 80GB, so a dozen A100s just
| to keep it in memory, and more if that doesn't meet the
| request demand.
|
| Training requires even more GPUs, and I wouldn't be
| surprised if they used more than 100 and trained over 3
| months.
| judge2020 wrote:
| > Training requires even more GPUs, and I wouldn't be
| surprised if they used more than 100 and trained over 3
| months.
|
| Based on this blog post where they scale to 7,500
| 'nodes', they say:
|
| > A large machine learning job spans many nodes and runs
| most efficiently when it has access to all of the
| hardware resources on each node.
|
| So I wouldn't be surprised if they do have a total of
| 7500+ GPUs to balance workloads between. To add, OpenAI
| has a long history of getting unlimited access to
| Google's clusters of GPUs (nowadays they pay for it,
| though). When they were training 'OpenAI Five' to play
| Dota 2 at the highest level, they were using 256 P100
| GPUs on GCP[0] and they casually threw 256 GPUs at 'clip'
| for a short while in January of 2021[1].
|
| As for how they do it, see these posts:
|
| https://openai.com/blog/techniques-for-training-large-
| neural...
|
| https://openai.com/blog/triton/
|
| 0: https://openai.com/blog/openai-five/
|
| 1: https://openai.com/blog/clip/
| dave_sullivan wrote:
| Facebook released over 100 pages of notes a few months ago
| detailing their training process for a model that is
| similar in size. Does anyone have a link? I can't seem to
| find it in my notes, googling links to posts that have been
| removed or are behind the facebook walled garden.
|
| But I seem to remember they were running 1,000+ 32gb GPUs
| for 3 months to train it and keeping that infrastructure
| running day-to-day and tweaking parameters as training
| continued was the bulk of the 100 pages. It is beyond the
| reach of anybody but a really big company, at least in the
| area of very large models, and the large models are where
| all the recent results are. I wish I was more bullish on
| algorithm improvements meaning you can get better results
| on less hardware; there will definitely be some algorithm
| improvements, but I think we might really need more
| powerful hardware too. Or pooled resources. Something.
| These models are huge.
| ninjaranter wrote:
| > Facebook released over 100 pages of notes a few months
| ago detailing their training process for a model that is
| similar in size. Does anyone have a link?
|
| Is https://github.com/facebookresearch/metaseq/blob/main/
| projec... what you're referring to?
| dave_sullivan wrote:
| Yes! Thank you! Very good read for anyone interested in
| the field.
| Ajedi32 wrote:
| Training is obviously very expensive, and ideally they'd
| want to recoup that investment. But I'm curious as to what
| the marginal cost is to run the model after it's trained.
| Is it close to 30 images per dollar, like what they're
| charging now? Or do training costs make up the majority of
| that price?
| sinenomine wrote:
| > I think people don't realize how huge these models really
| are.
|
| They really aren't that large by the contemporary _scaling
| race_ standards. DALLE-2 has 3.5B parameters, which should
| fit on an old GPU like Nvidia RTX2080, especially if you
| optimize your model for inference [1][2] which is commonly
| done by ML engineers to minimize costs. With an optimized
| model, your memory footprint is ~1 byte per parameter, plus
| some fraction of that (commonly ~0.2x the parameter count)
| to store intermediate activations.
|
| You should be able to run it on Apple M1/M2 with 16GB RAM via
| CoreML pretty fine, if an order of magnitude slower than on
| an A100.
|
| Training isn't unreasonably costly either: you can train such
| a model for O($100k), which is less than the yearly salary of
| a mid-tier developer in Silicon Valley.
|
| There is no reason these models shouldn't be trained
| cooperatively and run locally on our own machines. If someone
| is interested in cooperating with me on such a project, my
| email is in the profile.
|
| 1. https://arxiv.org/abs/2206.01861
|
| 2. https://pytorch.org/blog/introduction-to-quantization-on-
| pyt...
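|
| As a rough illustration of the kind of inference-time
| optimization being described, here is PyTorch dynamic
| quantization applied to a toy model (a stand-in only; DALL-E 2's
| weights aren't public, and the point is just that quantized
| weights land around 1 byte per parameter):
|
|     import torch
|     import torch.nn as nn
|
|     # Toy stand-in network; real diffusion models need more care,
|     # but the weight-storage effect is the same idea.
|     model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(),
|                           nn.Linear(4096, 1024))
|
|     quantized = torch.quantization.quantize_dynamic(
|         model, {nn.Linear}, dtype=torch.qint8)  # int8 weights, ~1 byte/param
|
|     x = torch.randn(1, 1024)
|     y = quantized(x)  # inference runs with int8 weights, fp32 activations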
| andreyk wrote:
| Check out Artbreeder, it is likewise a ton of fun!
|
| Multimodal.art (https://multimodal.art/) is working on a free
| version of something like DALLE, though it's not that good as
| of yet.
| nsxwolf wrote:
| I'm already bored of it. When you have everything, you have
| nothing.
| [deleted]
| peteforde wrote:
| I don't know how to say this without sounding like a jerk,
| even if I bend over backwards to preface that this isn't my
| intent: this statement says more about your creativity and
| curiosity than a ceiling on how entertaining DALL-E can be to
| someone who could keep multiple instances busy, like grandma
| playing nine bingo cards at once.
|
| Knowing that it will only get better - animation cannot be
| far behind - makes me feel genuinely excited to be alive.
| nwienert wrote:
| Dall-e has novelty, but no intent, meaning, originality.
| Yes the author can be creative at generating prompts, but
| visually I haven't seen it generate anything that feels
| artistically interesting. If you want pre-existing concepts
| in novel combinations then yes it works.
|
| It's good at "in the style of" but there's no "in a new
| style".
|
| It has a house style too that tends to feel Reddit-like.
| gsk22 wrote:
| Isn't every "new style" just a novel combination of pre-
| existing concepts? Nothing new under the sun and all
| that.
|
| Either way, I feel like your view is an exhaustingly
| pessimistic take on AI-generated art. I mean, sure, most
| of what DALL-E generates is pretty mundane, but other
| times I have been surprised at how bizarre and unique
| certain images are.
|
| You seem to imply that because an AI is not human, its
| art is not imbued with meaning or originality -- but I
| find that an AI's non-human nature is precisely what
| _makes_ the art so original and meaningful.
| hansword wrote:
| I would say it helps to first think what you want to get
| out of it.
|
| If your task is "show me something that breaks through
| our hyperspeed media", then I guess some obscure museum
| is a better place than an ML model.
|
| If your task is "find the best variation on theme X" or
| "quick draft visualization", they are often very useful.
| I am sure there will be many further tasks to which
| current and future models will be well suited. They are
| not magic picture machines. At least not yet.
| danielvaughn wrote:
| I'm sure the novelty wears off. But I'm already coming up
| with several applications for it.
|
| On the personal side, I've been getting into game
| development, but the biggest roadblock is creating concept
| art. I'm an artist but it takes a huge amount of time to get
| the ideas on paper. Using DALLE will be a massive benefit and
| will let me expedite that process.
|
| It's important to note that this is _not_ replacing my entire
| creative process. But it solves the issue I have, where I'm
| lying in bed imagining a scene in my mind, but don't have the
| time or energy to sketch it out myself.
| ausbah wrote:
| >I'm an artist but it takes a huge amount of time to get
| the ideas on paper.
|
| This is what I really like about DALL-E mini: its ability
| to create pretty good basic outlines for a scene. It's low-
| resolution enough that there's room for your own creativity
| while giving you a good template to spring off from: things
| like poses, composition of multiple people, etc.
| zanderwohl wrote:
| I've used AI to try out different composition/layout
| possibilities. Sometimes it comes up with an arrangement
| of objects I hadn't considered. Sometimes it uses colors
| in really interesting ways. Great jumping-off point for
| drafting.
| nsxwolf wrote:
| I did notice it is very good at making small pixel art
| icons/sprites.
| jnovek wrote:
| I've been using generative models as an art form in and of
| themselves since the mid/late 2010s. I like generating
| mundane things that bump right up along the edge of the
| uncanny valley and finding categories of images that
| challenge the model (e.g. for CLIP, phrases that have a
| clear meaning but are infrequently annotated).
|
| Generating itself can be art. I'm not going to win a
| Pulitzer here, it's for the personal joy of it, but I will
| certainly never get tired of it.
| thatguy0900 wrote:
| I've been having a blast using it in my dungeons and
| dragons games. If you type in, say, "dnd village battlemap"
| it's really pretty usable. Not to mention the wild magic
| weapons and monsters it can come up with.
| bfgoodrich wrote:
| zone411 wrote:
| I have some first-hand experience with how the Copyright Office
| views these works, from creating an AI assistant to help me
| write these melodies:
| https://www.youtube.com/playlist?list=PLoCzMRqh5SkFwkumE578Y....
| Here is a quote from the response from the Copyright Office email
| before I provided additional information about how they were
| created:
|
| "To be copyrightable, a work must be fixed in a tangible form,
| must be of human origin, and must contain a minimal degree of
| creative expression"
|
| So some employees there are aware of the impact that AI can have.
| Getting these DALL-E images copyrighted won't be trivial. I think
| it will be many years before the law is clarified.
| rvz wrote:
| > Starting today, users get full usage rights to commercialize
| the images they create with DALL*E, including the right to
| reprint, sell, and merchandise. This includes images they
| generated during the research preview.
|
| So DALL*E 2 is going to restart, revive and cause another
| renaissance of fully automated and mass generated NFTs, full of
| derivatives and remixing etc to pump up the crypto NFT hype
| squad?
|
| Either way, OpenAI wins again, as these crypto companies are
| going to churn out tens of thousands of generated images to pump
| their NFT griftopia back off life support, reconfirming that it
| isn't going to die that easily.
|
| Regardless of this possible revival attempt, 90% of these JPEG
| NFTs will _eventually_ still die.
| randomperson_24 wrote:
| I have tried playing around with the beta access to make it
| generate NFT art with different prompts, but in vail.
|
| I think it has not been trained on NFT art (crypto punks and so
| on).
| Melting_Harps wrote:
| > I think it has not been trained on NFT art (crypto punks
| and so on).
|
| How exactly are you defining NFT art?
|
| I mean, it can literally be anything: Dorsey sold a
| screencap of his 1st tweet, Nadya from Pussy Riot did some
| creative stuff, and the Ape crap was the bulk of this stuff
| that got passed around.
|
| I think what can be gleaned from that short-lived nonsense
| is that value is subjective and that the quality of a valuable
| piece of 'art' is equally hard to define. Much the same
| with its predecessor: CryptoKitties.
| SquareWheel wrote:
| Heads up: I think you meant "in vain" rather than "in vail".
| However, a similar phrase is "to no avail" which also means
| that something was not successful.
| ask_b123 wrote:
| I think you meant "in vain" rather than "in vein".
| SquareWheel wrote:
| I sure did! Thank you, I've corrected that now.
| Bud wrote:
| I don't see why there's any credible reason to expect that
| DALL-E will do anything at all to help those promoting the NFT
| silliness. Two separate issues.
| mdanger007 wrote:
| If OpenAI could make a profit selling Dall-E images as NFT,
| I'd assume they'd do it, yeah?
| Melting_Harps wrote:
| Altman tried his hand at that by launching Worldcoin, and
| it didn't go well at all.
|
| So I think it's prudent that OpenAI keep the 'sell shovels'
| business model instead with DALLE and GPT, at least for the
| time being.
| nbzso wrote:
| Anything to end Corporate Memphis. Even if we as illustrators
| will not have jobs or commissions. Let's hope that every creative
| human endeavour (painting, music, poetry) will be replaced and
| removed from the commercial realm. Then maybe we will see
| artistic humanism instead of synthetic trans-humanistic "pop
| art".
|
| Happily for me, I stopped painting digitally a long time ago. I
| even stopped calling myself "an artist". Nowadays I paint and
| draw only with real media and call all of that "Archivist
| craftsmanship with analogue medium". :)
| cm2012 wrote:
| I really want access, wish there was a way to pay to get in.
| TekMol wrote:
| I wonder how fast they will invite the 1 million users?
|
| I have been on the waitlist for a while and did not get access
| yet.
|
| Did anybody get access already today?
| deviner wrote:
| Nope, I've been on it for quite some time too.
| Sohcahtoa82 wrote:
| The name "OpenAI" to me implies being open-source.
|
| I have an RTX 3080 and will likely be buying a 4090 when it comes
| out. Will I ever be able to generate these images locally, rather
| than having to use a paid service? I've done it with DALL-E Mini,
| but the images from that don't hold a candle to what DALL-E 2
| produces.
| whywhywhywhy wrote:
| Their choice of name gets funnier every month.
| ronsor wrote:
| I'm not sure if any current or next-generation GPU even has
| enough power to run DALL-E 2 locally.
|
| Anyway, OpenAI is unlikely to release the model. The situation
| will likely be like it is with GPT-3; however, it's also likely
| another team will attempt to duplicate OpenAI's work.
| jazzyjackson wrote:
| From what I've seen it's all about the VRAM.
|
| if you've got 60GB available to your GPU then maybe you can get
| close
|
| I'm really curious if Apple's unified memory architecture is of
| benefit here, especially a few years from now if we can start
| getting 128/256GB of shared RAM on the SoC
| ajafari1 wrote:
| I wrote about this happening two days ago in my Substack post:
| "OpenAI will start charging businesses for images based on how
| many images they request. Just like Amazon Web Services charges
| businesses for usage across storage, computing, etc. Imagine a
| simple webpage where OpenAI will list out their AI-job suite,
| including "jobs" such as software developer, graphics designer,
| customer support rep, and accountant. You can select which
| service offerings you'd like to purchase ad-hoc or opt into the
| full AI-job suite."
|
| In case you are interested in reading the whole take:
| https://aifuture.substack.com/p/the-ai-battle-rages-on
| arrow7000 wrote:
| "Business monetises their offering" can't say I'm entirely
| blown away by the prediction
| ukzuck wrote:
| Can anyone invite me to DALL*E!
| [deleted]
| naillo wrote:
| This news is funny since it doesn't actually change anything.
| It's still a waitlist that they're pushing out slowly (not an
| open beta). Nice way to stay in the news though.
| outsider7 wrote:
| Amazing stuff (really fun)... can it solve climate change?
| bemmu wrote:
| I was supposed to be making a video game, but got a bit
| sidetracked when DALL*E came out and made this website on the
| side: http://dailywrong.com/ (yes I should get SSL).
|
| It's like The Onion, but all the articles are made with GPT-3 and
| DALL*E. I start with an interesting DALL*E image, then describe
| it to GPT-3 and ask it for an Onion-like article on the topic.
| The results are surprisingly good.
| jelliclesfarm wrote:
| Love it! Better than the other news I get to read these days.
| Some of it rings true... like the bluebird suing the cat.
|
| Thank you! Bookmarked!
| picozeta wrote:
| These are actually quite funny. A bit of a surreal touch, but
| that makes them even more fun.
| tiborsaas wrote:
| Thanks, finally a legit news publication :)
|
| This was really funny :)
|
| http://dailywrong.com/man-finally-comfortable-just-holding-a...
| biztos wrote:
| So the other men in the pictures are the uncomfortable ones?
| zanderwohl wrote:
| Somehow these articles are more readable than typical AI-
| generated search engine fodder... Is it because I'm entering
| the site with an expectation of nonsense?
| slavak wrote:
| Probably because, by the creator's own admission, the
| articles are heavily cherry-picked to make sure the output
| is decent, which is probably a lot more human effort than
| goes into the aforementioned search engine fodder.
|
| http://dailywrong.com/sample-page/
| pwillia7 wrote:
| I would guess that most spam farms are not using OpenAI's
| Davinci model, which is really, really good, but expensive.
| Just a guess.
| hanselot wrote:
| layer8 wrote:
| This one seems like it could actually be real in Japan:
| http://dailywrong.com/anime-pillow-gym-opens-in-tokyo/ ;)
| busyant wrote:
| This is clever. Does GPT-3 come up with the title of the
| article, too? That's the funniest part.
| bemmu wrote:
| At first I came up with them myself, but found that it often
| comes up with better ones, so I ask it for variations.
|
| I think I got it to even fill the title given a picture,
| something like "Article picture caption: Man holding an
| apple. Article title: ...". Might experiment more with that
| in the future.
| sillysaurusx wrote:
| How do you prompt GPT-3 to come up with the titles? That's
| an interesting problem.
| busyant wrote:
| Well, then I'm impressed with GPT-3's ability to generate
| those titles!
|
| The combination of photo/title feels like they come from
| the more absurd articles published by theonion.
|
| If we aren't living in a simulation, it's just a matter of
| time...
| lagrange77 wrote:
| http://dailywrong.com/new-course-teaches-guinea-pigs-househo...
|
| lol
| walrus01 wrote:
| The results with things that are artworks or more general
| concepts are fascinating, but there is for sure something
| creepy going on with "photorealistic" human eyes and faces...
|
| If you want to see some really creepy AI generated human
| "photo" faces, take a look at Bots of New York:
|
| https://www.facebook.com/botsofnewyork
| dntrkv wrote:
| Spam advertising is about to reach whole new levels of weird.
| ttyyzz wrote:
| NGL this shit is pretty cursed and I like it.
| benbristow wrote:
| From the server IP looks like you're on some managed WordPress
| hosting that only offers free SSL on the more 'premium'
| packages.
|
| Easiest way for free SSL would be to just throw the domain on
| CloudFlare :)
| pieter_mj wrote:
| Very funny! The "Scientists Warn New Faster Toothbrush May
| Cause Insanity"-story is not fake though, I've experienced it
| ;)
| stuaxo wrote:
| This is fantastic, the fake news the world needs.
| aantix wrote:
| Feels like the headlines could be generated similar to the
| style of "They Fight Crime!"
|
| "He's a hate-fuelled neurotic farmboy searching for his wife's
| true killer. She's a tortured insomniac snake charmer from a
| family of eight older brothers. They fight crime!"
|
| https://theyfightcrime.org/
|
| Here's an implementation in Perl.
|
| http://paulm.com/toys/fight_crime.pl.txt
| edm0nd wrote:
| lol that site is great
|
| >He's an unconventional gay paranormal investigator moving
| from town to town, helping folk in trouble. She's a violent
| motormouth wrestler from the wrong side of the tracks. They
| fight crime!
|
| >He's a Nobel prize-winning sweet-toothed rock star who
| believes he can never love again. She's a strong-willed
| communist widow with a knack for trouble. They fight crime!
|
| >He's an obese white trash barbarian with a secret. She's a
| virginal thirtysomething traffic cop with the power to bend
| men's minds. They fight crime!
| aasasd wrote:
| http://dailywrong.com/wp-content/uploads/2022/07/DALL%C2%B7E...
|
| Hot dang. Some Reddit subs can be auto-generated now.
| tildef wrote:
| Actually got a chuckle out of the duck one
| (http://dailywrong.com/man-finally-comfortable-just-
| holding-a...). Thanks! I hope you keep generating them. Kind
| of wish there weren't a newsletter nag, but on the other hand
| it adds to the realism. Could be worthwhile to generate the
| text of the nag with gpt too; call it a kind of lampshading.
| aantix wrote:
| Parenting > "Gillette Releases a New Razor for Babies"
| bemmu wrote:
| I loved how it just consistently decided that if babies have
| facial hair, it's always white fluff.
| lancesells wrote:
| I think it's because it's using images of babies with soap
| on their face to learn. Still funny though!
| uxamanda wrote:
| The part where you have to confirm you are not a robot to
| subscribe to the mailing list is the best part of this, my new
| favorite website.
| drusepth wrote:
| Haha, I was in a very similar boat when I built
| https://novelgens.com -- I was also supposed to be making a
| video game, but got a bit sidetracked with VQGAN+CLIP and other
| text/image generation models.
|
| Now I'm using that content _in_ the video game. I wonder if you
| could use these articles as some fake news in your game, too.
| :)
| astroalex wrote:
| This is amazing! Honestly one of the first uses of GPT3/DALL E
| that has held my attention for longer than a few seconds.
| mark_l_watson wrote:
| I tried DALLE once and liked the generated images. Not really my
| thing, but so cool.
|
| What I do use is OpenAI's GPT-3 APIs, I am a paying customer.
| Great tool!
| lagrange77 wrote:
| Has anyone else had problems with the 'Generate Variations'
| function lately? I first tried it out 3 days ago, and it has
| said 'Something went wrong. Please try again later, or contact
| support@openai.com if this is an ongoing problem.' every time
| since then.
| Plough_Jogger wrote:
| Is this referring to the first version of the model, or DALL-E 2?
| nharada wrote:
| Something I haven't seen anyone talking about with these huge
| models: how do future models get trained when more content online
| is model generated to start with? Presumably you don't wanna
| train a model on autogenerated images or text, but you can't
| necessarily know which is which.
| cpach wrote:
| Makes me think of Ouroboros
|
| https://en.m.wikipedia.org/wiki/Ouroboros
| nharada wrote:
| Reminds me of https://en.wikipedia.org/wiki/Low-
| background_steel
| espadrine wrote:
| In this situation, the low-background steel is the MS-COCO
| dataset. The Frechet inception distance is computed by
| comparing the statistics of the high-level vector outputs
| you get from passing MS-COCO images through Google's
| InceptionV3 classifier against those from passing DALL-E
| images (or its competitors') through it.
|
| For now at least, there is a detectable difference in
| variety.
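|
| For reference, the standard FID definition (nothing specific
| to OpenAI's setup): with mu_r, Sigma_r the mean and covariance
| of the InceptionV3 activations for real (MS-COCO) images and
| mu_g, Sigma_g the same for generated images,
|
|     FID = \| \mu_r - \mu_g \|^2
|           + \mathrm{Tr}\!\left( \Sigma_r + \Sigma_g
|             - 2 (\Sigma_r \Sigma_g)^{1/2} \right)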
| zitterbewegung wrote:
| This should be a step in cleaning your data to begin with. If
| you don't know the providence of your data then you shouldn't
| be even training with it.
|
| Getting humans to refine your data is the best solution right
| now, and many companies and researchers go with this approach.
| jazzyjackson wrote:
| s/ide/ena
| Voloskaya wrote:
| > Getting humans to refine your data is the best solution
| right now
|
| Source ?
|
| All those big models are trained with data for which the
| source is not known or vetted. The amount of data needed is
| not human-refinable.
|
| For example for language models we train mostly on subsets of
| CommonCrawl + other things. CommonCrawl data is "cleaned" by
| filtering out known bad sources and with some heuristics such
| as ratio of text to other content, length of sentences etc.
|
| The final result is a huge pile of data, not too dirty but not
| clean, that comes from millions of sources that no human has
| vetted and that no one on the team using the data knows
| about.
|
| The same applies to large image datasets, e.g. LAION-400M,
| which also comes from CommonCrawl and is not curated.
| nharada wrote:
| But how would you know? A random string of text, or an image
| with the watermark removed, is going to be very hard to
| classify as generated versus human-made.
| FrenchDevRemote wrote:
| You can't use humans to manually refine a dataset on the
| scale of GPT-3 or DALL-E.
|
| CLIP was trained on 400,000,000 images; GPT-3 on roughly 180B
| tokens, which at ~1.5 tokens per word is about 120,000,000,000
| words.
| pshc wrote:
| At least cleaning it up is an embarrassingly parallel
| problem, so if you had the resources to throw incentives at
| millions of casual gamers, you might make a nice dent in
| CLIP.
| zanderwohl wrote:
| Alternatively, making a captcha where half the data is
| unlabeled, and half is labeled, forcing users to
| categorize data for you as they log into accounts.
| Jleagle wrote:
| The images I have created all have a watermark. That is at
| least one way to filter out most images generated by the same
| AI.
| goolulusaurs wrote:
| It's a cybernetic feedback system. DALL-E is used to create new
| images, the images that people find most interesting and
| noteworthy get shared online, and reincorporated into the
| training data, but now filtered through human desire.
| [deleted]
| can16358p wrote:
| I think with the terms requiring users to explicitly disclose
| which images/parts were generated, those could be filtered out
| to prevent a feedback loop of "generated in / generated out"
| images. I'm sure there will be some illegal or against-the-terms
| use cases there, but the majority should represent fair use.
| mikeyouse wrote:
| This precise thing is causing a funny problem in specialty
| areas. People are using e.g. Google Lens to identify plants,
| birds and insects, which sometimes returns wrong answers e.g.
| say it sees a picture of a Summer Tanager and calls it a
| Cardinal. If the people then post "Saw this Cardinal" and the
| model picks up that picture/post and incorporates it into its
| training set, it's just reinforcing the wrong identification..
| bobbylarrybobby wrote:
| https://xkcd.com/978/
| Pxtl wrote:
| Then that's a cardinal now.
| scarmig wrote:
| That's not really a new problem, though. At one point someone
| got some bad training data about an old Incan town, the
| misidentification spread, and nowadays we train new human
| models to call it Machu Picchu.
| vanillaicesquad wrote:
| The difference between the name of an old Incan town and a
| modern time plant identification mistake is that maybe the
| plant is poisonous.
|
| Made with gpt3
| blfr wrote:
| Training on auto-generated images collected off the Internet is
| gonna be fine for a while, since the images that surface will
| still be curated (i.e. selected as good/interesting/valuable)
| mostly by humans.
| jmartrican wrote:
| I wonder if human artists can demand that their work not be
| used for modelling. So as the robots are stuck using older
| styles for their creations, the humans will keep creating new
| styles of art.
| naillo wrote:
| One interesting comment about this is that some models actually
| benefit from being fed their own output. AlphaFold, for
| instance, was fed its own 'high likelihood' outputs (as Demis
| Hassabis described in his Lex Fridman interview).
| gwern wrote:
| My discussion of this issue (which actually comes up in like
| every DALL-E 2 discussion on HN):
| https://www.lesswrong.com/posts/uKp6tBFStnsvrot5t/what-dall-...
| ccmcarey wrote:
| That's about 10x as expensive as it should be
| karencarits wrote:
| $15 for 115 iterations/460 images?
| ccmcarey wrote:
| Yep. During the alpha it was (50*6) 300 images per day, by
| their pricing model that would be $300 a month now
| WalterSear wrote:
| $15 for 115 _attempts_ to get usable images.
| kache_ wrote:
| Give it some time. Other organizations will race to the bottom.
|
| They might even provide image generation at a loss to drive
| people to their platforms.
| gverri wrote:
| $0.13/prompt can only be useful for artists/end users. Anyone
| thinking about using this at scale would need a 20-30x
| reduction in price. But there's no API available yet, so I
| think that will change with time. Maybe they will add
| different tiers based on volume.
| jeanlucas wrote:
| Thing is, as a current user: you rarely get it right on the
| first prompt; you might iterate 10 times until you get what
| you want.
|
| I spent several tries yesterday to get this angle "from the
| ground up":
| https://labs.openai.com/s/mz8LiyvkI8KwD2luJ6MrS23m
| dntrkv wrote:
| So $1.30 for getting a result that would have cost how much
| to pay someone to make? Not to mention the 59 other
| variations you would have.
| raisedbyninjas wrote:
| When there is a competitor, they can adjust pricing. For now,
| it's virtually magic.
| bradleybuda wrote:
| You should ship a competitor! Sounds like you found a great
| market opportunity.
| scifibestfi wrote:
| What are you basing that on? What should the price be? The
| training and generation are probably expensive.
| thorum wrote:
| Until you consider the level of demand for this product, which
| is surely higher than OpenAI can scale to with the number of
| GPUs they have. If they price it lower they'll be overwhelmed.
| Workaccount2 wrote:
| Welcome to SaaS.
| isoprophlex wrote:
| Wait until someone trains a model like this, for porn.
|
| There seems to be a post-DALL-E obscenity detector on
| OpenAI's tool; so far I've found it to be entirely robust
| against deliberate typos designed to evade simple 'bad word
| lists'. Ask it for a "pruple violon" and you get purple
| violins... you get the idea.
|
| "Metastable" prompts that may or may not generate obscene
| results (content with nudity, guns, or violence, as I've
| found) sometimes show non-obscene generations and sometimes
| trigger a warning.
| jug wrote:
| I've thought about this, and in fact porn generation sounds
| like a good thing? It ensures that it's victimless. Of
| course, there is a problem with the generation of illegal
| (underage) porn, but other than that, I think it could be
| helpful for the world.
| jowday wrote:
| If I had to guess, I'd bet they have a supervised classifier
| trained to recognize bad content (violence, porn, etc.) that
| they use to filter the generated images before passing them
| to the user, on top of the bad word lists.
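|
| A rough sketch of what such a two-stage filter could look
| like (the word list, labels, classifier, and threshold here
| are hypothetical, not OpenAI's actual pipeline):
|
|   # Stage 1: a simple prompt blocklist.
|   BAD_WORDS = {"gore", "nude"}
|
|   def prompt_allowed(prompt):
|       return not any(w in BAD_WORDS for w in prompt.lower().split())
|
|   # Stage 2: run every generated image through a supervised
|   # classifier and drop anything scored as unsafe before the user
|   # sees it. `classify` is a stand-in returning {label: probability}.
|   UNSAFE_LABELS = {"violence", "nsfw"}
|
|   def filter_generations(images, classify, threshold=0.8):
|       safe = []
|       for img in images:
|           scores = classify(img)
|           if all(scores.get(label, 0.0) < threshold
|                  for label in UNSAFE_LABELS):
|               safe.append(img)
|       return safe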
| cmarschner wrote:
| Most likely they just take the one from Bing. Or, if they
| trained a better one, it will go the other way sooner or
| later.
| isoprophlex wrote:
| Exactly!
| zionic wrote:
| Honestly, that part pisses me off. Who cares if their AI
| "makes porn" or something "offensive"?
| fishtoaster wrote:
| I suspect it's more a business restriction than a moral one.
| If OpenAI allows people to make porn with these tools, people
| will make a _ton_ of it. OpenAI will become known as "the
| company that makes the porn-generating AIs," not "the company
| that keeps pushing the boundaries of AI." Being known as the
| porn-ai company is bad for business, so they restrict it.
| alana314 wrote:
| I tried the term "cockeyed" and got a TOS violation notice
| [deleted]
| justinzollars wrote:
| I would love access to this in order to design Silver Rounds.
| If you work at OpenAI, please reach out!
| dharbin wrote:
| I find it amusing that they suggest DALL-E, which typically
| generates lovecraftian nightmare images, for making children's
| story illustrations.
| driverdan wrote:
| How so? If you give it prompts for children's story
| illustrations with a detailed description, it will not give
| you "lovecraftian nightmare images".
| throwaway0x7E6 wrote:
| yeah. dalle is "so bad it's good".
|
| it's great for post-post-ironic memes, but I don't see it being
| useful for anything else
| arkitaip wrote:
| No wireless. Less space than a nomad. Lame.
| andybak wrote:
| Have you tried any of the "human or Dall-E" tests?
|
| How did you score?
|
| I only scored as well as I did because I knew the kind of
| stylistic choices to look out for. In terms of "quality" I
| really don't understand how you've reached this conclusion.
| throwaway0x7E6 wrote:
| I've only seen this thing
| https://huggingface.co/spaces/dalle-mini/dalle-mini
|
| is it not dall-e?
| _flux wrote:
| It is not and that's why OpenAI asked them to change the
| name, which they did.
| throwaway0x7E6 wrote:
| oh. I retract my OP then
| andybak wrote:
| It's a reimplementation.
|
| It's a long way off in terms of quality (at the moment,
| anyway).
| astrange wrote:
| It's a model inspired by DALL-E 1, but it's not even very
| close to that.
|
| But it does seem to know a lot of things the real DALL-E 2
| doesn't.
| Nevin1901 wrote:
| I don't like how they're charging money for DALL-E, yet they
| don't have an API available.
___________________________________________________________________
(page generated 2022-07-20 23:00 UTC)