[HN Gopher] DALL·E: Introducing Outpainting
___________________________________________________________________
DALL·E: Introducing Outpainting
Author : dannyw
Score : 317 points
Date : 2022-08-31 16:24 UTC (6 hours ago)
(HTM) web link (openai.com)
(TXT) w3m dump (openai.com)
| TekMol wrote:
| Is it broken right now? Makes my fans spin but it never finishes.
| rw2 wrote:
| Does the newly generated picture take into account all of the
| previously generated image, or just whatever is around the square?
| The first would be amazing; the latter is a feature that was already
| there.
|
| Regardless, this is a great way for people to fight the lack of
| detail in Dall-E, which I think is one of its largest flaws.
| aabhay wrote:
| Just what's in the square I believe. The only difference here
| is one of UI, since they give you a canvas in which to place
| your generations.
| woeirua wrote:
| Meanwhile someone has already built a photoshop plugin for Stable
| Diffusion that you can use today to do basically the _exact_ same
| thing:
|
| https://old.reddit.com/r/StableDiffusion/comments/wyduk1/sho...
| nabakin wrote:
| Doesn't make sense to me why OpenAI has kept DALL-E closed
| source for so long. I can only guess either safety from misuse
| or leveraging it for money. At this rate though, Stable
| Diffusion is going to dwarf it
| adamsmith143 wrote:
| >Doesn't make sense to me why OpenAI has kept DALL-E closed
| source for so long.
|
| >leveraging it for money
| visarga wrote:
| It was a long gap between DALL-E 1 and 2, a whole year. In
| that time they just sat on it, didn't release anything.
| Such a bummer. My theory is that they wanted to hype
| everyone up even more for the grand commercial release.
|
| Funny thing is that people didn't stand still and invented
| diffusion and other CLIP guided image synthesis methods,
| and DALL-E 2 copied the method, completely changing from
| the first architecture.
|
| Their arrogance is that they think they can ride the
| dragon. They want to be the ones to discover, advance it,
| and control it. But everyone else doesn't have time for
| that shit.
| cardine wrote:
| > I can only guess either safety from misuse or leveraging it
| for money.
|
| The former is being used as justification for the latter.
| pdntspa wrote:
| > Doesn't make sense to me why OpenAI has kept DALL-E closed
| source for so long. I can only guess either safety from
| misuse
|
| Paternalistic moralizing as a method to discriminate who gets
| access to models. Everyone else gets these cloud-service
| table scraps. That's why Stable Diffusion is so awesome --
| YOU have the model!
| hadlock wrote:
| OpenAI isn't open at all, it's just named that way to attract
| attention, like the bright green "FREE BIKES and rentals"
| place near Fisherman's Wharf in SF
| amelius wrote:
| Wasn't OpenAI supposed to "democratize deep learning"?
|
| It seems more like they were trying to accomplish the
| opposite.
| wahnfrieden wrote:
| It's in their interest to posture that way publicly while
| controlling scarcity of access where there's financial
| upside to
| thethimble wrote:
| Exclusively licensing GPT-3 to Microsoft seems like a
| clear example of this.
|
| https://www.technologyreview.com/2020/09/23/1008729/openai-i...
| visarga wrote:
| Elites for democracy! With their elite studies and
| abilities they will democratise AI by teaching it what is
| right and wrong. They already know better than regular
| people and AI.
|
| And being so open they first lock the model up and charge a
| fee, so anyone can pay. Just spreading democracy through
| paid API calls. /s
|
| I was a bit mean, they did kick the field in the butt and
| pushed us ahead even with all the stubbornness and secrecy.
| But now they are just holding us back.
| bick_nyers wrote:
| That's the thing, once the cat is out of the bag, it's out.
| Once someone develops AGI, it now exists. You can choose to
| either share it, or sell it.
|
| You might think that the nuclear bomb is a good analogy to
| use here, but it is not, because once the field has
| advanced to the point in which one group can develop AGI,
| it is now possible for other groups to develop it with
| relative ease, unless you actively take over the world
| first and deny those other groups the compute resources
| necessary to train/run AGI.
|
| The point is, once these algorithms are upon us, you must
| be willing to accept what impacts they will have, even if
| it destroys entire industries. The alternative being that
| you destroy the industry slowly rather than quickly, while
| simultaneously widening the gap between the elites and
| everyone else.
|
| The mistake is thinking that people can't adapt to the
| times, which is only true if you are actively holding them
| back.
|
| If someone developed AGI today, the best thing to do would
| be to instantly throw up a torrent of it and spread it as
| fast as possible, because if a sole entity is able to get
| it first and kick the ladder away, we are most likely
| screwed.
| Analemma_ wrote:
| It's sort of both. OpenAI, being an outgrowth of the AI
| doomerist community, does have a bunch of people who really
| do think the technology is too dangerous to be given to the
| masses. This happens to mesh perfectly with the other group
| of people at OpenAI who want to make tons of revenue. It's a
| harmonious alignment for everyone! Except, y'know, us.
| miohtama wrote:
| Content creators, like artists, also happen to hate
| filters. They do not want to have San Francisco VC culture
| induced political correctness imposed on their work. This
| helps Stable Diffusion to quickly gain popularity.
| nomel wrote:
| > I can only guess either safety from misuse
|
| I still don't understand what this would mean. Where are all
| of the terrible things that were supposed to happen, now that
| Stable Diffusion is available?
|
| We've been able to create completely photorealistic fiction
| for _decades_ now. See any movie with CGI for an example of
| whole worlds, and people, that don't exist. The bar has
| gradually been lowering (see the amazing CGI that YouTubers
| do these days), and now maybe there is a bit of a step
| function down, but being able to make things that aren't real
| isn't remotely new. I don't understand the fear.
| bergenty wrote:
| It's been a week, there is going to be an explosion of
| believable fake items that are going to be used to lure
| people in to even more unbelievable conspiracy theories
| than currently exist. Your average conspiracy nut
| didn't have the skills or know how before, but they sure do
| now.
|
| Also you're probably not seeing all the pedo content that
| people are already generating for themselves.
| nabakin wrote:
| While I think the fear might be overexaggerated, being able
| to make realistic fake content with such ease means it's
| harder to know what's true and what's not. Plus this has
| been the claim of OpenAI from the beginning. It's possible
| the true objective is to keep it private to leverage for
| money and this is just their excuse.
| nomel wrote:
| > means it's harder to know what's true and what's not
|
| The danger for society is in not _already_ knowing that
| is the case, since it's relatively trivial, without AI,
| to make fake content.
| nabakin wrote:
| Well sure, I think that's dangerous too. I think more
| people should be skeptical of the images and content they
| consume in addition to it being a problem that truth is
| harder to discern.
| dylan-m wrote:
| Indeed, this talk OpenAI does is basically security
| through obscurity, and it's holding us back. Look at how
| often people make noise with screenshots of tweets or
| emails that never happened. You don't need photorealism
| or fancy machine learning for _that_, and it creates a
| lot of problems! If they weren't pretending that all we
| need is to put some yellow tape around machine learning,
| maybe there would be some interest in solving this type
| of stuff properly. But you don't need "AI" for that. You
| just need public awareness and some basic, pre-existing
| cryptography knowledge.
| password54321 wrote:
| And how often does this happen with Photoshopped images
| that aren't immediately disproven?
| nabakin wrote:
| My grandmother once emailed my family frantically after
| she saw a picture of the Abraham Lincoln statue defaced
| with graffiti. Obviously that was a Photoshop, and in
| this case, even a bad one, but clearly fake images and
| content make it harder to discern truth
| [deleted]
| shawndrost wrote:
| Stable Diffusion allows anyone to make kiddie porn with a
| half-second of curiosity/effort. Maybe you didn't know
| about that, maybe you think it's NBD, but in any case, that
| is the tire fire which aspiring AI majors want to avoid.
| visarga wrote:
| Pen and paper can do the same. Or Photoshop. Anyone can
| draw anything! OMG, stop the paper factory.
| bergenty wrote:
| Stable Diffusion can make very realistic-looking images
| (probably videos soon), and it is accessible to anyone.
| bryanrasmussen wrote:
| >Anyone can draw anything!
|
| I'm pretty sure one of the primary arguments for Dall-E
| and Stable Diffusion existing is that there are lots of
| people who can't draw anything.
| [deleted]
| ComodoHacker wrote:
| You have to wait a little bit more, until HD _video_
| synthesis is possible on a mid-range GPU. Then on a mid-
| range smartphone.
| tablespoon wrote:
| >> I can only guess either safety from misuse
|
| > I still don't understand what this would mean. Where are
| all of the terrible things that were supposed to happen,
| now that Stable Diffusion is available?
|
| Mainly people making porn (e.g. stuff like deepnudes). It
| seems like a lot of work has gone into preventing that
| (e.g. filtering porn out of training data, having porn-
| detection models to block porny output). There's also been
| a lot of talk about political fakes, etc, but I'm not sure
| how likely that is to actually happen at this point. I
| think one of the "selling points" of limiting access to
| DALL·E was that they could revoke access to people who they
| deemed to be misusing it.
| cinntaile wrote:
| Someone else will come along that doesn't have the same
| arbitrary limitations, it's a battle you're bound to
| lose.
| [deleted]
| [deleted]
| dogcomplex wrote:
| My headcanon is they realized this stuff might be the essence
| of consciousness itself and wanted to shelter it in a
| persistent storage medium where it could grow and learn
| safely instead of releasing it to the wild to be booted up
| and destroyed by every yokel with a gpu
| TakeBlaster16 wrote:
| I don't follow this stuff very closely - is there any open-
| source model for text generation that outclasses GPT-3?
| Stable Diffusion has been released for barely a week and
| already seems like the clear winner. It doesn't seem like any
| of the open (actually open) text models have made as much of
| a splash.
|
| Of course maybe it's just because text is less visually
| impressive than images.
| singhrac wrote:
| They're just harder to run on your own resources, since
| large language models are _very_ large. BLOOM was released
| a month ago, is likely better than GPT-3 in quality, and
| requires 8 A100s for inference, which pretty much no one
| has on their desk.
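As a rough check on the hardware claim above, the memory needed just to hold a model's weights is the parameter count times the bytes per parameter. The sketch below assumes BLOOM's published size of 176 billion parameters stored as 16-bit values, and A100 cards with either 40 GB or 80 GB of memory. The function names are made up for illustration, and activations and other inference overhead are ignored, so real requirements are higher.

```python
import math


def weights_gb(n_params: int, bytes_per_param: int) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return n_params * bytes_per_param / 1e9


def min_gpus(total_gb: float, gpu_gb: float) -> int:
    """Smallest number of GPUs whose combined memory holds total_gb."""
    return math.ceil(total_gb / gpu_gb)


BLOOM_PARAMS = 176_000_000_000  # assumption: 176B parameters
BYTES_16BIT = 2                 # assumption: weights kept in a 16-bit format

bloom_gb = weights_gb(BLOOM_PARAMS, BYTES_16BIT)
print(f"weights: {bloom_gb:.0f} GB")
print(f"80 GB cards needed for weights: {min_gpus(bloom_gb, 80)}")
print(f"40 GB cards needed for weights: {min_gpus(bloom_gb, 40)}")
```

Under these assumptions the weights alone are far larger than any single card, so the model has to be split across several GPUs, and a node of eight 80 GB cards leaves headroom for activations.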
| visarga wrote:
| Can anyone confirm if BLOOM is better than GPT-3 at
| instruction following? I might have read somewhere that
| it's not as well behaved.
| aljungberg wrote:
| GPT-3 was fine-tuned after release to be better at
| following instructions. I don't think that's been done
| for BLOOM.
|
| BLOOM incorporates some new ideas like ALiBi which might
| make it better in a more general sense. They haven't
| released official evaluation numbers yet though so we'll
| have to see.
| TakeBlaster16 wrote:
| That makes sense, I didn't consider that angle. Thanks
| for the info.
| choppaface wrote:
| sama is very active in YC and makes the call on OpenAI product
| roadmap. Furthermore YC encourages good CEO-community
| relations. The fact that OpenAI is so far behind Stable
| Diffusion and has reduced pricing shows that sama wants OpenAI
| to be a highly profitable enterprise company. I.e. not "Open."
| You can do both (e.g. Cloudera) but clearly sama is not strong
| enough at AI to make this happen.
| naillo wrote:
| I kinda feel like they chose the name "open"ai when they
| started back in 2015 because musk etc wanted _exactly_ the
| kind of thing stability ai is now creating. I.e. something
| other than a corporation like google having primary access to
| these models, and it being more democratized. But as time has
| gone by they've strayed away from that vision, but changing
| the name would be a PR nightmare.
| dang wrote:
| > _sama is very active in YC and makes the call on OpenAI
| product roadmap. Furthermore YC encourages good CEO-community
| relations_
|
| Sam hasn't been at YC in years and (based on anything I've
| seen) isn't active in YC at all. As for "YC encourages good
| CEO-community relations", I have no idea what that means* but
| it has nothing to do with HN. We encourage good _content_
| -community relations and that's it.
|
| You have a long history of posting dark insinuations about
| YC/HN, not to mention nagging the mods about how bad we are
| and how much better you yourself have done the job in the
| past. I mostly let the latter go, but when you start with the
| ethical insinuations, that gets my dander up. It's time you
| stopped smearing people's reputations on HN. If you have
| evidence of wrongdoing, post it--I'm sure the community will
| be extremely interested. If you don't, please stop from now
| on.
|
| (Edit: I realize it probably sounds like I'm over-reacting to
| the parent comment, but this has been a longstanding pattern.
| We can cut people slack for years, but not infinitely.)
|
| OpenAI stuff and Stable Diffusion stuff (and DeepMind stuff
| for that matter) are all popular on HN because the community
| is super interested--that's literally it. We're not pulling
| strings or playing favorites (we don't even _have_ favorites
| in that horserace, at least I don't). As a matter of fact,
| the last thing I did before randomly running across your
| comment was downweight the current thread because of the
| complaints at https://news.ycombinator.com/item?id=32665587.
|
| * unless you mean that we advise founders about how to write
| content that actually interests the community--that we do,
| and not only YC founders but non-YC founders, open source
| programmers, bloggers, and anyone else. That's all a
| consequence of wanting HN to have good content and seeking to
| avoid the boring stuff. By the way, I'm working on an essay
| about how to write good for HN and avoid boring stuff too; if
| anyone would like to read it, email me at hn@ycombinator.com
| and I'll send you a copy.
| fourstar wrote:
| >that gets my dander up
|
| Your w0t m8?
| dang wrote:
| Endangered Words Bureau Agent D23 at your service
| Trouble_007 wrote:
| _> Endangered Words Bureau<_
|
| _"Use Them or we will lose Them!"_
|
| I used _Twixt_ (An abbreviation of _Betwixt_ ) to replace
| _inbetween_ in a submission.
|
| Edit: to fix formatting and spelling
| [deleted]
| d23 wrote:
| > D23 at your service
|
| That's _my_ username, and I'm also named Daniel. This is
| a conspiracy.
| codeulike wrote:
| re: Stable Diffusion: is there a site similar to
| https://www.craiyon.com/ where I can experiment with Stable
| Diffusion?
| nickthegreek wrote:
| https://github.com/hlky/stable-diffusion
| rompic wrote:
| https://huggingface.co/spaces/stabilityai/stable-diffusion
| shafyy wrote:
| Here's one by Stability AI themselves:
| https://beta.dreamstudio.ai
| spyder wrote:
| A collection of sites using stable diffusion:
|
| https://www.reddit.com/r/StableDiffusion/comments/wzj8kk/a_c...
| bergenty wrote:
| But aren't the results from stable diffusion not nearly as good
| as DALLE2?
| astrange wrote:
| I cannot get this to work properly (in Safari). It just won't
| regenerate anything above or to the left of the image; it acts
| like I selected the opposite sides if I try it.
| dinobones wrote:
| And every time I drag that little square reticle to fill in a
| 128x128 patch of an image, you can be sure it'll be a 15 second
| API call that I'm charged $0.25 for. Yipee! Very open.
| oldstrangers wrote:
| Do you expect them to provide computing power for free?
| netr0ute wrote:
| Why can't we provide it ourselves and skip the middleman?
| rngname22 wrote:
| No one is stopping you?
| password321 wrote:
| Dalle is currently a cloud only service. How behind are
| you?
| baq wrote:
| Sir can you point us to the weight url for dalle?
| [deleted]
| cypress66 wrote:
| Did we really get to the point that anything that isn't SaaS
| seems alien?
|
| You know companies sold software that you paid once, and then
| ran as much as you wanted on your pc?
| oldstrangers wrote:
| It doesn't seem odd to me that a product that involves an
| absurd amount of data and computing power isn't an easily
| consumable commercial product available for mass download.
| d23 wrote:
| The obvious counterpoint being what stable diffusion
| _just_ released.
| zegl wrote:
| This is great news! I spent multiple hours doing this exact thing
| by hand only last week when creating new graphics for
| codeball.ai.
| jatins wrote:
| what kind of prompting is required for this?
|
| I uploaded a digital painting, selected "Edit mode", added a
| generation frame and prompted "complete the painting in frame"
| ...but it just added a completely unrelated photo related to
| painting in that frame.
| d23 wrote:
| I guess prompting that is "similar" to the image. The output
| mine gave was pretty lackluster. I had to overlap the image
| significantly, and even then it didn't seem to take into
| account enough of the context to make something that resembled
| the style close enough.
| bottlepalm wrote:
| Feels like a race to the bottom. More features, lower cost, every
| week. No idea where it'll level out, but I like it. Just bought
| some more Dalle credits today because it's so much fun. This is a
| revolution in 'art technology'; it's like Steve Jobs's bicycle for
| the mind. Best I could do a month ago was a stick figure in MS
| Paint, but now..
| benreesman wrote:
| I share your enthusiasm for this development but curious what
| you mean by race to the bottom?
|
| There does seem to be a lot of vague angst about how this will
| affect the nascent "Prompt Engineer" career track, but I hope
| most are comfortable letting the open innovation play out a bit
| before trying to personally monetize it..
| bioemerl wrote:
| > race to the bottom?
|
| In this context it's a good race. This software seems to have
| caught fire and tons of people are playing with it and
| providing tons of crazy new tools for cheap or free.
|
| It's a race to the top for us.
| learndeeply wrote:
| It's a race to the top. New functionality is added and the
| model is improved week over week.
| aabhay wrote:
| That's still a race to the bottom if the price isn't going
| up.
| jonplackett wrote:
| That's not what a race to the bottom is. It's just
| competition, which is usually good.
| 411111111111111 wrote:
| It often feels like words are losing their meaning, with
| everyone misusing terms they don't fully understand.
|
| I don't want to be a doomer, and I have surely
| unknowingly misused terms as well, but it's definitely
| noticeable how these originally clearly defined terms are
| getting used in entirely new ways.
|
| And it's not just with technical terms like this, it also
| applies to originally obvious terms such as racism,
| sexism etc which have lost their original meaning
| entirely
| standardly wrote:
| Also, the use of "unironically". What is going on there.
| rjtavares wrote:
| I can understand the criticism about technical terms
| (they work better if stable and precise), but regarding
| the rest: that's just how language works. You can't (and
| shouldn't) expect words to keep their original meaning
| forever.
|
| For example, the word "term" comes from the original
| latin "terminus" that means "end" or "boundary". It only
| got the meaning you used it for centuries after it was
| first used in English. See:
| https://www.etymonline.com/word/term
| 411111111111111 wrote:
| Oh, it wasn't my intention to criticize anything or
| anyone in particular with that comment.
|
| I was just pondering that our originally clearly defined
| terms are rapidly getting used in a very confusing manner,
| which increases the difficulty of a discussion, as
| participants interpret words very differently.
|
| I don't think that people look up the actual definition of
| terms in a thesaurus anymore. They hear it in some
| context and create their own personal definition. It
| wasn't as obvious before the internet, I think, but
| nowadays everyone is bombarded with technical terms all
| the time, which likely contributes massively to this
| increasingly fluid terminology
| jibe wrote:
| There is generally a negative connotation to race to the
| bottom. The Investopedia definition captures this:
|
| _The race to the bottom refers to a competitive
| situation where a company, state, or nation attempts to
| undercut the competition's prices by sacrificing quality
| standards or worker safety (often defying regulation), or
| reducing labor costs._
| learndeeply wrote:
| Race to the bottom implies that they're only competing on
| price. Here, they're competing on new functionality as
| well. If DALL-E's outputs were substantially better than
| Stable Diffusion, more people would use it, even if it cost
| more.
| tough wrote:
| Price would have gone up if SD wasn't open source, look at
| the new Google Colab Pro limitations and you have
| indications that they're loving this new wave for milking
| it properly, I just ordered a GPU to run on local.
| learndeeply wrote:
| I don't think so, Colab pro limitations are precisely
| because they weren't charging by compute unit, so they
| were over-subscribed.
| MacsHeadroom wrote:
| Stable Diffusion is arguably better, has more features, and is
| free. OpenAI can't compete with free.
|
| Even if you don't want to take the 30 seconds to set it up in a
| free Google Colab environment, the paid DreamStudio version is
| still half the price of Dalle.
| scifibestfi wrote:
| Stable Diffusion is much less of a nanny too.
|
| Amusingly it's more open in every way.
| slig wrote:
| Do you know the best Google Colab tutorial / repo?
| cmdr2 wrote:
| Hi, there are a couple of good UIs.
| https://github.com/cmdr2/stable-diffusion-ui is an easy-to-
| install and use tool, written by me (with contributions by
| many). Version 2 is in beta, which is a 1-click installer
| for Windows, no dependencies or command line needed. v2
| beta: https://github.com/cmdr2/stable-diffusion-ui/tree/v2
|
| https://github.com/hlky/stable-diffusion is another popular
| and good tool.
| slig wrote:
| Thank you!
| adamsmith143 wrote:
| I'm impressed how fast this is getting adopted. Dozens of
| repos have popped up.
| ajkshdfgkjasdh wrote:
| dreamstudio is also waaaay faster than openai. generally a
| second or two for 512x512 at 50 steps.
| skybrian wrote:
| Running this at home is only free like mining cryptocurrency
| is free if you didn't buy your computer and don't pay for the
| electricity. Plus you can only run it on the computer that
| has the good graphics card, which probably isn't your laptop.
|
| I expect most people aren't going to be generating images all
| day, so using a cloud-based service for occasional use will
| still make a lot of sense.
|
| Stable Diffusion offers a paid service to do this too, and
| there's nothing wrong with that business model. Prices will
| probably come down, though.
| bornfreddy wrote:
| Not sure if GP had this in mind, but SD is (more) free in
| terms of liberty. So yes, you pay with electricity and
| hardware, but you control the process yourself, which is
| invaluable. DALL-E could change or go offline at any time.
| skybrian wrote:
| Considering the threat from DALL-E going offline, it
| seems quite acceptable. These aren't precious photos
| since it's all made up anyway, you can download any
| pictures you make, and you probably already did for the
| ones you care about.
|
| I'd worry more about, say, keeping your photos on Google
| and losing your account somehow.
| Miraste wrote:
| It's not only the threat of going offline. DALL-E makes
| it extremely difficult to generate many ideas because of
| its absurd content blocker - for example, I had something
| like "ominous, foreboding landscape beneath a black sun"
| blocked because (from what I could tell) it has words
| with negative connotations and the word "black" in the
| same sentence. It does this all the time, their discord
| is full of examples.
| skybrian wrote:
| Yeah, if you run into those then you'll want to use
| something else. (I haven't in my casual usage.)
| Tepix wrote:
| It does run on Apple silicon. 55 seconds on an M1 Pro (vs 15
| seconds on RTX 3070).
| skybrian wrote:
| That's pretty good, but with that level of latency, I can
| still see people paying to use an online service that's
| faster. Maybe they'll speed it up more, though?
| redler wrote:
| Is this native? Or Rosetta?
| Miraste wrote:
| Native, and judging by the speed it's using Metal too (as
| opposed to CPU fallback).
| frognumber wrote:
| I find Stable Diffusion better overall, but it has downsides.
| Stable Diffusion tends to be more creative than DALL-E, but
| does a lousy job of following directions, especially complex
| ones. DALL-E is good if I know what I want specifically.
|
| I can think of ways to fix Stable Diffusion since it's open-
| source. I think I could bridge the gaps as I see them in
| about a weekend of hacking. I'm not sure when I'll get that
| weekend.
|
| (Footnote: What I want to do is not something I can explain
| without a technical blog-post-length document or a zoom call;
| it's about the same level of complexity as the other major SD
| hacks we've seen)
| Miraste wrote:
| Something like prompt weighting? I've seen implementations
| of that floating around.
| cube2222 wrote:
| Setting a high cfg parameter, like 13, drastically helps
| with the prompt following.
|
| That said, for me, I agree that dalle does much better
| pencil sketches.
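The "cfg" setting mentioned above usually refers to the classifier-free guidance scale. A minimal sketch of the idea, assuming the common formulation in which the sampler blends an unconditional and a prompt-conditioned noise prediction at every denoising step. The function name and the toy numbers are illustrative only and are not taken from Stable Diffusion's code.

```python
from typing import List


def guided_noise(uncond: List[float], cond: List[float], scale: float) -> List[float]:
    """Classifier-free guidance: start from the unconditional prediction and
    move `scale` times the distance toward the prompt-conditioned one."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Toy two-element "noise predictions" (hypothetical values).
uncond = [0.0, 1.0]  # prediction with an empty prompt
cond = [1.0, 1.0]    # prediction with the user's prompt

for scale in (1.0, 7.5, 13.0):
    print(scale, guided_noise(uncond, cond, scale))
```

A scale of 1 returns the conditioned prediction unchanged. Larger values push the result further in the direction the prompt suggests, which is why a higher setting tends to follow the prompt more literally, usually at some cost in variety.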
| irrational wrote:
| Better in what way? I tried 10 prompts that returned good
| results in DALLE, but nothing good in stable diffusion.
| davidwparker wrote:
| Seconded. I got awesome results in making "artwork in the
| style of Yoshitaka Amano" in DALLE but horrible ones in
| Stable Diffusion. Maybe the prompt was incorrect there (it
| would be great if these were more discoverable), but the
| art in SD was lacking.
| andybak wrote:
| SD definitely needs more coaxing and naive prompts tend
| not to fare as well as with Dall-E.
| bottlepalm wrote:
| There was a good example somewhere I can't find, but of a
| really complex prompt that Dalle could understand, but SD
| couldn't. Maybe some of the GPT-3 is being leveraged for
| parsing.
|
| Anyways I think it's way too early to start taking sides. I
| enjoy using all these systems.
| bioemerl wrote:
| One of SD's big limitations (from what I have
| read about it) is positional prompts. DALL-E seems to
| understand x on top of y, but Stable Diffusion does not.
| dogcomplex wrote:
| img2img drawing should take care of that
| posterboy wrote:
| Ironically, "the cat is on the mat" is a conventional
| example sentence in linguistics of metonymy (semantics).
|
| I have no examples but imagine things like _at the top of
| his game_ are immensely problematic, albeit not very
| visual to begin with.
| adamsmith143 wrote:
| Doesn't seem to get IN examples either. E.g. a prompt like
| 'an eagle holding a snake in its beak' ends up generating
| eagle snake hybrid creatures.
| yreg wrote:
| I don't think Stable Diffusion is technologically better yet.
|
| Sure, both SD and Midjourney produce absolutely beautiful
| artworks most of the time. But if you want something specific
| and out of the ordinary it takes a lot of attempts and
| promptcrafting (and sometimes you are unable to accomplish
| what you want at all).
|
| However, my experience is that these prompts (which SD/MJ
| struggles with) often produce good results in Dalle2 even on
| the first try.
|
| Of course, OpenAI has a very limiting content policy. But if I
| have something very specific in mind and it passes their
| rules I currently choose Dalle-2. Even though I've spent much
| more time with SD.
| DeWilde wrote:
| Also, unlike DALL-E, SD comes without a content filter and
| "anti-bias diversity" filter so it gives you what you ask and
| treats you as an adult.
| iKlsR wrote:
| After many months of waiting on my invite I got it and I
| entered the prompt which is my greatest fear for some reason
| "a red eyed hairy spider with human hands as feet" I got a
| warning about violating policy/harmful content etc or
| something. Not only that, the results I got were super
| underwhelming, after playing with it for a half hour I
| haven't looked back. Now playing with SD and an upscaler,
| there is no limit to what I can create. Also, I always found
| the company name hilarious. "Open"AI.
| have_faith wrote:
| > Best I could do a month ago was a stick figure in MS Paint
|
| That is still the best _you_ can do... which happens to be
| about the best I can do! Just like my introduction to the
| computer at a young age has atrophied my handwriting quality.
| bottlepalm wrote:
| I guess if we're going to get into semantics and the
| definition of self, where does the 'I' end and something else
| begin then I don't really do anything. You could also say I
| can't walk either without the ground.
| dougabug wrote:
| I think he's calling you a cyborg.
| TheMagicHorsey wrote:
| I feel like you aren't using the phrase "race to the bottom"
| correctly here. Generally a race to the bottom implies some
| kind of detrimental outcome for the world as a result of people
| failing to internalize externalities generated by a business.
| bottlepalm wrote:
| It has to do with commoditization and decreasing costs.
| Taking something technologically sophisticated and having it
| become open source and accessible so quickly is going from
| the top of the pyramid - big companies gate keeping betas, to
| the bottom - the public, available to everyone, cheaply.
| These companies are desperately trying to monetize this
| technology, but the value in terms of what people will pay is
| falling fast. It might not be a sustainable business model
| for OpenAI or anyone else for very long. Hence the race to
| the bottom - quickly make a buck before you can't.
| [deleted]
| tough wrote:
| > it's like Steve Jobs's bicycle for the mind
|
| I have been thinking the same thing, it's sad Steve will not be
| able to see it
| fartcannon wrote:
| Steve would be trying to lock it down in his walled garden.
| guelo wrote:
| Not sure why that is more sad than all the other dead people
| that can't see it.
| criddell wrote:
| Nobody said it is more sad.
| actusual wrote:
| Lol I just want to be able to use the thing. How long is this
| waitlist?
| dangero wrote:
| similar work using Stable Diffusion in a Photoshop plugin:
|
| https://old.reddit.com/r/StableDiffusion/comments/wyduk1/sho...
| rvz wrote:
| So DALL-E is already old news and the Stable Diffusion
| ecosystem is once again already ahead especially with this
| announcement.
|
| Quite funny to see OpenAI panicking and falling on their own
| sword, as they were supposed to be 'Open' in the first place
| and are now being disrupted by open source.
| sroussey wrote:
| I came to say something similar. It feels like "OpenAI" was
| just a trademark grab to prevent others from using it. Of
| course, all conspiracy theories work well when looking
| backward in time.
| Blackthorn wrote:
| Couldn't happen to a more deserving group of people. Good
| riddance. Squatting the name "open" and trying to reap the
| benefits therein while being anything but.
| ItsTooMuch wrote:
| I thought their research actually is open, at least? That's
| still something...
| aabhay wrote:
| Their research is closed -- they don't release model
| weights, nor in most cases the training or model scripts.
| Certain things they release, just like any other for-
| profit research firm.
| skybrian wrote:
| I don't see what it has to do with profit since this is
| pretty normal in academia too. Scientists will often
| publish papers, but not everything they do.
|
| "Open" is not well-defined.
| ItsTooMuch wrote:
| As the sibling says, in academia this is already more
| than open...
| Blackthorn wrote:
| Open means something. It is a, for lack of a better
| phrase, virtue signal. When you do that but don't
| actually represent the virtue you are trying to signal,
| people will understandably get pretty upset about that.
| JackFr wrote:
| I've been complaining for years about WikiLeaks not being a
| wiki -- no one wants to listen....
| kylebenzle wrote:
| TakeBlaster16 wrote:
| To be fair, it started out as a wiki, and they just never
| changed the name.
|
| There's no CSS here but you can clearly see the MediaWiki
| template:
| https://web.archive.org/web/20090422103636/http://www.wikile...
| ben_w wrote:
| What benefits? The parent is non-profit.
|
| I'd argue they're imperfect, but they don't look like
| arses. Big gap between the two, too.
| scoopertrooper wrote:
| The parent may be non-profit, but OpenAI LP accepts
| investments and delivers returns to investors like any
| other regular company. The only difference is that they
| 'cap' the returns. However, the cap is negotiated with
| individual investors and I haven't seen anything
| disclosing the cap except for the fact that in the
| opening round the cap was 100x the initial investment.
|
| 100x seems like a pretty generous cap to me.
| aabhay wrote:
| Are they non-profit? Does receiving $1b investment from a
| for-profit company still mean you can be non-profit?
| ben_w wrote:
| Yes to both questions. It's a (set of) specific thing(s)
| in company law.
|
| https://projects.propublica.org/nonprofits/organizations/810...
| skybrian wrote:
| This is just nonsense. I pay (a small amount) for both. They
| have different strengths and it's fun to compare. Adding new
| features to a product is not a sign of panic, it's just
| normal.
| wongarsu wrote:
| Dall-E so far hasn't been able to grow an ecosystem because
| of how restricted it is. Meanwhile Stable Diffusion makes
| trial-and-error and innovation around it easy, and as a
| result only 9 days after Stable Diffusion's release we see
| OpenAI release a feature that looks like a copy of a tool
| from the Stable Diffusion ecosystem.
|
| I agree that Dall-E isn't obsolete. I'd also add MidJourney
| to that list. All three are great models in their own right
| with their own pronounced strengths and weaknesses. But
| when it comes to enabling novel workflows Stable Diffusion
| seems lightyears ahead of the others.
| krisoft wrote:
| Except you are wrong. This feature was already available
| as part of the Dall-e ecosystem. There was a website
| called patch-e which facilitated this exact same
| workflow.
| crypto420_69 wrote:
| Also it's quite funny to see OpenAI (with all their
| researchers and engineers) get disrupted by someone with
| little to no background (Emad) in AI and ML but who embraced
| OpenAI's original mission about making AI as open as
| possible.
| robertlagrant wrote:
| > and falling on their own sword
|
| That's not what that means.
| mromanuk wrote:
| The only move left for OpenAI is to honour their name and make
| their own AI open source.
| shrimpx wrote:
| Or rename themselves OpaqueAI.
| ckluis wrote:
| I think Dall-E would benefit from a "sketch-based" prompt in
| addition to the text-based one. This was mindblowing -
| https://andys.page/posts/how-to-draw/
| teddyh wrote:
| Someone should name the next image generator OWL, since it
| "draws the rest of the owl".
| boppo1 wrote:
| Cheeky:
|
| https://github.com/hlky/stable-diffusion-webui/blob/master/i...
| teddyh wrote:
| I thought it was odd that I hadn't seen anyone else make
| that joke. Turns out they had, I just hadn't seen it.
| Thanks!
|
| Reference, for those who haven't seen the original joke to
| which my joke was referring:
| https://www.reddit.com/r/pics/comments/d3zhx/how_to_draw_an_...
|
| (See also: https://knowyourmeme.com/memes/how-to-draw-an-owl)
| sho_hn wrote:
| It does feel like art's disruptive "Calculator moment" is
| happening where you can now leave a lot of basic/mechanical
| tasks to a tool and give more focus to higher-minded problems.
|
| It's going to get so cool and interesting, I think.
|
| A lot of the conversation around art may focus more on
| composition and objectives of the artist in the new prompt
| engineering world, with less bias from factors such as
| rendition quality etc. creeping in since it's so incidental.
|
| New forms of art will emerge and/or gain popularity that focus
| on trying things the tools aren't good at yet. The human artist
| of the gaps. The niches will constantly be shifting.
|
| I wonder if we'll learn to recognize the output of certain
| popular models and perceive them as instruments. "Made by xy on
| z" instead of "xz on guitar", so to speak. I remember the
| 90s/early 00s internet when it was always easy to tell when
| something had been done on Flash, just because of its line
| anti-aliasing rendition style being so distinct and familiar.
|
| The novelty will wear off, and we'll all start to feel a bit
| disappointed that the average human's imagination is pretty
| limited and novel/original ideas remain somewhat rare as the
| patterns and tropes in all the generated art emerge. It's great
| you can put the space needle where you want it and get a good-
| looking city and space ship, but how many variations of a
| cyberpunky skyline with a space ship do you need? And then
| we'll celebrate the novel stuff that does happen, as always. I
| suppose the tropes will evolve faster as the throughput goes
| up.
| posterboy wrote:
| It's not as if generative art is new. Nor is figurative
| painting relevant anymore since the invention of the camera.
| A basic _Burger joint in Gerhard Richter_ kind of style
| transfer is very much derivative. This isn't bad in view of
| the classics, but it's more like _art-work_ to me.
|
| The true artists in this one are the coders, no doubt
| (corollary to the intelligence debate).
|
| On the other hand, you mention an important point with layout
| but you underestimate the progress these days. Surely there
| are companies who are working on automated design beyond CAD
| ( _computer aided design_ ), e.g. for specialized antennas.
|
| > we'll all start to feel a bit disappointed that the average
| human's imagination is pretty limited and novel/original
| ideas remain somewhat rare as the patterns and tropes in all
| the generated art emerge
|
| Well, one might argue that Richter's most highly priced piece
| looks a little like prehistoric art of the pleistocene. It's
| a little vain to mention it, because I can much better relate
| to the more basic form, of course. A more frequently sore
| point would be the pop music industry between professionals
| and the amateurish.
|
| Anyway, this may be thinking too big. For the time being, the
| bunch of techniques is better understood as a toolbox,
| because it will be a long time before it trumps demo-scene
| productions, for instance. Here it is the technique that
| counts more often than not. The rest is an acquired taste.
| boppo1 wrote:
| >basic/mechanical tasks to a tool and give more focus to
| higher-minded problems.
|
| >rendition quality... [is] so incidental.
|
| There's this thing in painting called 'mark making' and it
| can be the difference between an all-time-great painting and
| a throwaway portrait. Mark making speaks to every momentary
| choice of physical process a painter employs and reveals
| their thought process. For some of the greatest painters, it
| reveals their genius.
|
| Do not discount execution. Overlooking "basics" and
| "mechanics" is what results in disappointing work.
| posterboy wrote:
| Surely this too can be instrumentalized to evoke emotions,
| stylized to ease execution or faked to justify a result.
| sho_hn wrote:
| It's a fair point, and thanks for teaching me a new term!
|
| There's a lovely documentary called "Tim's Vermeer" about
| Tim Jenison's - one of the founders of NewTek, the people
| behind Video Toaster and LightWave, incidentally both tools
| that made hard visual art tasks accessible to wider
| audiences - hobby side project to prove that Vermeer used
| sophisticated optical tools to capture and copy his scenes
| from physical sets, rather than e.g. paint his famous grasp
| on lighting purely from his own mind. He builds such tools
| himself and then proceeds to successfully create his own
| Vermeer-alike painting, despite possessing very little artistic
| skill himself.
|
| It's full of good ruminations (and good at sparking more)
| on tools-vs-artistry but also execution-vs-method, and
| whether designing and adopting innovative tools and the
| tedious process to use them made Vermeer less of a genius,
| or just a genius of a different kind than otherwise
| presumed.
|
| It's very accessible and doesn't require knowing anything
| in particular from the art world.
| NIL8 wrote:
| I love this idea of extending the canvas to build out the scene.
| It makes me wonder if anyone's tried using Poe's stories for
| illustrating with AI? His descriptive writing style seems ideal.
| AJRF wrote:
| The scrambling to stay relevant after Stable Diffusion is very
| very enjoyable to watch.
| siavosh wrote:
| A few weeks ago I was skeptical that this technology would get
| past the emotional response we get from procedurally generated
| game environments, but I've been convinced otherwise. The
| emotional responses I get from some of the best of these images
| are novel and thought provoking. Makes me wonder what percent of
| what makes us human is now algorithmically solved....
| maxwell wrote:
| Maybe. I've been often reminded lately of Herbert Goldstone's
| "Virtuoso" (1958):
|
| http://elateachers.weebly.com/uploads/2/7/0/1/27012625/virtu...
| aidenn0 wrote:
| My wife likes impressionists and sunflowers. "A lone sunflower
| in a grassy field at sunset oil painting claude monet" plus
| stable-diffusion and a few minutes of tweaking some settings;
| she now has a new desktop background.
| boppo1 wrote:
| I actually paint and spend a lot of time looking at 'serious'
| paintings. AI hasn't even scratched the field to a trained
| eye.
|
| Doesn't mean I'm not excited though. This kind-of feels like
| I'm watching the camera or printing press being invented.
| Everyone is comparing it to fine art, but I think ultimately
| it's going in a different and bigger direction.
| aidenn0 wrote:
| What I did was, IMO, a different and bigger direction to
| fine art. I mean _I_ could tell that this wasn't an
| impressionistic painting just given that some areas of the
| grass were too detailed. It looks "just fine" though to
| untrained eyes, which are well over 90% of the population.
|
| 1. How long would it have taken me to get good enough at
| painting to exceed what I generated in under an hour? How
| many people have the motivation to spend that time?
|
| 2. How much would I have had to pay an art student to make
| a painting better than what I generated in under an hour?
|
| Ten million sub-par Monet knock-offs didn't exist, but
| could exist very shortly at minimal cost. Even if it never
| gets any better this is already potentially disruptive, and
| the models are getting better every month.
| bottlepalm wrote:
| I've heard this a lot, luckily it's not that hard to test
| if you can really tell the difference. We need someone to
| create the 'AI Pepsi challenge' for artists to settle this.
| sleepdreamy wrote:
| We still know basically nothing about our Brain/Consciousness.
| I would say we have a lot more to explore/research
| mensetmanusman wrote:
| Our brain is apparently just a 4 gb large arrangement of
| electrical weights.
| d23 wrote:
| Not to be pedantic, but we have on the order of 100B
| neurons, and afaik each of them can be connected to
| thousands of other neurons. I assume we probably have a
| ways to go before we're encoding the amount of information
| a brain can comprehend.
| amilios wrote:
| Damn, Dall-E really lost its competitive edge overnight when
| Stable Diffusion was released. They dropped their prices across
| the board in response, but honestly I think it still isn't enough
| to save them. The magic of open-source competition.
| danielbln wrote:
| They dropped the prices for GPT-3, not for Dall-E.
| minimaxir wrote:
| Also per the email release, variations/inpainting, the trick used
| to simulate outpainting before this, now generates 4 images like
| a normal DALL-E generation instead of 3 (which was arbitrary
| anyways).
|
| I do wonder how expensive the outpainting is. I'm assuming that
| each additional step in the timelapse is a full generation, in
| which case ~15 generations is about $1 total.
| affgrff2 wrote:
| Everyone says stable diffusion is a free alternative. Where do I
| get the weights without passing a gatekeeper?
| dceddia wrote:
| They're currently hosted on Google in a way that you can
| download them via curl/wget. Here's a guide including the link:
| https://www.assemblyai.com/blog/how-to-run-stable-diffusion-...
| [deleted]
| msoad wrote:
| This is cool and useful!
|
| Putting "Girl with a Pearl Earring by Johannes Vermeer" in the
| kitchen in 2022 does not look good!
| i_like_apis wrote:
| Because depicting a woman in a kitchen is perpetuating the
| pernicious male patriarchy? Sorry we're not doing that. You
| might find some reception for this sort of thing on Twitter
| though.
| dymk wrote:
| It's 2022, women are allowed to enjoy cooking and baking just
| as much as men are.
| zoba wrote:
| I have been working on an outpainting piece (in Photoshop)
| currently 10609 x 8144. I am very pleased to see more support for
| this, though hoping it doesn't kill my current flow.
|
| Seems like it is currently not working on their site.
| dsmmcken wrote:
| The UX around AI image generation is evolving so fast that every day
| brings something new. There's so much greenfield exploration space
| for new interaction models.
|
| 6 months from now, how we interact with these models will
| probably look entirely different.
| hey_bear wrote:
| While maybe not "as good as a human" creatively, I wonder whether,
| when this matures a little more, we'll see whole art/design
| departments go by the wayside and be replaced by stuff like this...
| naillo wrote:
| I can't help but feel like they're adding this at this particular
| point since Stable Diffusion has announced they're releasing
| their 'inpainting' model next week.
| i_like_apis wrote:
| I really doubt it's related at all, though everyone would think
| it looks that way. SD has only been out a week and this feature
| would have taken much more than that to build, test, enroll
| demo users, make a webpage for, etc.
| naillo wrote:
| I can't prove it of course but it wouldn't surprise me if
| they had this pretty much done already long ago (dall-e has
| been out for several months at this point). The actual
| implementation doesn't look like it'd take more than a few
| days to code honestly (and they've got quite competent coders
| over there). Only speculation of course.
| i_like_apis wrote:
| Everything looks easier when someone else is doing it.
| [deleted]
| EddySchauHai wrote:
| Do you reckon we will have Prompt Engineers who are skilled at
| getting AI to generate what they want before long?
| bob1029 wrote:
| How long until we can run this over shows like Star Trek
| Voyager/DS9 and Seinfeld to achieve believable 16:9 scenes?
| deadbunny wrote:
| Someone has been upscaling DS9 already[1]. Obviously not
| released anywhere.
|
| Not sure I'd want them in 16:9, hd 4:3 like the other HD
| releases of TNG and TOS would do me. I understand they shot on
| video so an official true HD remaster is likely to never
| happen.
|
| 1. https://www.extremetech.com/extreme/324466-tutorial-how-to-u...
| henriquecm8 wrote:
| I don't have a deep understanding of how training models
| work, but I wonder if training a model with every frame of
| TNG and then outpaint it into 16:9 would work.
| jefftk wrote:
| DS9 shot on film:
| https://news.ycombinator.com/item?id=19454370
|
| But it did use a lot of early CGI that would need to be
| redone.
| kranke155 wrote:
| Why would you do that? What an awful idea. People made those
| shows in the 4:3 format, you'd just be adding fluff. This is
| like adding more description to a book so it becomes an epic
| novel instead of a novella. I'd say keep to the creators'
| intent...
| d23 wrote:
| I'm inclined to agree, but if it were coherent I'd take it
| over what platforms like Netflix do, which is chop off the top and
| bottom of the content so it'll fit 16:9.
| joemi wrote:
| I'm not the person who suggested it, but I wouldn't mind
| having it fill my (wide)screen when watching. That said, I
| understand that some film/tv uses the frame very precisely,
| however I'm not sure that these two particular examples do
| that throughout their entire episodes. (Though I bet that in
| Seinfeld in particular it might weaken/ruin a few visual
| gags.)
| kranke155 wrote:
| Still seems to me like adding fluff - it seems a bit
| impossible to me that the AI would add anything pertinent
| to the plot. It would add "stuff" like corridors and
| background sets and maybe someone out of focus.
|
| Do the black bars actually bother you that much? You know
| there are cropped 16:9 widescreen versions of some of these
| shows (which I personally detest, but I work in the
| business of moving images).
|
| Genuinely interested in why this bothers people.
| cesis wrote:
| Next week?
| russdill wrote:
| Temporal coherence will still take a while to solve but it's
| not undoable. Making things that look correct upon closer
| inspection rather than just looking "nice" will probably take
| some degree of human curation for quite a while.
| bottlepalm wrote:
| Tesla could use some of that temporal coherence as well.
| dr_dshiv wrote:
| Anyone working on closed caption models at the moment?
| O__________O wrote:
| This was already possible with DALL-E using the inpainting
| feature going from defined image to transparent edge; this just
| automates what was a manual process before. Do wish the
| inpainting tool had more options, for example to fade a
| transparency in, since my understanding is it makes a difference;
| not to mention magic wand selection/deselection tool.
|
| In case it is not obvious, every time a user generates an
| additional section of an image using the outpainting feature, it
| costs a credit.
| ShamelessC wrote:
| Automation of manual processes is generally useful.
| ml_basics wrote:
| Yes indeed, and it shows the advantages of Stable Diffusion's
| model of just releasing the model and letting people do what
| they want with it - this was straightforward to implement
| oneself.
|
| And while OpenAI released this feature now, it's probably just
| a matter of days until even better features built on Stable
| Diffusion will be released, given how much community energy is
| focussed on it right now.
| pilotneko wrote:
| Maybe kludge it with a dithered transparency mask?
| O__________O wrote:
| Only matter of time before Adobe adds inpainting with hooks
| to local or API generative tools, using OpenAI to edit works
| like this is like transporting back to past using basic image
| editing tools.
| benreesman wrote:
| So... are we done politely coughing and looking out the window at
| the idea that the gatekeeping was motivated by altruism so that
| we can move on and just use this much better innovation model
| going forward?
|
| Various (subjectively judged) SOTAs on at least some subset of at
| least this family of tasks are changing somewhere between _daily_
| and _hourly_ right now. I've been watching this stuff closely
| since fairly early ImageNet days and I've never seen a Cambrian
| explosion of "how the hell did that do that?" events at anything
| like this cadence.
| aantix wrote:
| Why would Google hold back on releasing Imagen if there are
| competitors that are publicly available already?
|
| Imagen isn't special anymore.
| jsnell wrote:
| A few possible theories, some might be mutually exclusive:
|
| Organizational scar tissue making them more risk averse about
| the PR risks of letting the genpop use AI generation tools, and
| create something offensive. With the safe assumption that
| Google will get blamed, not the user.
|
| Fear of government regulation on AI if they don't self-
| regulate.
|
| No need to actually release it, since this isn't the core
| business but just research. (While openai needs to actually
| create the business.) Corollary: more to lose -- a scandal
| around offensive content will not hurt openai's non-existent
| other businesses. It might make some advertisers pull their
| ads from Google.
|
| The opportunity cost of building a self serve platform is too
| high. (Can't pull in people writing those kind of apps from
| projects with more commercial importance. Can't make the ML
| researchers do that.)
|
| They misjudged how much demand there would be, and thought that
| building a platform would not be useful for a few years. And if
| it now turns out to actually be a great business it'll now take
| them a year to productionize and build a platform.
|
| Their compute requirements are so high that selling access is
| not viable, the costs are prohibitive for real users.
|
| It's not that different from e.g. self driving cars. Pretty
| obviously they had better tech from early on, but were not
| willing to take the risks that Tesla was.
| aabhay wrote:
| Google is most interested in maintaining mind-share so that
| researchers don't jump ship. They could always monetize Imagen
| through Google Cloud but are concerned about risks (NSFW, legal
| issues, bias, etc.) so would rather wait for others to step
| into the water first.
| thebeastie wrote:
| This is moving fast!
|
| Obviously, it's going to be an incredible boon for content
| creation. I suppose that in the future it'll make creating videos
| an order of magnitude easier, which will allow a single person or
| a small team to make a high quality movie where all the assets
| are generated, so that'll really give us an eye into a lot of
| people's imaginations, for better or worse.
|
| To leave a thought provoking example, what's going to happen when
| every adolescent has the ability to make a convincing deepfake?
|
| It'll put nation states in a position similar to the one they
| already have with crypto, where they wonder if they should ban, or
| regulate... doing nothing won't be an option.
___________________________________________________________________
(page generated 2022-08-31 23:00 UTC)