[HN Gopher] 1 week of Stable Diffusion
___________________________________________________________________
1 week of Stable Diffusion
Author : victormustar
Score : 420 points
Date : 2022-08-30 13:53 UTC (9 hours ago)
(HTM) web link (multimodal.art)
| m_ke wrote:
| After playing with it for a few hours I'm sold on it soon
| replacing all blog spam media and potentially flooding etsy with
| "artists" trying to pass the renders as their own art work.
|
| Here's some of the stuff I generated: https://imgur.com/a/mfjHNgO
| lijogdfljk wrote:
| That Figma plugin is mind blowing to me. I'm also curious to see
| how the Blender integration pans out
| syntaxing wrote:
| It's really crazy how Stable Diffusion seems to be very on par
| with DALL-E and you can run it on "most" hardware. Is there an
| equivalent for GPT-3? I don't even think I can run the 2M lite
| GPT-J on my computer...
| planetsprite wrote:
| Stable Diffusion seems hyper-trained on digital art and faces.
| Dall-e feels a lot more "intelligent" and can create a far
| greater and more comprehensive diversity of images from
| different prompts.
| ManuelKiessling wrote:
| Tangential: I've set up a Discord Bot that turns your text
| prompt into images using Stable Diffusion.
|
| You can invite the bot to your server via
| https://discord.com/api/oauth2/authorize?client_id=101337304...
|
| Talk to it using the /draw Slash Command.
|
| It's very much a quick weekend hack, so no guarantees
| whatsoever. Not sure how long I can afford the AWS g4dn
| instance, so get it while it's hot.
|
| Oh and get your prompt ideas from https://lexica.art if you
| want good results.
|
| PS: Does anyone know where to host reliable NVIDIA-equipped VMs
| at a reasonable price?
| nabakin wrote:
| Any chance of releasing the source? I'd like to host my own
| instance so my discord server doesn't have to worry about
| queue times
| ManuelKiessling wrote:
| Yeah, sure:
| https://github.com/manuelkiessling/stable-diffusion-discord-...
|
| I quickly polished things and created a useful README -
| hopefully it's all correct. If not, let me know!
| nabakin wrote:
| Awesome, tysm!
| olladecarne wrote:
| One thing I noticed is that on GCP if you create an
| a2-ultragpu (Nvidia A100 80GB) and you select a spot
| instance, the price estimate goes down to $0.33 hourly
| ($240/m) which sounds really good if it's not a mistake. I
| was wondering if you could then turn a single A100 into 7
| GPUs using Multi-instance GPUs. So on an 80gb one you get 7
| 10GB GPUs (can't have 8 due to yield issues on those cards).
| I'm pretty sure that will run much slower than on the full
| instance, but not 7x slower so if you're running a larger
| service at scale this could be an option to parallelize
| things. If someone is able to get that running please let me
| know how it performs.
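|
| The arithmetic behind that estimate can be sketched as follows.
| This is only a back-of-the-envelope check: the $0.33 figure is
| the quoted spot estimate (not a verified price), the 30-day month
| is an assumption, and the per-slice slowdown passed to
| mig_throughput_gain is a hypothetical number, since nobody in
| the thread has measured it.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month, instance never stops


def monthly_cost(hourly_usd: float, hours: float = HOURS_PER_MONTH) -> float:
    """Cost of keeping one instance up for a month at a fixed hourly rate."""
    return hourly_usd * hours


def cost_per_slice(hourly_usd: float, slices: int) -> float:
    """Hourly cost attributed to each slice if the GPU is split evenly."""
    return hourly_usd / slices


def mig_throughput_gain(slices: int, slowdown_per_slice: float) -> float:
    """Total throughput relative to the unsplit GPU.

    slowdown_per_slice is how many times slower one slice is than the
    full card. Splitting only pays off when the result is above 1.0,
    i.e. when each slice is less than `slices` times slower.
    """
    return slices / slowdown_per_slice


spot_hourly = 0.33  # quoted estimate from the comment above
print(f"monthly cost:    ${monthly_cost(spot_hourly):.2f}")
print(f"per slice, /hr:  ${cost_per_slice(spot_hourly, 7):.4f}")
# Hypothetical: if each of 7 slices were 4x slower than the full card
print(f"throughput gain: {mig_throughput_gain(7, 4.0):.2f}x")
```

| At 720 hours this comes to roughly $238, which matches the
| "$240/m" figure quoted above.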
|
| The next thing I considered was just buying up a ton of 3060
| 12gb cards (saw a few new ones for $330) and just hosting a
| server from my house. This might be a good option if you
| don't care about speed but care about throughput.
|
| RTX 3090s are also decent in terms of price per iteration of
| Stable Diffusion. If you want to build a fast service like
| Dreamstudio I think it's the only option to be able to do it
| at a reasonable price. If you want to host these in the cloud
| using consumer RTX cards, you'll have to go with less
| reputable hosts since Nvidia doesn't allow it. I don't want
| to name any since I can't vouch for them, but there are some
| if you search. The cheapest option will be to buy them and
| host it yourself.
|
| I'm still researching what the best price/performance is for
| hosting this so if you have any findings please share.
| rexreed wrote:
| I'm experimenting with your Discord bot right now. It would
| be great to have a command that shows where your processes
| are currently in the queue or maybe the discord bot can
| update on queue position.
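|
| A queue-position command along these lines could be sketched as
| below. This is not taken from the bot's source: DrawQueue, its
| method names, and the fixed seconds_per_job average are all
| hypothetical, and a real bot would need to measure render times
| and wire this into its own command handling.

```python
from collections import deque


class DrawQueue:
    """Hypothetical FIFO of pending /draw jobs, identified by a job id."""

    def __init__(self, seconds_per_job: float):
        # Assumed average render time per job; a real bot would measure it.
        self.seconds_per_job = seconds_per_job
        self._jobs = deque()

    def submit(self, job_id: str) -> int:
        """Add a job and return its 1-based queue position."""
        self._jobs.append(job_id)
        return len(self._jobs)

    def position(self, job_id: str) -> int:
        """1-based position of a job; raises ValueError if it is not queued."""
        return self._jobs.index(job_id) + 1

    def eta_seconds(self, job_id: str) -> float:
        """Rough wait: all jobs ahead of this one, plus the job itself."""
        return self.position(job_id) * self.seconds_per_job

    def finish_next(self) -> str:
        """Remove and return the job at the head of the queue."""
        return self._jobs.popleft()


queue = DrawQueue(seconds_per_job=60.0)
queue.submit("prompt-a")
queue.submit("prompt-b")
print(queue.position("prompt-b"), queue.eta_seconds("prompt-b"))
```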
| ManuelKiessling wrote:
| Good idea, I'll look into it.
| rexreed wrote:
| I submitted 2 /draw requests with prompts, got quoted a
| time 15-30 min for first one and then 17-34 for 2nd,
| submitted about 5 minutes apart but it's been now past
| the upper limit of the quoted time without any results.
| I'm assuming that the image generation has failed or
| perhaps the bot got stuck. Having some way of knowing
| would be helpful.
| ManuelKiessling wrote:
| Not stuck, just a full queue. Results will come back
| sooner or later. Time is really just a guesstimate.
| rexreed wrote:
| Just got one of the images back. Looks like you might
| want to double your time estimates. Also I got a Rick
| Roll meme image back as one of the results. I assume this
| is some sort of failure mode response?
| ManuelKiessling wrote:
| Rick chimes in when the AI thinks the image might be
| NSFW.
| rexreed wrote:
| Well that would have been a fun interpretation of my
| prompt ;)
| juliensalinas wrote:
| I worked on the Stable Diffusion and GPT-J integrations on NLP
| Cloud (https://nlpcloud.com/). Both can be used in FP16 without
| any noticeable quality drop (in my opinion). Stable diffusion
| requires 7GB of VRAM on a Tesla T4 GPU. GPT-J requires 12GB of
| VRAM (but if you really try to use the 2048 tokens context, the
| VRAM will go up and reach something like 20GB of VRAM).
| karolist wrote:
| I have 64GB RAM and an NVIDIA card with 24GB VRAM. What are the
| largest projects I could run locally?
| pdntspa wrote:
| Personally I don't find it on-par or even close to DALL-E...
| stylistically its output is a lot more plain (Midjourney does
| really well here) and it can't handle complicated prompts well
| (it will pick the one thing in the prompt it does know about
| and run with it, ignoring all else)
|
| Plus, there are huge gaps in training. Ask it to draw something
| simple, like "a penis" and you get nightmare fuel....
| boppo1 wrote:
| Does DALL-E let you output penises? I thought openAI was
| forbidding many 'unseemly' prompts.
| xor99 wrote:
| This is the killer aspect of it. Generating an image in <5 mins
| on a Mac is amazing when you consider the alternatives atm.
| macrolime wrote:
| GPT-3 isn't really all that optimized in terms of size. Later
| studies have shown that you don't need that many parameters to
| get the same results, so it should be possible to train a model
| that could run at least on something like an RTX 4090 Ti with
| 48GB ram.
| acapybara wrote:
| Full GPT-J 6B can run if you have 22gb ram (CPU or GPU depending
| on where you run it).
|
| Also can run an 8 bit quantized version pretty easily. This
| takes ~6gb RAM.
|
| The results seem far off from GPT-3 but apparently it can get
| good results when fine tuned.
|
| Bigger models like OPT 66B can run on cloud machines (or a
| really big local system)
|
| OPT 175B weights are not open but can be applied for.
|
| 175B would require something like 500GB RAM if not quantized.
| That's a lot, but it's possible to build that locally if you
| have a couple 10's of thousands of dollars.
|
| Wait a few years and 175B on a GPU will be no problem.
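|
| The sizes above follow from a simple rule of thumb: parameter
| count times bytes per parameter. A rough sketch, counting the
| weights only (activations, context, and framework overhead come
| on top, so real usage is higher); the rounded parameter counts
| are assumptions for illustration:

```python
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Memory needed for the weights alone, in GB (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Rounded parameter counts, used here only for illustration.
models = {"6B": 6e9, "66B": 66e9, "175B": 175e9}

for name, params in models.items():
    sizes = ", ".join(
        f"{dtype}: {weight_memory_gb(params, dtype):.0f} GB"
        for dtype in BYTES_PER_PARAM
    )
    print(f"{name:>5} -> {sizes}")
```

| For a 6B model this gives 24 GB (about 22.4 GiB) in fp32 and
| 6 GB in int8, in line with the figures above. For 175B it gives
| 350 GB in fp16 and 700 GB in fp32, so "something like 500GB" sits
| between the two.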
| boppo1 wrote:
| What does 'quantized' mean in this context?
| acapybara wrote:
| Basically stuff a 32 bit value into an 8 bit value (and
| lose precision).
|
| Apparently it doesn't affect the results significantly.
|
| More info:
|
| https://github.com/huggingface/transformers/pull/17901
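|
| As a toy illustration of the idea, here is a minimal sketch of
| "absmax" quantization in plain Python: floats are scaled so the
| largest magnitude maps to 127, then rounded to integers. This is
| an assumption-laden simplification and not necessarily the scheme
| used in the linked pull request, which may handle outliers and
| scaling differently.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs else 1.0  # avoid dividing by zero
    return [round(v / scale) for v in values], scale


def dequantize(quantized, scale):
    """Recover approximate floats; the rounding error is the lost precision."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.003, 1.0]  # made-up example values
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

for original, q, approx in zip(weights, quantized, restored):
    print(f"{original:+.4f} -> {q:+4d} -> {approx:+.4f}")
```

| Each restored value is off by at most half a quantization step
| (scale / 2), which is why small weights such as 0.003 can
| collapse to zero.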
| fab1an wrote:
| I think most are vastly underestimating the impact of Synthetic
| AI media - this is at least as big as the invention of
| film/photography (century-level shift) and maybe as big as the
| invention of writing (millennia-level shift). Once you really
| think through the consequences of the collapse of idea and
| execution, you're likely to tend to think the latter...
|
| What we're seeing now are toy-era seeds for what's possible -
| e.g. I've been making a completely Midjourney-generated
| "interactive" film called SALT:
| https://twitter.com/SALT_VERSE/status/1536799731774537733
|
| That would have been completely impossible just a few months ago.
| Incredibly exciting to think what we'll be able to do just one
| year from now..
| deviner wrote:
| It doesn't bring anything new, it just improves on what
| already exists. Not even close to photography or film.
| seydor wrote:
| As big as the invention of CGI
|
| Still, humans use art to communicate intent, and we still
| consider AIs to be 'things', with no agency or intent. Being an
| artist just became a lot harder, because no amount of technical
| prowess can make you stand out. It's all about the narrative
| now
| wonnage wrote:
| Art has always been about more than technical prowess; it's
| fundamentally an exploration of new ways to tickle neurons.
|
| For more practical purposes like product design, anyone will
| tell you that actually drawing stuff is akin to typing in
| code, it can take a while but it's not the hard part
| schroeding wrote:
| > we still consider AIs to be 'things'
|
| Do we, though? You and I do, sure. Most people here will,
| probably. But at least one counter-example was on display a
| few weeks ago, the guy from Google that told the press that
| their text completion engine was "alive" and "had agency".
|
| Of the friends I talked to about this (who are not in IT),
| most believed him. YMMV, but I seriously believe a good chunk
| of the population thinks we already have thinking A(G)I. I
| don't think there is a "we" here, anymore. :/
| seydor wrote:
| Do you think you'll see an AI in jail anytime soon? If
| not, then it's nowhere near "a good chunk of the
| population"
| schroeding wrote:
| Do we see pets in jail? No, but most people wouldn't say
| they are things (even though, in German law, they
| literally are), they are more or less intelligent beings
| with agency.
|
| I don't see the correlation, to be honest. A good chunk
| (but still a minority) of the people believing something
| doesn't automatically change the law, anyway, does it? :/
| seydor wrote:
| Pets do get euthanized/muzzled if they are found
| 'guilty' of harming humans, because we do think they have
| agency. Sometimes we blame the owner for things they
| could have done, but there are cases where it's beyond
| their control. For the same reason we don't jail kids, but
| sometimes they do get punished. We do think kids and
| pets have agency, but not full agency.
|
| At the current stage I don't think there is any AI that
| can be punished, or anyone that would credibly claim that
| an AI must be punished. Its maker will always be punished
| instead.
|
| Well, true, but I think that chunk is quite small. It's one
| thing to nonchalantly say "this is alive" and a very
| different thing when you have to deal with the
| consequences.
| schroeding wrote:
| > It's one thing to nonchalantly say "this is alive" and
| a very different thing when you have to deal with the
| consequences.
|
| Yeah, fully agreed! Chatbots and "creative" ML systems
| are in the weird spot where they can't physically kill or
| hurt people, like e.g. a self-driving car, and perform
| tasks that "feel" like they need intelligence.
|
| It's also absolutely quite possible that the "chunk" is
| way smaller than I think, I'm just blindly extrapolating
| from my social bubble :D
| nerdponx wrote:
| Is it? I seriously doubt it.
|
| Other than "you can't trust anything you don't see with your
| own eyes", what kind of shift is it? People lived like that for
| literally millennia before photography, audio, and video
| recording.
|
| At absolute worst, we are only undoing about 150 years of
| development, and only "kind of" and only in certain scenarios.
|
| Moreover, people were making convincing edits of street signs,
| etc. literally 20 years ago using just Photoshop. What does
| this really change at a fundamental level? Okay, so you can
| mimic voices and generate video, rather than just static
| images. But people have been making hoax recordings and videos
| for longer than we've had computers.
|
| I think the effects of this stuff will be: 1) making it
| easier/cheaper to create certain forms of art and entertainment
| media, 2) making it easier to create hoaxes, and 3) we will
| eventually need to contend with challenges to IP law. That's
| about it. I think it will create a lot of value for a lot of
| people (sibling comment makes a good point about this being
| equivalent to CGI), but I don't see the big societal shift
| you're claiming that this is.
| tracerbulletx wrote:
| Being able to just make a mental storyboard of ideas, and
| have that be trivially easy to turn into a finished product
| will transform who can express, and how they will express
| ideas and stories and art. Now you can do images. Will we
| also be able to do voice actors, video, 3d Assets, and even
| story beats and dialogue? It seems quite possible. Everyone
| just became, or is becoming, a director with 1000 cast
| members, concept artists, and crew at their disposal.
| planetsprite wrote:
| flycaliguy wrote:
| I think your perspective is really sharp and hits the
| important points. I think you might be right but I also think
| we are witnessing a sort of merger between an internet
| history and physical history. It's playing out in a lot of
| ways but mostly through the impact of social media on civil
| discourse. We are watching technology collide with the real
| world.
|
| It just feels to me like the internet has arrived in a way
| that can best be expressed in an Adam Curtis documentary.
| hackernewds wrote:
| Photoshop has always existed. Yet we haven't lost the era of
| not trusting pictures. What is new?
|
| I think we're making the mistake of comparing AI generated art
| to human generated art from the 90s. Humans have since
| "advanced" much further, to a level that DALL-E is nowhere
| close to.
|
| https://youtu.be/iKBs9l8jS6Q
| jhbadger wrote:
| And even earlier than Photoshop or computers. The concern
| over this reminds me of the era shortly after the
| development of photography where it was discovered that
| doing things like multiple exposure allowed the creation of
| "trick" photographs where you could combine images to
| create pictures of giants towering over buildings or tiny
| people playing in teacups. Society managed not to be
| falsely convinced of the existence of such beings despite
| the worry that people wouldn't be able to tell fact from
| fiction in them.
| kertoip_1 wrote:
| > people were making convincing edits of street signs, etc.
| literally 20 years ago using just Photoshop
|
| You needed to have tools, skill, resources and time to do
| such things. You don't need to have that anymore. Anyone can
| do anything on any scale.
|
| It's similar to what SpaceX did. Ofc it was possible to launch
| a rocket before SpaceX, but few really could afford that. Now
| that prices are low, that opens infinite number of new space
| exploration possibilities.
| time_to_smile wrote:
| > You don't need to have that anymore. Anyone can do
| anything on any scale.
|
| I have yet to see an example of "Synthetic AI media" that
| was both realistic and not immediately recognizable as
| being synthetically generated.
|
| And if you think being 99% there means we're very close to
| 100% just remember how long it's taken self driving cars to
| close the gap (we actually don't know how long since they
| still haven't succeeded in this).
| pretendscholar wrote:
| >I have yet to see an example of "Synthetic AI media"
| that was both realistic and not immediately recognizable
| as being synthetically generated.
|
| Would you know if you had?
| gpm wrote:
| > I have yet to see an example of "Synthetic AI media"
| that was both realistic and not immediately recognizable
| as being synthetically generated.
|
| Man, am I a good photographer or what
|
| https://imgur.com/mUoY4b1
|
| I mean, probably if you're familiar enough with squirrels
| something gives it away, but I'm not.
| gpm wrote:
| That said, I do think it does better with less realistic
| images right now, like
|
| https://imgur.com/mcJsg0n and https://imgur.com/41eENUO
|
| It also took a number of much worse images to get those
| ones.
| borski wrote:
| Okay, but Imgur marked that second one as erotic, so for
| every two steps forward... haha
| gpm wrote:
| Lol, it didn't tell me that.
|
| In case anyone is worried, it isn't remotely erotic or
| otherwise nsfw.
| butwhywhyoh wrote:
| You're right. I guess if you can't tell the difference,
| that means everyone who claims to be able to tell the
| difference is lying!
|
| I'm not particularly familiar with squirrels and
| something about that "photo" looks very off. If you
| showed it to me in a vacuum I'd just assume someone was
| trying to make a highly stylized version of something
| they had a photo reference for, but under no
| circumstances would I believe that's a real photo.
| gpm wrote:
| I'm not accusing anyone of lying, I'm suggesting that
| they might not have fully seen what this technology is
| capable of yet. I'm sure that I haven't. The whole point
| of the post we are discussing this under is the rate of
| progress in the space.
| simonw wrote:
| > I have yet to see an example of "Synthetic AI media"
| that was both realistic and not immediately recognizable
| as being synthetically generated.
|
| Three months ago I'd probably have agreed with you.
| Things have changed.
| nerdponx wrote:
| > You needed to have tools, skill, resources and time to do
| such things.
|
| Downloading a cracked copy of Photoshop and checking out a
| book from the library on how to edit photos is only
| somewhat more difficult than learning to use Python and
| write programs that generate art from some model. And only
| because learning _anything_ is extremely easy today with so
| many free resources and help forums.
|
| > Anyone can do anything on any scale.
|
| I'll believe it when I see it.
|
| > Now that prices are low, that opens infinite number of
| new space exploration possibilities.
|
| Except SpaceX is still in the "crawl" phase of "crawl,
| walk, run", and they only got even that far because
| an eccentric billionaire has staked his reputation on the
| problem and thrown a huge amount of money at it, without
| having to worry about things like "reporting to Congress"
| and "making sure the space program creates jobs in such-
| and-such voting district". And after all that effort and
| truly astounding engineering (the rocket _lands itself_
| back on the launch pad!!), space launches are _still_
| expensive, risky, and complicated, and will remain so into
| the foreseeable future (~decades).
| kertoip_1 wrote:
| > they only got even that far because an
| eccentric billionaire has staked his reputation on the
| problem and thrown a huge amount of money at it
|
| That's what Bezos did with his Blue Origin. If you check
| where Blue Origin is in the space race, you'll quickly
| realize it's not enough
|
| Edit: ah, and is space still expensive? If one of the
| universities in my mid-sized country with no space
| engineering background could afford to launch a cubesat via
| SpaceX, then yes, I think it became cheap.
| nerdponx wrote:
| Did Bezos stake his reputation on Blue Origin? I think
| it'd be a lot less embarrassing for him if Blue Origin
| folded than it would be for Musk if SpaceX folded.
| nextaccountic wrote:
| > Python
|
| Most people using those models aren't writing Python
| code. Check out https://www.reddit.com/r/dalle2/,
| https://www.reddit.com/r/midjourney/,
| https://www.reddit.com/r/StableDiffusion/,
| https://www.reddit.com/r/bigsleep/ etc
|
| I expect that once the technology matures, a smaller and
| smaller niche of users will be doing any kind of
| programming
| viscanti wrote:
| > Downloading a cracked copy of Photoshop and checking
| out a book from the library on how to edit photos is only
| somewhat more difficult than learning to use Python and
| write programs that generate art from some model.
|
| How much practice would one need after doing that, before
| they're able to match the quality of some of the AI
| generated art? Not all of the AI generated artwork is
| perfect, but some of the art would take the average
| person years of practice to be able to match. Some art
| requires more than a cracked copy of Photoshop and a
| weekend of reading a book you borrowed from a library.
| You may be surprised to find that some people spend years
| honing their craft.
| fatherzine wrote:
| "opens infinite number of new space exploration
| possibilities"
|
| In practice, "infinite" translates mainly to a handful of
| hyper-competitive guys checking "went to space" off
| their achievements list. There is a very good reason for
| that: space is very inhospitable, much more inhospitable
| than Antarctica. Nothing much has happened in Antarctica
| for 100 years beyond the occasional hyper-competitive
| athlete and a few research stations. Perhaps a natural
| resource gold rush might liven up the place for a few
| decades, until exhaustion and falling back to inhospitable
| status, dotted with the rare ghost town remains.
|
| Something similar happens in the "creative" space: the
| Internet unleashed a massive tidal wave of "content", yet
| the vast vast majority of it is rather trite and devoid of
| any (spiritual) meaning. Personally, I'm much more inclined
| to stick with the classics than even 20 years ago, simply
| because it's not worth my time wading through the deluge of
| poor quality "content" out there. To wrap up the analogy,
| I'd rather inhabit a nutritionally rich environment than
| get lost in the vast, but mostly empty, expanse of
| the Internet.
| borski wrote:
| I see your point, and I raise you SoundCloud rap, tiktok,
| and virality.
|
| A lot of the "trite" internet creations have gone on to
| become absolutely massive songs or artists.
| jjeaff wrote:
| I wouldn't say "a lot". A handful at most.
| borski wrote:
| New music is funneled through the internet now, and
| that's how things get launched. The old mode is dying.
| njarboe wrote:
| Nothing happens in Antarctica because the world powers
| signed a treaty in 1959 to keep it that way[1]. If it
| were open land that people were allowed to inhabit and own
| while forming new governments, I would bet you would see
| settlements sprout up there quite quickly. In space you
| might be able to set up new sovereign entities. That is
| one major reason people will want to go there.
|
| [1] https://www.nsf.gov/geo/opp/antarct/anttrty.jsp
| simonw wrote:
| Even if the vast majority of content on the internet is
| "rather trite and devoid of meaning", there's so much out
| there that even if just 0.05% of it is any good then
| that's a vast amount of new high quality content to enjoy
| and learn from.
|
| I would take today's internet-fuelled media landscape
| over the landscape of 20 years ago in a heart beat.
| fatherzine wrote:
| I am fairly torn on this topic. The statement is mostly
| an admission that I'm too weak to not personally waste
| too much time on trite Internet content.
|
| The question I often ask myself: is spending time with
| this content, while entertaining in the short term,
| perhaps via the novelty factor, also nourishing in the
| long term? The answer is, sadly, much more frequently NO
| than in the time of printed books.
|
| The best I can hope is to be able to use Internet as an
| encyclopedia for laser-focused lookups. Sadly, I am too
| often caught in browsing random content only loosely
| related to the original lookup topic.
| anonAndOn wrote:
| This thought occurred to me recently while skimming the
| formulaic and indistinguishable programming on Netflix. It
| won't be long before a GPT-3 script is fed to an image
| generator and out comes the components of a movie or TV show.
| The product will undoubtedly need some human curation and voice
| acting, but the possibility of a one-person production studio
| is on the horizon.
| nerdponx wrote:
| And it will probably suck just as bad as any of the low-
| effort formulaic movie or music that humans like to produce.
| flycaliguy wrote:
| It makes me wonder if all these musicians' catalogs that have
| been purchased as a whole lately are even more powerful than
| before. Owning a piece of every bit of media that contains a
| slice of David Bowie for example would be extremely valuable.
|
| Consider the gaming world's concept of "whales". Customers
| willing to spend disproportionately enormous amounts of money
| in game. Can you sell these whales a unique, personalized
| David Bowie album that is about, I don't know, maybe the
| customer's own life story?
| cercatrova wrote:
| You're right, it's already on Show HN right now.
| Melatonic wrote:
| Doubt it - but it will become another great tool for artists to
| use.
| kasperni wrote:
| Take a look at https://www.reddit.com/r/midjourney/ if you want
| to see what midjourney is capable of. Some of them are
| extremely impressive [1][2][3][4]
|
| [1]
| https://www.reddit.com/r/midjourney/comments/x0kv8s/testp_ju...
|
| [2]
| https://www.reddit.com/r/midjourney/comments/wz1am0/homer_si...
|
| [3]
| https://www.reddit.com/r/midjourney/comments/x10som/the_amou...
|
| [4]
| https://www.reddit.com/r/midjourney/comments/x12nqz/robert_d...
| paisawalla wrote:
| Agreed, it really does not seem far off now to imagine a world
| where I can request artifacts like
|
| "This episode of Law & Order, but if Jerry Orbach never left
| the show"
|
| "Final Fantasy VII as an FPS taking place in the Call of Duty
| universe"
|
| "A 3D printable part that will enable automatic firing mode for
| {a given firearm}"
| time_to_smile wrote:
| > toy-era seeds
|
| I think what we have is a toy and will remain a toy, just like
| Eliza was 60 years ago. Academically fascinating, and given the
| constraints of the era, genuinely remarkable, but still a long
| way from really being useful.
|
| I'm already getting bored of seeing 95% amazing 5% wtf AI
| generated images, I can't fathom how anyone else remains
| excited about this stuff so long. My slack is filled with
| impressive-but-not-quite-right images of all sorts of
| outrageous scenarios.
|
| But that's the catch. These diffusion models are stuck creating
| wacky or surreal images because those contexts are essential
| to letting you easily ignore how much these generations miss
| the mark.
|
| Synthetic AI media won't even be as disruptive as Photoshop,
| let alone the creation of written language.
| flycaliguy wrote:
| No matter what happens, it sure is thrilling to witness this
| debate. You might be right, you might be wrong.
|
| Personally I think a line can now be drawn that starts at the
| first cave drawing and ends in 2022. Something has
| fundamentally shifted, a true paradigm shift before our eyes.
| magicalhippo wrote:
| > I've been making a completely Midjourney-generated
| "interactive" film called SALT
|
| I stumbled over Midjourney the other day through these music
| videos[1][2] generated by Midjourney from the songs' lyrics, and
| I immediately thought we're not far away from this being viable
| for a cartoon-like film.
|
| Interesting times ahead.
|
| [1]: https://www.youtube.com/watch?v=bulNXhYXgFI
|
| [2]: https://www.youtube.com/watch?v=KVj_AEhpVbA
| andruby wrote:
| It must be interesting being a graphic artist in 2020-2022. First
| came NFTs, which enabled some to make millions of dollars. Less
| than 2 years later came Stable Diffusion, which will probably
| shrink the market significantly for human graphic artists.
| poisonborz wrote:
| Just imagine - you could write your own script of a series and
| have it realistically generated, especially cartoons, complete
| with voice acting. Popular generated Spongebob episodes could
| form canonical entries in the mind of the general public - after
| some information fallout, original episodes couldn't even be told
| apart. Postmodern pastiche will accelerate and will become total.
| danso wrote:
| Tangent discussion: What are people's experiences here with
| running Stable Diffusion locally? I've installed it and haven't
| had time to play around, but I also have an RTX 3060 8GB GPU --
| IIRC, the official SD docs say that 10GB is the minimum, but I've
| seen posts/articles saying it could be done with 8GB.
|
| Mostly I'm interested in the processing time. Like, using a
| midrange desktop, what's the average time to expect SD to produce
| an image from a prompt? Minutes/Tens of minutes/Hours?
| sarsway wrote:
| It's pretty fast on a RTX 3070 (8GB), a few seconds per image.
|
| My first impression is it seems a lot more useful than DALL-E,
| because you can quickly iterate on prompts, and also generate
| many batches, picking the best ones. To get something that's
| actually usable, you'll have to tinker around a bit and give it
| a few tries. With DALL-E, feedback is slower, and there's
| reluctance to just hammer prompts because of credits.
| foobiekr wrote:
| I was blown away when I got DallE access, but now it seems
| almost silly by comparison. I really wonder why the DallE
| team chose to expose so few controls.
| Morgawr wrote:
| I have a Titan X (Pascal) from like 2015 with 12GB of VRAM and
| I've had no trouble running it locally. I'd say it takes me
| about 30 seconds to generate a single image at 30 DDIM steps
| (which is the bare minimum I consider for quick iterations).
| When I want higher-quality images after settling on a proper
| prompt, I set it to 100 or 200 DDIM steps, and that takes maybe
| 1 minute per picture (I didn't measure accurately). I usually
| just let it run in batches of 10 or 20 pictures while I go do
| something else, then come back 15-20 minutes later.
|
| It runs pretty well but the most I can get is a 768x512 image,
| but it's pretty good for stuff like visual novel background
| art[0] and similar things.
|
| [0] - https://twitter.com/xMorgawr/status/1564271156462440448
| wccrawford wrote:
| I had to get a different repo with "optimized commands" on the
| first day, but my 3070 8GB has been happily processing images
| in decent time.
| nbutyllithium wrote:
| I decided to set up a local instance yesterday with my 3070
| TI 8GB and had similar success, about 10 seconds per image at
| the default settings. Like you I also opted for a different
| repo [0] which emphasized adding a GUI but I think also opts
| out of the watermark addition/other checks. Sounds like it
| reduces memory usage from what others have said. Had more
| trouble coming up with creative prompts than getting set up,
| surprisingly (to me anyway).
|
| [0] https://github.com/hlky/stable-diffusion
| nickthegreek wrote:
| This helped me with prompt generation:
| https://promptomania.com/stable-diffusion-prompt-builder/
| wccrawford wrote:
| I found it very easy to set up, too. A couple of things I set
| up previously were a _lot_ harder.
| Stable Diffusion has been dreamy. I'm already tempted to
| upgrade my setup to one of these with the GUIs, but I think
| if I wait just a bit longer, it's going to get even better.
| So I'm resisting the urge.
| cbozeman wrote:
| Removing the NSFW and watermark modules from the model will
| easily allow you to run it with 8 GB VRAM (usually takes around
| 6.9 GB for 512x512 generations).
|
| With an RTX 3060, your average image generation time is going
| to be around 7-11 seconds if I recall correctly. This swings
| wildly based on how you adjust different settings, but I doubt
| you'll ever require more than 70 seconds to generate an image.
| chrismorgan wrote:
| ASUS Zephyrus G15 (GA503QM) with a laptop 3060 (95W, I think)
| with 6GB of VRAM, basujindal fork, does 512x512 at about 3.98
| iterations per second in turbo mode (for which there's plenty
| of memory at that size). That's under 15 seconds per image on
| even small batches at the default 50 steps, and I think it was
| only using around 4.5GB of VRAM.
|
| (I say "I think" because I've uninstalled the nvidia-dkms
| package again while I'm not using it because having a
| _functional_ NVIDIA dual-GPU system in Linux is apparently too
| annoying: Alacritty takes a few seconds to start because it
| blocks on spinning up the dGPU for a bit for some reason even
| though it doesn't use it, wake from sleep takes five or ten
| seconds instead of under one second, Firefox glyph and icon
| caches for individual windows occasionally (mostly on wake) get
| blatted (that's actually mildly concerning, though so long as
| the memory corruption is only in GPU memory it's _probably_
| OK), and if the nvidia modules are loaded at boot time Sway
| requires --unsupported-gpu and my backlight brightness keys
| break because the device changes in the /sys tree and I end up
| with an 0644 root:root brightness file instead of the usual
| 0664 root:video, and I can't be bothered figuring it out or
| arranging a setuid wrapper or whatever. Yeah, now I'm
| remembering why I would have preferred a single-GPU laptop, to
| say nothing of the added expense of a major component that had
| gone completely unused until this week. But no one sells what I
| wanted _without_ a dedicated GPU for some reason.)
| orangecat wrote:
| I'm using the fork at
| https://github.com/basujindal/stable-diffusion which is optimized
| for lower VRAM usage. My RTX 2070 (8 GB) takes about 90 seconds to
| generate a batch of 4 images.
| mlsu wrote:
| I have a dated 1070 with 8gb of vram, some of which also
| renders my desktop.
|
| I was able to obtain 256x512 images with this card using the
| standard model, but ran into OOM issues.
|
| I don't mind waiting, so now I am using the "fast" repo:
|
| https://github.com/basujindal/stable-diffusion
|
| With this, it takes 30s to generate a 768x512 image (any larger
| and I am experiencing OOM issues again). I think you should
| expect a bit faster at the same resolution with your 3060
| because it's a faster card with the same amount of memory.
| cube2222 wrote:
| RTX 3080 (10GB) here
|
| Remember to keep the batch size low (equal to 1, probably);
| that was my main issue when I first installed this.
|
| Then, there are lots of great forks already which add an
| interactive repl or web ui [0][1]. They also run with half-
| precision which saves a few bytes. Additionally, they
| optionally integrate with upscaling neural networks, which
| means you can generate 512x512 images with stable diffusion and
| then scale them up to 1024x1024 easily. Moreover, they
| optionally integrate with face-fixing neural networks, which
| can also drastically improve the quality of images.
|
| There's also this ultra-optimized repo, but it's a fair bit
| slower [2].
|
| [0]: https://github.com/lstein/stable-diffusion
|
| [1]: https://github.com/hlky/stable-diffusion
|
| [2]: https://github.com/basujindal/stable-diffusion
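|
| A rough sketch of why half precision helps, not taken from any of
| the forks above: storing each weight as a 16-bit float instead of
| a 32-bit float halves the memory needed for the weights. The
| parameter count below is an assumed round figure for illustration,
| not a number from the thread, and activations need memory on top
| of this.

```python
import struct

# Assumption: an illustrative round parameter count, NOT a figure
# quoted anywhere in the thread.
ASSUMED_PARAMETER_COUNT = 1_000_000_000

# Standard IEEE 754 sizes: "<f" is a 32-bit float, "<e" a 16-bit float.
BYTES_PER_FLOAT32 = struct.calcsize("<f")
BYTES_PER_FLOAT16 = struct.calcsize("<e")


def weights_size_gb(parameter_count: int, bytes_per_weight: int) -> float:
    """Memory needed to hold the weights alone, in gigabytes (10**9 bytes)."""
    return parameter_count * bytes_per_weight / 1e9


full_precision_gb = weights_size_gb(ASSUMED_PARAMETER_COUNT, BYTES_PER_FLOAT32)
half_precision_gb = weights_size_gb(ASSUMED_PARAMETER_COUNT, BYTES_PER_FLOAT16)

print(f"float32 weights: {full_precision_gb:.1f} GB")
print(f"float16 weights: {half_precision_gb:.1f} GB")
```

| Under that assumption the weights shrink from 4 GB to 2 GB, which
| is one reason half-precision forks fit more comfortably on 8 GB
| cards.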
| [deleted]
| folli wrote:
| Funny that Reddit banned the mentioned subs in such a short amount
| of time.
|
| Some years ago, the pendulum was very much on the other side.
| yreg wrote:
| Is it known what killed those subs? Was it content based on
| actual people (celebrities)?
| hbn wrote:
| I thought I saw someone mention that a Vice article linked to
| them, and possibly reddit didn't want people thinking they're
| going to be hosting a repository of "fake nudes of non-
| consenting people"
|
| I took a quick look at the subreddit before it was banned and
| I don't think I saw any real people represented. It was a lot
| of video game or anime style characters. And one of Shrek
| with a massive dong.
| andybak wrote:
| Mentioned where? The linked article only mentions Reddit once
| and that link resolves fine.
| LordDragonfang wrote:
| I think this comment was meant to be a reply to the Vice
| article posted elsewhere in this thread
| SXX wrote:
| Reddit wants to become a public company, so it's a very much
| expected result.
| frozencell wrote:
| Strange how the same species who killed its cofounder are the same
| who lead it now.
| desindol wrote:
| Oh come on, it's the same with Twitter: he liked the prospect of
| making lots of money and now he screams foul.
| hackernewds wrote:
| who?
| optimalsolver wrote:
| He was a co-founder in name only. He was forced on the
| actual founders by Paul Graham.
| stephc_int13 wrote:
| AI generated art is interesting and will probably be helpful.
|
| I see it as a cheap and fast alternative to paying a concept
| artist.
|
| But not a revolution. Creating precise and coherent assets is
| going to be a challenge, at least with the current architecture.
|
| From a research perspective this is, I think, much more than a
| toy: those models can help us better understand the nature of our
| minds, especially related to their processing of text, images and
| abstraction.
| amelius wrote:
| I think what it shows us is that activities that we think of as
| "human", like getting drunk, saying silly things that sound
| brilliant, or painting things that look stunning are actually
| the things that a machine has the least trouble copying.
|
| Whereas things we associate more with computers, such as hard
| thinking, mathematics, etc. turn out to be more difficult to
| copy by a machine, and therefore perhaps more "human".
| cududa wrote:
| I've dismissed DALL-E - very cool, but won't really replace
| everyone. After playing with Stable Diffusion, as an artist,
| this is the most profound experience I've ever had with a
| computer. Check this out https://andys.page/posts/how-to-draw/
| timost wrote:
| One use case I have in mind is manga drawing. I wonder if anybody
| has tested manga related generation.
| ronsor wrote:
| You can coax it to generate whole manga pages. The only
| downside is that the text and story are incoherent, and the
| characters are inconsistent.
| ebabchick wrote:
| can someone recommend a good paper or blog post with an overview
| of the technical architecture of training and running stable
| diffusion?
| dreamcompiler wrote:
| https://ommer-lab.com/research/latent-diffusion-models/
| [deleted]
| motoboi wrote:
| Take a moment to appreciate the fact that in 4.2 GB (less than
| that actually) you have the English language somehow encoded.
|
| This is mind blowing.
| CWuestefeld wrote:
| I've been playing with it a bit, and I also find the
| information theory aspect absolutely amazing. It's more than
| just the English language that's encoded there. It also
| encodes information about characters and the styles of
| countless artists. I just cannot fathom how all this
| information fits in that space.
| nickthegreek wrote:
| Agreed. My wife plays animal crossing, so I had stable
| diffusion do some animal crossing prompts and was blown away.
| This 4 GB file understands how all the textures on the objects
| in this game should look. Then I turned around and was
| generating liches on thrones in the styles of the painting
| masters. This is absolutely mind blowing to me.
| revskill wrote:
| 7 days trying to install Python and its packages, and failed.
| Had to remove all that garbage, the global dependencies, from my
| machine. Such a waste of an ecosystem.
| andybak wrote:
| I know several semi-non-technical people that have got this
| running locally.
| revskill wrote:
| Yes, I wish I had the same luck as them. Sometimes I think
| they're geniuses!
| andybak wrote:
| If you're on Windows this is by far the easiest way:
| https://softology.pro/tutorials/tensorflow/tensorflow.htm
|
| Mostly-automated installer.
| gigel82 wrote:
| I'm using the Docker one, so much easier and no worries of
| polluting my real environment (all the installation scripts
| tend to download a variety of things from a variety of places).
| CWuestefeld wrote:
| It took me some time to get the OpenVINO distribution running
| on my Windows box. It turns out that it wasn't compatible with
| Python 3.10, I had to go back to 3.9. Maybe that'll help you?
| akshayKMR wrote:
| Try this one with the docker image instead:
| https://github.com/AbdBarho/stable-diffusion-webui-docker
| marc_io wrote:
| I found it surprisingly easy to run it on a 2015 MacBook Pro.
| gregsadetsky wrote:
| How long does it take you to generate an image? What
| setup/fork are you using for this? Thanks
| digitallyfree wrote:
| OpenVINO Stable Diffusion (CPU port of SD) is an easy install on
| Linux within a venv. Be sure to update pip first before
| installing the required packages from the list. The lack of GPU
| acceleration and the associated baggage makes this much easier
| to set up and run.
|
| https://github.com/bes-dev/stable_diffusion.openvino
| drexlspivey wrote:
| Someone on reddit made a single self contained .exe with a GUI
| (haven't tested it)
| https://old.reddit.com/r/StableDiffusion/comments/wwh1s9/jus...
| throwaway888abc wrote:
| The collaboration, pace and progress are stunning. If only this
| could be applied to other fields such as climate change etc.
|
| Great write up
| gillesjacobs wrote:
| 7 days and already that many UIs, plugins and integrations
| released. To be fair, developer/researcher access was a bit
| earlier but that is impressive adoption speed.
| RcouF1uZ4gsC wrote:
| > 7 days and already that many UIs, plugins and integrations
| released.
|
| That's because you can use it to make porn. Don't underestimate
| the motivational power of being able to easily create porn.
| cbozeman wrote:
| This is the easy answer, but I don't think this is the right
| answer.
|
| The right answer, I'd argue, is that this was Prometheus
| giving fire to the mortals, and then the mortals quickly
| discovered everything that could be possible with fire.
| [deleted]
| xor99 wrote:
| Haha, yes it is the energy that sustained the internet after
| all
| metadat wrote:
| They forgot to mention the porn one..
|
| https://www.vice.com/en/article/xgygy4/stable-diffusion-stab...
|
| Why'd they "overlook" it? Probably more culturally significant
| and controversial than any of the others. It's the natural
| elephant.
| Workaccount2 wrote:
| I will not even be slightly surprised when in 10 years we get
| stats like
|
| "60% of all image generation compute power used for making NSFW
| material"
| chestervonwinch wrote:
| idiocracy is becoming increasingly prophetic
| kgwgk wrote:
| I would expect the % of image generation compute power used
| for making NSFW material to come down over time, just like
| the % of home video minutes or digital photographs used for
| making NSFW material went down over time.
| metadat wrote:
| Or maybe something like "60% of all power on earth".
|
| I can't yet decide if it's going to be extremely appealing or
| quickly get [even more] boring and repetitive.
| codetrotter wrote:
| As someone who stopped looking at porn about two months
| ago, after years of porn use, I can tell you: it's going to
| be highly subjective.
|
| I used to watch porn basically daily. But then after
| finally deciding to stop watching porn, the idea of porn
| itself is downright off putting to me. I don't even quite
| know how or why. It just is.
|
| And I imagine it will be the same for others with AI
| generated porn.
| ajsnigrutin wrote:
| Technically you could get porn of that one exact weird
| turn-on you have, for which there is literally zero porn existing
| now (except if you pay for a custom video/photoshoot).
|
| Is this good? maybe... maybe not. Since most of the
| "normal" stuff already exists, it'll either be something
| "too extreme" for classic porn studios, or stuff using non
| porn people to turn into ai-porn stars.
| gpm wrote:
| My 3 year old GPU generates one image every ~5 seconds, and
| according to the documentation draws 215 watts at a maximum
| steady state load (TDP). That's very roughly a kilowatt
| second per image.
|
| The internet tells me that a 2022 honda civic takes about
| 0.07 liters of gas per km. And it also tells me that that
| is equivalent to 2394 kilowatt seconds. I.e. 2394 images on
| a 3 year old GPU per km travelled using a new and fuel
| efficient model of car...
|
| I'm not worried about this consuming a significant fraction
| of the power on earth.
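|
| The arithmetic above can be checked with a short sketch. It uses
| only the figures quoted in the comment (215 W, about 5 s per
| image, 0.07 L/km, 2394 kW-seconds per km) and the identity
| 1 kW-second = 1 kJ; the variable names are illustrative.

```python
# Figures quoted in the comment above.
GPU_POWER_WATTS = 215          # maximum steady-state draw (TDP)
SECONDS_PER_IMAGE = 5          # approximate generation time
CAR_LITERS_PER_KM = 0.07       # quoted fuel consumption
CAR_ENERGY_KJ_PER_KM = 2394    # quoted energy equivalent (1 kW-second = 1 kJ)

# Energy per image: watts * seconds = joules, then convert to kJ.
energy_per_image_kj = GPU_POWER_WATTS * SECONDS_PER_IMAGE / 1000  # 1.075 kJ

# Images per km using the unrounded energy per image.
images_per_km = CAR_ENERGY_KJ_PER_KM / energy_per_image_kj

# Energy density of gasoline implied by the two quoted car figures.
implied_mj_per_liter = CAR_ENERGY_KJ_PER_KM / CAR_LITERS_PER_KM / 1000

print(f"energy per image: {energy_per_image_kj:.3f} kJ")
print(f"images per km of driving: {images_per_km:.0f}")
print(f"implied gasoline energy density: {implied_mj_per_liter:.1f} MJ/L")
```

| Without rounding the per-image energy down to 1 kJ, the result is
| about 2227 images per km rather than 2394, which does not change
| the conclusion.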
| guhidalg wrote:
| I agree with your analysis, but I don't think that's how
| most people interpret global compute power consumption.
| The stored energy in the gasoline is not counted in
| global energy production figures, but the electricity
| used to power your GPU is.
| fifticon wrote:
| I appreciate your energy comparison, and it makes sense
| to me.
| DeRock wrote:
| I mean, it's not hard to imagine where this goes next:
| video (and eventually 3d/VR scenes). Say 60fps -> 2394
| frames / 60fps = 40 seconds of video per km's worth of fuel.
| That's equivalent to driving your car at (3600 seconds per
| hour) / (40 seconds per km) = 90 km/hr. Yes, your GPU is
| older, but there will also be pressure to increase the
| resolution and fidelity of the generated content to match.
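|
| The conversion above can be sketched as follows, assuming the
| earlier estimate of roughly 1 kW-second (1 kJ) per frame and
| 2394 kJ of fuel energy per km; the names are illustrative.

```python
# Assumptions carried over from the parent comments.
ENERGY_PER_FRAME_KJ = 1.0      # rounded estimate: ~1 kW-second per image
CAR_ENERGY_KJ_PER_KM = 2394    # quoted fuel energy per km
FRAMES_PER_SECOND = 60         # hypothetical video frame rate

# Frames that one km's worth of fuel energy pays for.
frames_per_km = CAR_ENERGY_KJ_PER_KM / ENERGY_PER_FRAME_KJ

# Seconds of 60 fps video per km's worth of energy.
video_seconds_per_km = frames_per_km / FRAMES_PER_SECOND

# Driving speed at which the car burns energy as fast as real-time
# video generation would.
equivalent_speed_kmh = 3600 / video_seconds_per_km

# Sustained power of real-time generation, and of the car at that speed.
generation_power_kw = FRAMES_PER_SECOND * ENERGY_PER_FRAME_KJ
car_power_kw = CAR_ENERGY_KJ_PER_KM * equivalent_speed_kmh / 3600

print(f"video per km-equivalent: {video_seconds_per_km:.1f} s")
print(f"equivalent driving speed: {equivalent_speed_kmh:.1f} km/h")
print(f"sustained power: {generation_power_kw:.0f} kW")
```

| Both sides come out at about 60 kW of sustained power, which is
| what the 90 km/hr comparison expresses.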
| atq2119 wrote:
| It really feels like for 3D the better quality/compute
| trade-off would be to have the ML model generate 3D
| models and animations, and then use a more traditional 3D
| rendering pipeline (by then with ray tracing and
| denoising).
| gpm wrote:
| Ok, sure, maybe there's demand there. But how would the
| logistics work out so that it managed to become a
| problem?
|
| Are people
|
| a) Waiting 5 seconds/frame * 60 fps = 5 minutes / second
| for a video to generate on their personal computer, and
| doing this constantly enough that it manages to become a
| problem?
|
| b) Buying computers that can do it real-time, but
| therefore output vastly more heat, requiring thermal
| management system akin to a car driving at highway speed?
|
| c) Renting these computers at considerable cost to make
| these videos?
|
| As long as enough people watch each video (or one person
| watches it enough times), the energy usage washes out to
| become negligible compared to the amount of human time
| invested. I just can't see a world where enough people
| are managing to consume a kw minute/second producing
| videos for themselves to watch only once or twice that it
| becomes an issue.
|
| Personally I'm optimistic that energy/compute is going to
| continue going down substantially (in which case even
| real-time video generation might not be an issue). If it
| doesn't and we don't become substantially better at
| efficiently synthesizing video, I can't see personalized
| single use video generation being a thing.
| DeRock wrote:
| > a) Waiting 5 seconds/frame * 60 fps = 5 minutes /
| second for a video to generate on their personal
| computer, and doing this constantly enough that it
| manages to become a problem?
|
| There will be much more compute resources thrown at it to
| make it render in real time. We're not there yet, but I
| can see a path to that happening in the next few years.
|
| > b) Buying computers that can do it real-time, but
| therefore output vastly more heat, requiring thermal
| management system akin to a car driving at highway speed?
|
| Why not? We already have billions of cars driving around
| outputting heat. It's an incredible expenditure of energy,
| sure, but perhaps the value of generated content
| entertainment will match the value of car transportation.
|
| > c) Renting these computers at considerable cost to make
| these videos?
|
| I imagine longer term, the opex (i.e. energy costs) will
| dominate the capex (GPU HW). The price of going into a
| generated world could be similar to going for a drive.
|
| > As long as enough people watch each video (or one
| person watches it enough times), the energy usage washes
| out to become negligible compared to the amount of human
| time invested. I just can't see a world where enough
| people are managing to consume a kw minute/second
| producing videos for themselves to watch only once or
| twice that it becomes an issue.
|
| This is where I strongly disagree. The democratization of
| skills and tools in creating content will break the one
| to many media model. You saw this in a large way in what
| the internet did to content distribution, in how the
| number of independent people creating content
| skyrocketed. These models will do the same for content
| creation. I predict most people will consume content
| personally generated for themselves or in small groups.
|
| Here's an example: a group of friends puts on their VR
| headsets for their weekly DnD session. The DM begins
| describing the scene, which autogenerates around them.
| Each character can then respond with their own actions /
| path, and the scenes react dynamically. The hour session
| costs them $10 in compute/energy.
|
| I'm mostly spitballing. I would imagine that we still
| have a couple of orders of magnitude reduction in energy
| costs that can be squeezed out of these models with
| improvements in specialized HW. But it will be matched
| against the insatiable demand of consumers for richer
| interactivity in content.
| prophesi wrote:
| I also wouldn't be surprised if 60%+ of the training data is
| NSFW when it's not filtered out.
| amelius wrote:
| > "60% of all image generation compute power used for making
| NSFW material"
|
| Or 80% of all NSFW viewing happens at work.
| ShamelessC wrote:
| > Why'd they "overlook" it?
|
| It appears to be a site for AI art, so there's that.
| isatty wrote:
| This will be the majority use case for tools like this. I
| suppose this also extends rule 34 - even if it does not exist,
| there will be porn of it.
| polisteps wrote:
| Hi, I'm the creator of multimodal.art. I didn't overlook it,
| but there's no "specialized" NSFW content maker to be
| highlighted - the Vice article just shows people using the
| model in different iterations to generate NSFW content; you
| don't need a specialized notebook/tool for that, a few of the
| ones in the post can do it (others have an NSFW filter that is
| on by default).
|
| Additionally it is important to note that the model was licensed
| under the OpenRAIL-M license, which is not as permissive as an
| MIT license and forbids certain outputs to be shared or
| purposes to be built as apps
| LordDragonfang wrote:
| >there's no "specialized" NSFW content maker to be
| highlighted
|
| Unless I'm misunderstanding you, yes there is, and it was
| even posted on HN last week:
|
| https://news.ycombinator.com/item?id=32572770
|
| (And yes, many of its results are horrifying)
| GaggiX wrote:
| No, this is just Stable Diffusion without any modification,
| no fine-tuning is needed to create nudes.
| lxe wrote:
| I thought only the "derivatives of the model" are under the
| user restrictions in the license, and are very permissive.
| The outputs of the model are very briefly covered in the
| license text
|
| > You are accountable for the Output you generate and its
| subsequent uses. No use of the output can contravene any
| provision as stated in the License.
| coding123 wrote:
| In 30 years everything AI generates will be a red circle, because
| at that point it will have just trained on itself repeatedly.
|
| Instead of labeling data for what things are, we'll have to label
| things as being generated or not.
| ok_dad wrote:
| I guess we know where the new market for all those Ethereum
| miners' GPUs will come from. I have always been sort of bearish
| on the trend towards throwing GPU power at neural nets and their
| descendants, but clearly there are amazing applications for this
| tech. I still think it's morally kinda wrong to copy an artist's
| style using an automated tool like this, but I guess we'll have
| to deal with that because there's no putting this genie back in
| the bottle.
___________________________________________________________________
(page generated 2022-08-30 23:02 UTC)