[HN Gopher] Open-source rival for OpenAI's DALL-E runs on your g...
       ___________________________________________________________________
        
       Open-source rival for OpenAI's DALL-E runs on your graphics card
        
       Author : hardmaru
       Score  : 215 points
       Date   : 2022-08-15 15:49 UTC (7 hours ago)
        
 (HTM) web link (mixed-news.com)
 (TXT) w3m dump (mixed-news.com)
        
       | humanistbot wrote:
       | If anyone from Stability is reading, the confirmation e-mail to
       | sign up is sending a broken link:
       | 
       | "We couldn't process your request at this time. Please try again
       | later. If you are seeing this message repeatedly, please contact
       | Support with the following information:
       | 
       | ip: XXXX
       | 
       | date: Mon Aug 15 2022 XX:XX:XX GMT-0700 (Pacific Daylight Time)
       | 
       | url: https://stability.us18.list-manage.com/subscribe/confirm"
        
         | wccrawford wrote:
         | It worked for me just now, so maybe it was temporary, or they
         | already fixed it?
        
         | hifikuno wrote:
         | I also had this response.
        
       | kgc wrote:
       | Does this work on Apple silicon processors? They have plenty of
       | RAM accessible to the GPU.
        
         | sroussey wrote:
          | The article says it will, but that it is not using the GPU,
          | unfortunately.
        
       | axg11 wrote:
       | I'm excited for the coming race to improve and miniaturise this
       | tech. Apple has a great track record of making ML models light
       | enough to run locally. There will come a day when photorealistic
       | image generation can run on an iPhone.
        
         | andrewacove wrote:
         | Maybe this is their long term plan for getting rid of the
         | camera bump.
        
       | humanistbot wrote:
       | Can someone tell me how this compares to the guide and repo
       | shared a few days ago on HN:
       | https://news.ycombinator.com/item?id=32384646
        
         | Geee wrote:
         | There's also Disco Diffusion:
         | https://www.reddit.com/r/DiscoDiffusion/
         | 
         | Not sure how they compare. DD seems to be quite popular. I'm
         | currently setting up DD locally.
        
           | glenneroo wrote:
            | I've been running DD for a few months now... I tend to just
            | edit the python script, or use e.g. entmike's fork, which
            | can read config files to change the 50+ parameters
            | (basically anything is better than having to use Jupyter
            | Notebooks IMO). If you don't have a GPU with 6+ GB of VRAM,
            | you can often get a decent enough GPU for free from Google
            | Colab. For running locally, I can also highly recommend
            | Visions of Chaos, which includes multiple versions/forks of
            | Disco Diffusion, as well as a ton of other latent diffusion
            | scripts, not to mention many other generation features such
            | as fractals and even music. They also recently added the
            | ability to train your own diffusion models, which I've been
            | doing the last few days using thousands of my own
            | photographs. It also has a pretty nice GUI, and the dev is
            | extremely responsive on Discord. After you do the setup for
            | VoC, it handles all the python venv setup otherwise
            | necessary with local DD installs. In any case, check out
            | the DD and/or VoC Discords for lots of info, tips, help,
            | examples, and support.
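The venv setup mentioned above can be sketched in a few shell commands. This is a minimal sketch of isolating a local DD install; the fork's file and script names in the comments are illustrative assumptions, not its actual layout:

```shell
# Create an isolated environment so a DD fork's pinned dependencies
# don't clash with system Python packages.
python3 -m venv dd-env
. dd-env/bin/activate

# Inside the fork's checkout one would then run something like
# (illustrative names, not the fork's actual files):
#   pip install -r requirements.txt
#   python disco.py --config my_settings.yaml

# Confirm the interpreter now resolves inside the venv.
python -c 'import sys; print(sys.prefix)'
```

Tools like Visions of Chaos automate exactly this step, which is why they lower the barrier for local installs.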
        
             | Geee wrote:
              | Thanks for the info. Is it possible to do something like
              | transfer learning on top of existing models, or do you
              | train your own models from scratch? I'll check out that
              | Visions of Chaos thing. I'm just beginning my journey
              | into this generative art stuff and am basically just
              | trying to get it running right now.
        
         | Voloskaya wrote:
         | This version is a bit more optimized, and better packaged. Also
         | the model has been trained longer, so when the weights become
         | publicly available the resulting quality should be much higher.
        
       | PoignardAzur wrote:
       | > _Of course, with open access and the ability to run the model
       | on a widely available GPU, the opportunity for abuse increases
       | dramatically._
       | 
       | > _"A percentage of people are simply unpleasant and weird, but
       | that's humanity," Mostaque said. "Indeed, it is our belief this
       | technology will be prevalent, and the paternalistic and somewhat
       | condescending attitude of many AI aficionados is misguided in not
       | trusting society."_
       | 
       | Holy shit.
       | 
       | On the one hand, I'm super excited by this technology, and the
       | novel applications that will become possible with these open-
       | source models (stuff that would never be usable if Google and
       | OpenAI had a monopoly on image generation).
       | 
        | On the other hand, I really really _really_ hope Bostrom's
        | urn[0] has no black ball in it, because we as a society seem to
        | be rushing to extract as many balls as possible over
        | increasingly short timescales.
       | 
       | [0] https://nickbostrom.com/papers/vulnerable.pdf
        
         | mortenjorck wrote:
         | The length of the democratization cycle we're seeing - months
         | to weeks between a breakthrough model and a competent open-
         | source alternative that runs on commodity hardware - really
         | highlights the genie-stuffing posture of Google and OpenAI. All
          | the thoughtful, if highly paternalistic, guardrails they
          | build in amount to little more than fig leaves over the
          | possible applications they intend to close off.
         | 
         | I'm personally in the "AI risk is overstated" camp. But if I'm
         | wrong, all the top-down AI safety in the world is going to be
         | meaningless in the face of a global network of researchers,
         | enthusiasts, and tinkerers.
        
           | hedora wrote:
           | They claim the guardrails are for public good, but they're
           | pretty clearly using them to try to establish a competitive
           | moat.
           | 
           | It's similar to the "we don't sell personal information"
           | claim. Sure, but that's because they make money renting
           | malicious actors access to a black box that contains your
           | personal information. Selling the contents of the box would
           | reduce their overall revenue.
        
           | TulliusCicero wrote:
           | To me it seems like an obvious case of reputational risk
           | being much larger for more prominent organizations than for
           | smaller ones.
           | 
           | It makes sense for Google to wait for some startup to "go
           | first" in releasing a model largely without controls. That
           | way, some random startup takes the initial heat of "people
           | are using AI for bad things!!" headlines plastering tech
           | blogs. Then Google can do basically the same thing a little
           | bit later, and any attack pieces will sound old hat.
        
           | narrator wrote:
           | I was using AI Dungeon with the full power GPT-3 model before
           | they crippled it. That thing had a very uninhibited mind for
           | erotica! Imagine what would happen when that power comes to
           | image models!
        
         | Iv wrote:
         | People will generate creepy porn and fake pictures. Humanity
         | will survive this.
        
           | forty wrote:
           | Exactly, and there are many pros for humanity to this too:
           | people will be able to make funny pictures and things like
           | that, so it's not like it's a bad deal.
        
           | PoignardAzur wrote:
           | You're not addressing my broader point, though, just the
           | easy-to-snide-at version of my point.
           | 
           | Yes, it's pretty obvious that Dall-E and similar models won't
           | destroy humanity.
           | 
            | My point isn't that Dall-E is a black ball. My point is _we
            | better hope_ a black ball doesn't exist at all, because the
            | way this is going, if it exists, we _are_ going to pick it;
            | we clearly won't be able to stop ourselves.
           | 
            | (For the sake of discussion, we can imagine a black ball
            | could be "an ML model running on a laptop that can tell you
            | how to produce an undetectable ultra-transmissible deadly
            | virus from easily purchased items")
        
         | jackblemming wrote:
          | Yes, I feel much safer if OpenAI and Google are the sole
          | keepers of such technology. They have my and the public's
          | best interests at heart.
        
           | PoignardAzur wrote:
           | Let me put it this way: it's not great to live in a world
           | where the immense majority of nukes are controlled by Donald
           | Trump and Vladimir Putin.
           | 
           | But it's arguably better than living in a world where every
           | single citizen has a nuke.
           | 
            | (Though the potential for harm of diffusion models is far
            | below a nuke's; it's not "kill millions of people", it's
            | "produce cheap disinformation and very convincing fake
            | evidence to ruin someone's life")
        
             | nightski wrote:
             | I think that would be a tough argument to make (in regards
             | to image generation). The same could be said of just about
             | any computing technology. The problem is we lose out on a
             | lot of potential good.
             | 
              | Either way it doesn't matter: you can't control bits like
              | you can enriched uranium. It's just a matter of time. In
              | the grand scheme of things, OpenAI will be irrelevant.
        
           | geraldwhen wrote:
           | Is this satire?
        
             | pizza wrote:
             | Yes
        
               | ljlolel wrote:
               | Googlers and OpenAI legitimately believe this
        
         | CM30 wrote:
          | I don't see why this is incorrect. Ever since DALL-E and
          | Midjourney caught on, it seems like we've got more and more
          | people trying to 'filter out' incorrect uses of their
          | software, under the assumption that people cannot be trusted
          | to just use it for whatever they want.
         | 
         | And it depresses me, because well... imagine if other pieces of
         | tech were treated this way. If the internet or crypto or
         | computers or whatever were heavily limited/restricted so the
         | 'wrong people' couldn't use them for bad things. We'd consider
         | it ridiculous, yet it's somehow accepted for these image
         | generation systems.
        
           | dinosaurdynasty wrote:
           | Nuclear tech is treated this way (much stricter even).
        
             | CM30 wrote:
             | I think there might be at least a small difference between
             | nuclear tech and image generation, at least as far as the
             | effects that could happen if it goes wrong.
        
         | manquer wrote:
         | Wouldn't nuclear weapons or even plastics be a black ball
         | already?
         | 
          | Humanity is not homogeneous; we will always react to new
          | inventions or tools differently. Many will use them
          | positively, some won't. Short of weapons of mass destruction,
          | I am not sure anything else will destroy civilization itself.
        
           | ALittleLight wrote:
           | No, the black ball is a technology that, once invented,
           | humanity cannot survive. Nuclear weapons have been invented
           | and humanity is surviving. Same with plastics.
           | 
           | A black ball would be like - suppose nuclear weapons ignited
           | the atmosphere. We test the first nuke, it ignites the
           | atmosphere, a global fire storm consumes all breathable
           | oxygen, kills all plants and everyone on the surface and
           | everything else suffocates shortly after. Plastics aren't
           | even close to this level of harm.
        
             | neuronic wrote:
             | > Same with plastics
             | 
             | Plastics are playing the long game. They have to turn into
             | micro- and nanoplastics first and may then enact undesired,
             | unforeseen biological functions, just like BPA [1].
             | 
             | Not even talking about weaponizing this stuff...
             | 
             | [1] https://pubmed.ncbi.nlm.nih.gov/21605673/
        
               | ALittleLight wrote:
               | It seems implausible to me that plastics are going to
               | kill all humanity. If the paper you linked makes that
               | case, then I will read it, but I didn't get that from
               | skimming the abstract.
        
             | michaelt wrote:
             | I guess people are getting confused because in terms of
             | risk-of-destroying-humanity, nuclear weapons seem higher
             | risk than DALL-E.
        
               | PoignardAzur wrote:
               | Nuclear weapons alone aren't an existential risk. There
               | are far fewer nuclear weapons in the world than there are
               | major cities, among other things.
               | 
               | Dall-E isn't an x-risk, but an advanced AI might be
               | (though _a lot_ of people have their opinion on that
               | part).
        
             | tablespoon wrote:
             | >> Wouldn't nuclear weapons or even plastics be a black
             | ball already?
             | 
             | > No, the black ball is a technology that, once invented,
             | humanity cannot survive. Nuclear weapons have been invented
             | and humanity is surviving. Same with plastics.
             | 
              | I don't think that definition is a good one.
              | Technological civilization [1] has survived nuclear
              | weapons for ~80 years, but there's no guarantee it will
              | survive them for another 80 years, let alone forever.
              | These "black balls" should be thought of like time bombs,
              | with at least two variables: how much destruction one
              | will cause when it goes off _AND_ the delay before that
              | happens. We shouldn't confuse a dangerous technology with
              | a long delay time for a safe technology. My intuition
              | tells me that there will probably be nuclear war at some
              | point over the next 1,000+ years.
             | 
              | [1] I don't think nuclear weapons can make humanity
              | extinct, so long as there are still little
              | poorly-connected subsistence communities in remote areas.
              | However, if The Market manages to extend its tentacles
              | into every human community, we're probably fucked.
        
               | ALittleLight wrote:
                | The term comes from Nick Bostrom's article on The
                | Vulnerable World [1]. In it he defines a black ball
                | like this: "a black ball: a technology that invariably
                | or by default destroys the civilization that invents
                | it". Nuclear weapons don't invariably destroy
                | civilization, because we can imagine that we keep using
                | them as is - that's not impossible. Also, Bostrom
                | considers nuclear weapons explicitly and calls them a
                | gray ball.
               | 
               | 1 - https://nickbostrom.com/papers/vulnerable.pdf
        
         | krono wrote:
         | Either we equalise chaos or we reduce chaos, there exist no
         | other options for entropy incarnate.
        
         | sinenomine wrote:
          | What if the black ball was a red herring all along, and the
          | usual suspect tech-CEO hands rushing to control said crystal
          | ball are the real hazard?
        
       | dustingetz wrote:
        | Doesn't build on my Mac Studio due to a dependency whose Mac
        | version is two major versions behind.
        
       | upupandup wrote:
       | My friend wants to know when she can use this to generate porn,
       | are we close?
        
         | planetsprite wrote:
          | You'll need to train your own model, though I'm sure someone
          | will manage to crowdsource it; there's a very obvious
          | economic incentive.
        
         | Mountain_Skies wrote:
         | Is there a way to make money selling the model to people who
         | want to use it to make porn? If so, it will trickle down
         | relatively quickly. If not, it'll still eventually trickle down
         | but will take longer.
        
           | TigeriusKirk wrote:
           | Surely you could sell custom prompt runs for porn for a great
           | deal more than OpenAI is charging for generalist custom
           | prompts.
           | 
           | Making money at it should be easy, and places like PornHub
           | wouldn't care about any outrage. The real challenge would be
           | limiting criminal and civil liability, at least to my not-in-
           | the-business thinking.
        
           | upupandup wrote:
            | This is already a thing in the kpop fake porn industry. I
            | don't know how Patreon/OnlyFans are allowing this to
            | happen; I mean, it's a travesty that highly suggestive
            | lyrics and stripper dance moves by scantily clad kpop idols
            | are being used for sexual gratification.
        
           | TaylorAlexander wrote:
           | Once training can be done on a beefy home rig folks will be
           | all over it.
        
         | _blop wrote:
         | This article hints that Stable Diffusion can at least generate
         | normal looking nude women:
         | https://techcrunch.com/2022/08/12/a-startup-wants-to-democra...
         | 
         | There are attempts to gather porn images and train or fine-tune
         | existing networks on it, here's a recent attempt by an art
         | student mentioned in the article above (NSFW!!):
         | https://www.vice.com/en/article/m7ggqq/this-furry-porn-ai-ge...
        
           | isoprophlex wrote:
           | Jesus H Christ those are some seriously cursed hindquarters
        
             | SV_BubbleTime wrote:
             | I was a better person this morning for not knowing that
             | furries had the term "hindquarters". I mean, that's fine
             | for other people, you do you, but for me, I was better this
             | morning.
        
         | isoprophlex wrote:
          | You'll have to train it on your own data. As others have
          | mentioned, the training data for Dall-E, Stable Diffusion,
          | etc. has been cleaned prior to training.
         | 
         | However, if it is possible to re-start the training process
         | from the weights of a non-sexually aware model, this finetuning
         | might not take all that long..!
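The "restart from the weights of an existing model" idea above is ordinary fine-tuning. A toy numpy sketch under obvious simplifications (a frozen random "backbone" and a single trainable linear head; nothing here is actual diffusion-model training code, just the freeze-and-retrain pattern):

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are downloaded pretrained weights: a feature extractor
# ("backbone") that stays frozen, and a head we retrain on new data.
W_backbone = rng.normal(size=(8, 4))   # frozen during fine-tuning
w_head = rng.normal(size=4)            # the only trainable part

def features(x):
    # Frozen forward pass through the "backbone".
    return np.tanh(x @ W_backbone)

# A small new dataset for the target task.
X = rng.normal(size=(64, 8))
y = (features(X) @ rng.normal(size=4) > 0).astype(float)

def loss(w):
    # Binary cross-entropy of a logistic head on frozen features.
    p = 1 / (1 + np.exp(-(features(X) @ w)))
    return -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

start = loss(w_head)
for _ in range(200):                   # gradient descent on the head only
    p = 1 / (1 + np.exp(-(features(X) @ w_head)))
    grad = features(X).T @ (p - y) / len(y)
    w_head -= 0.5 * grad
```

Because only the small head is updated, this kind of run needs far less compute than training from scratch, which is why it "might not take all that long".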
        
         | GaggiX wrote:
         | It is possible to generate nudes, but not pornographic ones.
        
         | TulliusCicero wrote:
         | Props for just coming out and saying it
        
         | Voloskaya wrote:
          | The data used to train these models is specifically filtered
          | to remove sexual content, so the model can't generate porn:
          | it has no idea what it looks like, beyond a few samples that
          | made it past the filter.
         | 
         | So no, your "friend" can't use it for that.
        
           | [deleted]
        
           | upupandup wrote:
            | Why is it that sexual content is so frowned upon in this
            | space? If it were a content publishing platform, I would
            | understand that advertisers don't want that, but this is
            | literally dictating to people what is bad and good. I just
            | don't understand this puritan outrage over text-to-image
            | porn generation.
        
             | blowski wrote:
             | I'd guess that, for general purpose companies, it's an area
             | full of legal ambiguity and potential for media outrage, so
             | just not worth the risk. However, given the evidence of
             | human history, it's certain that someone with an appetite
             | for exploiting this niche will develop exactly that kind of
             | tool.
        
             | spywaregorilla wrote:
              | Because it's a lot more annoying for your innocuous
              | content to be rendered as porn when the AI happens to
              | interpret it that way than it is for you to be unable to
              | render your pervy desires intentionally.
              | 
              | A porn model should really be its own thing.
        
             | Voloskaya wrote:
             | Because you can't control what the model is going to output
             | in response to a query. The model is trained to respond in
             | a way that is aligned but there is no guarantee.
             | 
              | Since we certainly don't want to show generated images of
              | porn or violence to someone who didn't specifically ask
              | for them, the easiest way to ensure that's not going to
              | happen is to just not train on that kind of data in the
              | first place. The worst that can happen with a model
              | trained on "safe" images is that the image is irrelevant
              | or makes no sense, meaning you could deploy systems with
              | no human curator on the other end and nothing bad will
              | happen. You lose that ability as soon as you include
              | porn.
             | 
             | Also with techniques like in-painting, the potential for
             | misuse of a model trained on porn/violence would be pretty
             | terrifying.
             | 
              | So the benefits of training on porn seem very small
              | compared to the inconvenience. I don't think it has
              | anything to do with puritanism; it's just that if I am
              | the one putting dollars and time into training such a
              | model, I am certainly not going to take on the added
              | complexity and implications of dealing with porn just to
              | let a few people realize their fetishes, at the risk of
              | my entire model being undeployable because it's
              | outputting too much porn or violence.
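The filtering described in this subthread happens before training ever starts: each sample is scored and anything above a threshold never reaches the model. A minimal sketch with a stand-in scoring function (real pipelines score image/caption pairs with a trained classifier, not a keyword check; the names here are illustrative):

```python
# Stand-in for a trained NSFW classifier: returns a score in [0, 1].
# In real data pipelines this would be a learned model run over
# image/caption pairs, which is why a few samples still slip through.
def nsfw_score(caption: str) -> float:
    flagged = {"nsfw", "explicit"}
    return 1.0 if set(caption.lower().split()) & flagged else 0.0

def filter_training_set(samples, threshold=0.5):
    # Samples at or above the threshold never reach training, so the
    # model simply never learns what that content looks like.
    return [s for s in samples if nsfw_score(s) < threshold]

corpus = [
    "a watercolor of a lighthouse at dawn",
    "explicit scene",                      # dropped by the filter
    "a corgi wearing sunglasses",
]
clean = filter_training_set(corpus)
```

The trade-off debated above follows directly: filtering at the dataset stage is cheap and robust, but it removes the capability for everyone, not just for abusive queries.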
        
               | upupandup wrote:
               | > porn/violence would be pretty terrifying.
               | 
               | uh have you seen American/European mainstream
               | pornography? it's already pretty violent (ex. face
               | slapping, choking, kicking, extreme bdsm).
               | 
                | I just don't see why this stuff is allowed and
                | protected by the law (if it's not recorded and
                | published it's illegal) and then we are suddenly
                | concerned about what text can do.
                | 
                | Just one of the many double standards I see in Western
                | society.
        
               | alach11 wrote:
               | One example risk is someone using computer-generated
               | content to extort money, demand ransom, etc. The cheaper
               | and easier this becomes, the more likely it is to be
               | weaponized at scale.
        
               | upupandup wrote:
                | But wouldn't the ability to auto-generate blackmail
                | material mean the value of blackmail would fall? Just
                | from a supply and demand perspective, it makes sense
                | to me that deepfaked _kompromat_ would put a serious
                | discount on such material, especially if everybody
                | knows it could have been generated by an AI.
                | 
                | Someone like Trump would just shrug and say the pee
                | tapes are deepfaked. I don't think it's possible for
                | AI to bypass forensics either. So again, this
                | narrative that "deepfake blackmail" would be dangerous
                | makes no sense.
        
               | Voloskaya wrote:
               | > uh have you seen American/European mainstream
               | pornography? it's already pretty violent
               | 
               | That's not at all what I am talking about. What I am
               | saying is that such a model would give everyone the
               | ability to create extremely realistic fake images of
               | someone else within a sexual/violent context, in one
               | click, thanks to inpainting. This can become a
               | hate/blackmail machine very fast.
               | 
               | Even though Dalle-2 is not trained on violence/porn it
               | still forbids inpainting pictures with realistic faces
               | that have been uploaded by users to prevent abuse, so now
               | imagine the potential with a model trained on
               | porn/violence.
               | 
               | Someone is eventually going to do it, but back to your
               | initial question about why it's still not done yet, I
               | believe it's because most people would rather not be that
               | someone.
        
             | gs17 wrote:
             | I imagine a large part of it is that it could generate
             | photorealistic child porn (also "deepfake" porn of real
             | people) and there's not really a good way to prevent it
             | entirely while also allowing generalized sexual content
             | AFAIK. There's probably some debate on how big a problem
             | this really is, but no one wants their system to be the one
             | with news stories about how it's popular with pedophiles.
             | It was the issue they had with AI Dungeon.
        
               | pixl97 wrote:
                | Correct me if I'm wrong, but in many countries even
                | simulated child porn is illegal. A model spitting that
                | out could be legally problematic.
        
               | tnzk wrote:
                | Do they also remove certain political or religious
                | ideas which are considered illegal somewhere?
        
             | djbebs wrote:
              | Because the law makes it very difficult to provide such
              | services, in the spirit of preventing the exploitation
              | of minors.
              | 
              | Make no mistake, this is indirectly a legal hurdle.
        
         | GistNoesis wrote:
         | I did a show HN about this
         | https://news.ycombinator.com/item?id=31900095 a month ago, to
         | experiment with the technology. The training was done in a
         | week-end only, with 2 old gpus (1080ti).
         | 
          | Currently waiting to scale up to improve quality, mainly for
          | economic reasons: I'm not quite sure I could recoup the
          | training costs yet, even more so if I go with cloud training.
          | 
          | NVidia will release the 4090 in September, and Ethereum may
          | do "the merge", which will make GPUs useless for mining, so
          | GPU prices could come down and I could update my home
          | cluster with affordable 3090s. (But electricity prices are
          | also up.)
          | 
          | Also, there are new algorithms every month, like stable
          | diffusion, that obsolete your previous training.
          | 
          | The video generation cost is probably still too expensive
          | compared to just paying a cam girl in a low-wage country,
          | but it will probably go down soon.
          | 
          | This data is also sensitive and plagued with copyright
          | issues, so it's quite troublesome to legally share training
          | datasets to share costs.
          | 
          | Custom dataset creation with text descriptions also has its
          | own challenges, so it's probably a better idea to adapt the
          | algorithm to the currently available data to keep costs low.
          | 
          | Finally, once someone releases a model, within a month there
          | will be at least 3 clones.
          | 
          | There is also the problem of finding an adult-friendly
          | payment processor.
         | 
         | And the multitude of potential legal issues.
         | 
         | But it's probably inevitable.
        
       | colordrops wrote:
       | Yet another "open" model that isn't open. We shall see if they
       | actually do release to the public. We keep seeing promises from
       | various orgs but it never pans out.
        
         | TulliusCicero wrote:
          | Their plan seems less hand-wavy; they're being explicit with
          | "first we release it like this, then like that, then freely
          | to everyone".
          | 
          | You're right that they could always change their minds, and
          | that would suck, but so far they seem to be up front.
        
       | dang wrote:
       | Recent and related:
       | 
       |  _Stable Diffusion launch announcement_ -
       | https://news.ycombinator.com/item?id=32414811 - Aug 2022 (37
       | comments)
        
       | 999900000999 wrote:
        | Has anyone made a pixel art generator that can create
        | animation sprites?
        
         | gxqoz wrote:
         | You can use DALL-E and other models to make pixel art ("as
         | pixel art"), although it can both be overkill and hard to get
         | consistent results that you'd put into animation. I'm guessing
         | that starting from more of a video model and then converting to
         | pixel art could be better. Although it's also non-trivial to
         | turn "realistic" video into convincing animation.
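The "converting to pixel art" step described above is mostly downscaling plus snapping each pixel to a small palette. A minimal numpy sketch (the palette, output size, and function name are arbitrary illustrative choices, not anything DALL-E or a video model does internally):

```python
import numpy as np

def to_pixel_art(img, out_size=32, palette=None):
    """Downscale an HxWx3 float image and snap colors to a small palette."""
    if palette is None:
        # A tiny arbitrary 4-color palette: black, white, red, blue.
        palette = np.array(
            [[0, 0, 0], [1, 1, 1], [1, 0, 0], [0, 0, 1]], dtype=float)
    h, w, _ = img.shape
    # Nearest-neighbor downscale by index sampling (no interpolation),
    # which keeps hard edges instead of blurring them.
    ys = np.arange(out_size) * h // out_size
    xs = np.arange(out_size) * w // out_size
    small = img[ys][:, xs]
    # Snap each pixel to the nearest palette color (Euclidean distance).
    dists = ((small[..., None, :] - palette) ** 2).sum(-1)
    return palette[dists.argmin(-1)]

art = to_pixel_art(np.random.default_rng(1).random((256, 256, 3)))
```

This post-processing is the easy part; as the comments note, getting frame-to-frame consistency for animation out of a generative model is the hard part.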
        
           | 999900000999 wrote:
            | I'd pay good money for a specialized machine learning
            | algorithm that can take a pixel art character and then
            | generate all the animated sprites for it.
            | 
            | I actually tried to get Dall-E to do this, and it made
            | like three good sprites; the rest were just broken. But it
            | was so strange, because you could see the output was still
            | organized as a sprite sheet, it's just that the sprites
            | were useless.
            | 
            | I think the practical applications of this technology will
            | be hyper-specialized models for specific purposes.
        
         | 0xdead1eaf wrote:
         | Check out NUWA-Infinity[0][1], submitted to arxiv jul 20, 2022.
         | It captures artistic style very well (though can't speak to the
         | quality of the pixel art it would generate) and can do image to
         | video.
         | 
         | [0] https://nuwa-infinity.microsoft.com/#/ [1]
         | https://arxiv.org/abs/2207.09814
        
         | tckerr wrote:
        
         | _w1kke_ wrote:
         | @KaliYuga did - she got hired by StabilityAI just a few days
         | ago. Here is a link to the Pixel Art Diffusion notebook:
         | 
         | https://colab.research.google.com/github/KaliYuga-ai/Pixel-A...
        
       | stuckinhell wrote:
       | This is pretty amazing. Anyone have any tips on building a PC for
       | machine learning with a RAID device?
        
         | cellis wrote:
         | Look into building an ethereum mining machine... it can double
         | as an ML workstation. That's what I did.
        
         | hedora wrote:
          | If you just want to try it out, consider using a remote CAD
          | workstation from a company like Paperspace.
         | 
         | (No affiliation.)
        
         | jessfyi wrote:
          | Hasn't been updated since 2020, but Tim Dettmers's guide [0] is
          | pretty much the gold standard for deciding what to buy for
          | which area of DL/ML you're interested in. The pricing has
          | changed thanks to GPU prices coming back down to earth a bit,
          | but what to look out for and how much RAM you need for which
          | task hasn't. Check out the "TL;DR advice" section, then scroll
          | back up for detailed info on _why_ and common misconceptions.
          | For tips on a RAID/NAS setup alongside it, just head to the
          | datahoarders subreddit and their FAQ.
         | 
         | [0] https://timdettmers.com/2020/09/07/which-gpu-for-deep-
         | learni...
        
       | fswd wrote:
       | Unfortunately it's a commercial license and the model isn't
       | available to the public so it isn't very useful.
        
         | AgentME wrote:
         | Isn't that just temporary until the public release? Or is the
         | article misleading by calling it open source?
        
         | andybak wrote:
         | It's going to be MIT from what I have heard. On phone atm so
         | can't provide sources.
        
           | fswd wrote:
           | https://stability.ai/research-access-form
           | 
            | So far I haven't gotten a response, but it's a Monday.
           | 
           | Here's the license
           | 
           | https://github.com/CompVis/stable-
           | diffusion/blob/main/LICENS...
        
             | andybak wrote:
             | That's just a restricted interim release. The proper public
             | release isn't ready yet. No timescale but sounds like
             | days/weeks rather than months/years.
        
               | fswd wrote:
               | like OpenAI?
        
               | andybak wrote:
                | Not sure I follow. OpenAI aren't claiming they are going
                | to release their model at all. The team behind Stable
                | Diffusion have so far kept every promise they've made.
                | 
                | (And if you're insinuating something, just come out and
                | say it so people can engage appropriately.)
        
       | ruuda wrote:
       | The site shows a notification in German, after the first
       | paragraph, saying I need to enable JavaScript to use the site.
       | But then after that is the full article, including images, which
       | would be almost perfectly readable, except it's at 5% opacity (or
       | maybe the JavaScript popup is overlaid on the article at 95%
       | opacity), which makes it impossible to read. :'(
        
         | [deleted]
        
       | thorum wrote:
       | If you want to see more examples of what this AI is capable of,
       | check out the subreddit:
       | 
       | https://reddit.com/r/stablediffusion
        
       | unethical_ban wrote:
       | Wait, so the closed source generator known as DALL-E is owned by
       | a company called OpenAI?
        
         | alephxyz wrote:
         | It's a bit of a dead horse at this point but yes. See the
         | previous discussion:
         | https://news.ycombinator.com/item?id=28416997
        
         | fariszr wrote:
         | As Elon said "OpenAI should be more Open IMO"
        
           | glenneroo wrote:
            | Curiously, it seemed to lock down even more after they
            | "partnered" with Microsoft.
        
       | keepquestioning wrote:
       | We are heading into uncharted territory :(
        
       | belltaco wrote:
       | The article says it needs 5.1 GB of graphics RAM.
       | 
       | Does anyone know how much data download and disk storage it
       | needs?
        
         | luismmolina wrote:
          | If you read directly from the site, the minimum requirement
          | for the graphics card is 10 GB of VRAM. Because it runs
          | locally, you don't need to download anything apart from the
          | initial model; the same goes for disk space.
        
         | _blop wrote:
          | The v1.3 model weighs in at 4.3 GB. There's an additional
          | one-time download of 1.6 GB of other models on first startup,
          | due to the usage of Hugging Face's transformers. And the conda
          | env takes another 6 GB due to PyTorch and CUDA.
          | 
          | Larger images will require (much) more than 5.1 GB. In my
          | case, a target resolution of 768x384 (landscape) with a batch
          | size of 1 will max out my 12 GB card, an RTX 3080 Ti.
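Why memory climbs so fast with resolution can be sketched with a rough heuristic (my own back-of-the-envelope assumption, not a figure from this thread): Stable Diffusion's UNet operates on latents downsampled 8x from the target image, and self-attention memory grows roughly with the square of the latent pixel count.

```python
# Rough sketch (assumption): UNet VRAM is dominated by self-attention
# over latent positions, which scales ~quadratically with their count.
# The 8x downsampling matches Stable Diffusion's latent space; the
# quadratic model is only a heuristic for attention-dominated memory.

def relative_vram(width: int, height: int, base=(512, 512)) -> float:
    """Rough VRAM multiplier vs. a base resolution."""
    latent = (width // 8) * (height // 8)
    base_latent = (base[0] // 8) * (base[1] // 8)
    return (latent / base_latent) ** 2

print(relative_vram(768, 384))    # modest bump over 512x512
print(relative_vram(1024, 1024))  # 16x the attention memory
```

Under this heuristic, doubling both dimensions multiplies attention memory by 16, which is consistent with 12 GB cards choking well before print-size targets.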
        
           | mdorazio wrote:
           | I think this is a good time to ask if anyone is working on
           | parallelizing machine learning compute anymore? For at-home
           | computation like this it seems like it would be a lot better
           | to allow people to stack a few cheaper GPUs rather than
           | having to pony up thousands of dollars for ML-oriented beast
           | cards to be able to do things like generate large images.
        
             | andybak wrote:
             | AI upscaling will solve everything ;)
             | 
              | I've generated some remarkably good-looking
              | print-quality images by upscaling 512x512 sources.
        
               | bambax wrote:
               | What do you use for upscaling? Standard software like
               | Photoshop or Affinity, etc. or more dedicated software?
               | Any recommendations for options, etc.?
        
               | andybak wrote:
               | Recently impressed by
               | https://replicate.com/nightmareai/latent-sr but otherwise
               | - Cupscale
        
               | andybak wrote:
               | Here's a particularly impressive result I got from the
               | former (considering it's not especially optimized for
               | "vector art" enlargement): https://twitter.com/andybak/st
               | atus/1558737805546749953?s=20&...
        
               | glenneroo wrote:
                | For videos in particular, if you don't mind shelling out
                | cash, the current go-to for AI animation nerds (at least
                | according to various AI discord servers I'm on) is the
                | Topaz upscaler. There are free alternatives, but I've
                | yet to see any of them work as well as Topaz, though I'm
                | sure that will change soon. For interpolating frames,
                | Flowframes is "free" (new features if you join the
                | Patreon) and is IMO very good.
               | 
               | I've seen a number of 80s/90s VHS recordings of concerts
               | being uploaded to YouTube in 4K (using Topaz) and they
               | look like they were recorded that way, truly amazing. I
               | do hear it can be a bit of work though getting the
               | settings right.
        
       | vanadium1st wrote:
       | Stable Diffusion is mind-blowingly good at some things. If you
       | are looking for modern artistic illustrations (like the stuff
       | you would find on the front page of ArtStation), it's state of
       | the art: better, in my opinion, than DALL-E 2 and Midjourney.
       | 
       | But the interesting thing is that while it is so good at
       | producing detailed artworks and matching the styles of popular
       | artists, it's surprisingly weak at other things, like
       | interpreting complex original prompts. We've all seen the meme
       | pictures made in Craiyon (previously DALL-E mini) of
       | Photoshop-collage-like visual jokes. Stable Diffusion, with all
       | its sophistication, is much worse at those and struggles to
       | interpret a lot of prompts that the free and public Craiyon
       | handles well. The compositions are worse; it misses a lot of
       | requested objects or even misses the idea entirely.
       | 
       | Also, as good as it is at complex artistic illustrations, it is
       | equally bad at minimalistic and simple ones, like logos and
       | icons. I am a logo designer and am already using AI a lot to
       | produce sketches and ideas for commercial logos, and right now
       | the free and publicly available Craiyon is head and shoulders
       | better at that than Stable Diffusion.
       | 
       | Maybe in the future we will have a universal winner AI that is
       | the best at any style of picture you can imagine. But right now
       | we have an interesting competition where different AIs have
       | surprising strengths and weaknesses, and there's a lot of reason
       | to try them all.
        
         | russdill wrote:
         | Just think where we'll be two more papers down the line
        
           | elil17 wrote:
            | For those unaware, this is a catchphrase of Dr. Károly
            | Zsolnai-Fehér from the absolutely wonderful YouTube channel
            | "Two Minute Papers", which focuses on advances in computer
            | graphics and AI.
        
             | pizza wrote:
             | > Now _squeeeze_ those papers!
        
             | PoignardAzur wrote:
              | Random rant: it feels like over time Two Minute Papers
              | has started to lean more and more into its catchphrases
              | and gimmicks, while the density of interesting content
              | keeps decreasing.
              | 
              | The whole "we're all fellow scholars here" bit feels like
              | I'm watching a kid's show about science popularization,
              | patting me on the head for being here.
              | 
              | "Look how smart you are, we're doing science!"
              | 
              | I dunno. I like the channel for what it is (a pop-science
              | newsletter for cool ML developments) but sometimes the
              | author feels really patronizing / full of himself.
        
         | andybak wrote:
          | I have a penchant for making technically "bad" or heavily
          | stylized photos, and Stable Diffusion is pretty poor at
          | those. There's very little good bokeh or tilt-shift stuff,
          | and CCTV/trailcam doesn't come out too well.
          | 
          | In fact, DALL-E isn't as impressive for some styles as
          | "older" models (JAX/Latent Diffusion etc.)
        
         | babypuncher wrote:
         | That is really a shame, because all I really want is a version
         | of Craiyon that I can modify and run on my own hardware.
         | 
         | The amount of enjoyment I have derived from playing with
         | Craiyon over the last two months is ridiculous.
        
           | culi wrote:
           | Have you checked out MidJourney? Makes Craiyon look like
           | crayons :P
        
             | glenneroo wrote:
             | Craiyon is free, whereas Midjourney is not. If you want MJ
             | level quality, check out Disco Diffusion or go straight to
             | Visions of Chaos, which runs just about every AI diffusion
             | script in existence. The dev is very active and adds new
             | features every couple days, such as recently the ability to
             | train your own diffusion models, which I've been doing the
             | last 3 days nonstop on my little 3060 Ti (8GB VRAM, which
             | is barely sufficient to run at mostly default settings).
        
               | culi wrote:
                | MidJourney does give you 25 minutes of free compute
                | time though, which is enough to try it at least ~40
                | times.
                | 
                | I've checked out Disco Diffusion but hadn't heard of
                | Visions of Chaos, thanks. The biggest shortcoming of DD
                | is that there's simply not yet a sufficiently trained
                | model to produce stuff at the level of MidJourney or
                | Craiyon.
        
         | GaggiX wrote:
          | It's surprisingly weak at interpreting complex original
          | prompts because the model is really small; the text encoder
          | is just 183M parameters. Craiyon is much larger.
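How encoder size is dominated by depth and width can be sketched with a generic transformer parameter count (the layer/width/vocab numbers below are illustrative CLIP-style assumptions, not the exact configs of either model):

```python
# Hedged sketch (illustrative numbers, not actual model configs): a
# transformer text encoder's parameter count is roughly its embedding
# tables plus, per layer, attention weights (4*d^2) and a 4x-expansion
# MLP (8*d^2), ignoring biases and layer norms.

def encoder_params(layers: int, d: int, vocab: int, ctx: int) -> int:
    embeddings = vocab * d + ctx * d   # token + position tables
    per_layer = 4 * d * d + 8 * d * d  # attention + MLP weight matrices
    return embeddings + layers * per_layer

# CLIP-style illustrative config: 12 layers, width 768, ~49k vocab,
# 77-token context -> roughly 123M parameters.
print(round(encoder_params(12, 768, 49408, 77) / 1e6))  # 123
```

Since the per-layer term grows with d squared, even a modest bump in width or depth adds far more capacity for prompt understanding than the embedding tables do.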
        
       ___________________________________________________________________
       (page generated 2022-08-15 23:01 UTC)