[HN Gopher] Open-source rival for OpenAI's DALL-E runs on your g...
___________________________________________________________________
Open-source rival for OpenAI's DALL-E runs on your graphics card
Author : hardmaru
Score : 215 points
Date : 2022-08-15 15:49 UTC (7 hours ago)
(HTM) web link (mixed-news.com)
(TXT) w3m dump (mixed-news.com)
| humanistbot wrote:
| If anyone from Stability is reading, the confirmation e-mail to
| sign up is sending a broken link:
|
| "We couldn't process your request at this time. Please try again
| later. If you are seeing this message repeatedly, please contact
| Support with the following information:
|
| ip: XXXX
|
| date: Mon Aug 15 2022 XX:XX:XX GMT-0700 (Pacific Daylight Time)
|
| url: https://stability.us18.list-manage.com/subscribe/confirm"
| wccrawford wrote:
| It worked for me just now, so maybe it was temporary, or they
| already fixed it?
| hifikuno wrote:
| I also had this response.
| kgc wrote:
| Does this work on Apple silicon processors? They have plenty of
| RAM accessible to the GPU.
| sroussey wrote:
| The article says it will, but that it is not using the GPU
| unfortunately.
| axg11 wrote:
| I'm excited for the coming race to improve and miniaturise this
| tech. Apple has a great track record of making ML models light
| enough to run locally. There will come a day when photorealistic
| image generation can run on an iPhone.
| andrewacove wrote:
| Maybe this is their long term plan for getting rid of the
| camera bump.
| humanistbot wrote:
| Can someone tell me how this compares to the guide and repo
| shared a few days ago on HN:
| https://news.ycombinator.com/item?id=32384646
| Geee wrote:
| There's also Disco Diffusion:
| https://www.reddit.com/r/DiscoDiffusion/
|
| Not sure how they compare. DD seems to be quite popular. I'm
| currently setting up DD locally.
| glenneroo wrote:
| I've been running DD for a few months now... I tend to just
| edit the python script or use e.g. entmike's fork which can
| read config files to make changes to the 50+ parameters
| (basically everything is better than having to use Jupyter
| Notebooks IMO). If you don't have a GPU with 6+ GB of
| VRAM, you can often get a decent enough GPU for free from
| Google Colab. For running locally, I can also highly
| recommend Visions of Chaos, which includes multiple
| versions/forks of Disco Diffusion, as well as a ton of other
| latent diffusion scripts, not to mention many many many other
| generation features such as fractals and even music. They
| also recently added the ability to train your own diffusion
| models which I've been doing the last few days using
| thousands of my own photographs. It also has a pretty nice
| GUI and the dev is extremely responsive on Discord. Also
| after you do the setup for VoC it handles running all the
| python venv setup stuff otherwise necessary with local DD
| installs. In any case, check out DD Discord and/or VoC
| discord for lots of info, tips, help, examples, and support.
| Geee wrote:
| Thanks for the info. Is it possible to do something like
| transfer learning on top of existing models, or do you train
| your own models from scratch? I'll check out that Visions of
| Chaos thing. I'm just beginning my journey into this
| generative art stuff and just basically trying to get this
| running right now.
| Voloskaya wrote:
| This version is a bit more optimized, and better packaged. Also
| the model has been trained longer, so when the weights become
| publicly available the resulting quality should be much higher.
| PoignardAzur wrote:
| > _Of course, with open access and the ability to run the model
| on a widely available GPU, the opportunity for abuse increases
| dramatically._
|
| > _"A percentage of people are simply unpleasant and weird, but
| that's humanity," Mostaque said. "Indeed, it is our belief this
| technology will be prevalent, and the paternalistic and somewhat
| condescending attitude of many AI aficionados is misguided in not
| trusting society."_
|
| Holy shit.
|
| On the one hand, I'm super excited by this technology, and the
| novel applications that will become possible with these open-
| source models (stuff that would never be usable if Google and
| OpenAI had a monopoly on image generation).
|
| On the other hand, I really really _really_ hope Bostrom's
| urn[0] has no black ball in it, because we as a society seem to
| be rushing to extract as many balls as possible over increasingly
| short timescales.
|
| [0] https://nickbostrom.com/papers/vulnerable.pdf
| mortenjorck wrote:
| The length of the democratization cycle we're seeing - months
| to weeks between a breakthrough model and a competent open-
| source alternative that runs on commodity hardware - really
| highlights the genie-stuffing posture of Google and OpenAI. All
| the thoughtful, if highly paternalistic guardrails they build
| in amount to little more than fig leaves over the possible
| applications they intend to close off.
|
| I'm personally in the "AI risk is overstated" camp. But if I'm
| wrong, all the top-down AI safety in the world is going to be
| meaningless in the face of a global network of researchers,
| enthusiasts, and tinkerers.
| hedora wrote:
| They claim the guardrails are for public good, but they're
| pretty clearly using them to try to establish a competitive
| moat.
|
| It's similar to the "we don't sell personal information"
| claim. Sure, but that's because they make money renting
| malicious actors access to a black box that contains your
| personal information. Selling the contents of the box would
| reduce their overall revenue.
| TulliusCicero wrote:
| To me it seems like an obvious case of reputational risk
| being much larger for more prominent organizations than for
| smaller ones.
|
| It makes sense for Google to wait for some startup to "go
| first" in releasing a model largely without controls. That
| way, some random startup takes the initial heat of "people
| are using AI for bad things!!" headlines plastering tech
| blogs. Then Google can do basically the same thing a little
| bit later, and any attack pieces will sound old hat.
| narrator wrote:
| I was using AI Dungeon with the full power GPT-3 model before
| they crippled it. That thing had a very uninhibited mind for
| erotica! Imagine what would happen when that power comes to
| image models!
| Iv wrote:
| People will generate creepy porn and fake pictures. Humanity
| will survive this.
| forty wrote:
| Exactly, and there are many pros for humanity to this too:
| people will be able to make funny pictures and things like
| that, so it's not like it's a bad deal.
| PoignardAzur wrote:
| You're not addressing my broader point, though, just the
| easy-to-snide-at version of my point.
|
| Yes, it's pretty obvious that Dall-E and similar models won't
| destroy humanity.
|
| My point isn't that Dall-E is a black ball. My point is _we
| better hope_ a black ball doesn't exist at all, because the
| way this is going, if it exists, we _are_ going to pick it,
| we clearly won't be able to stop ourselves.
|
| (For the sake of discussion, we can imagine a black ball could
| be "a ML model running on a laptop that can tell you how to
| produce an undetectable ultra-transmissible deadly virus from
| easily purchased items")
| jackblemming wrote:
| Yes, I feel much safer if OpenAI and Google are the sole
| keepers of such technology. They have my and the public's best
| interest at heart.
| PoignardAzur wrote:
| Let me put it this way: it's not great to live in a world
| where the immense majority of nukes are controlled by Donald
| Trump and Vladimir Putin.
|
| But it's arguably better than living in a world where every
| single citizen has a nuke.
|
| (Though the potential for harm of diffusion models is far
| below nukes; it's not "kill millions of people", it's "produce
| cheap disinformation and very convincing fake evidence to
| ruin someone's life")
| nightski wrote:
| I think that would be a tough argument to make (in regards
| to image generation). The same could be said of just about
| any computing technology. The problem is we lose out on a
| lot of potential good.
|
| Either way it doesn't matter, you can't control bits like
| you can enriched uranium. It's just a matter of time. In
| the grand scheme of things Open AI will be irrelevant.
| geraldwhen wrote:
| Is this satire?
| pizza wrote:
| Yes
| ljlolel wrote:
| Googlers and OpenAI legitimately believe this
| CM30 wrote:
| I don't see why this is incorrect. It seems like ever since
| DALL-E and Midjourney caught on, we've got more and more
| people trying to 'filter out' incorrect uses of their software
| under the assumption people cannot be trusted to just use it
| for whatever they want.
|
| And it depresses me, because well... imagine if other pieces of
| tech were treated this way. If the internet or crypto or
| computers or whatever were heavily limited/restricted so the
| 'wrong people' couldn't use them for bad things. We'd consider
| it ridiculous, yet it's somehow accepted for these image
| generation systems.
| dinosaurdynasty wrote:
| Nuclear tech is treated this way (much stricter even).
| CM30 wrote:
| I think there might be at least a small difference between
| nuclear tech and image generation, at least as far as the
| effects that could happen if it goes wrong.
| manquer wrote:
| Wouldn't nuclear weapons or even plastics be a black ball
| already?
|
| Humanity is not homogeneous; we will always react to new
| inventions or tools differently. Many will use them positively,
| some won't. Short of weapons of mass destruction I am not sure
| anything else will destroy civilization itself.
| ALittleLight wrote:
| No, the black ball is a technology that, once invented,
| humanity cannot survive. Nuclear weapons have been invented
| and humanity is surviving. Same with plastics.
|
| A black ball would be like - suppose nuclear weapons ignited
| the atmosphere. We test the first nuke, it ignites the
| atmosphere, a global fire storm consumes all breathable
| oxygen, kills all plants and everyone on the surface and
| everything else suffocates shortly after. Plastics aren't
| even close to this level of harm.
| neuronic wrote:
| > Same with plastics
|
| Plastics are playing the long game. They have to turn into
| micro- and nanoplastics first and may then enact undesired,
| unforeseen biological functions, just like BPA [1].
|
| Not even talking about weaponizing this stuff...
|
| [1] https://pubmed.ncbi.nlm.nih.gov/21605673/
| ALittleLight wrote:
| It seems implausible to me that plastics are going to
| kill all humanity. If the paper you linked makes that
| case, then I will read it, but I didn't get that from
| skimming the abstract.
| michaelt wrote:
| I guess people are getting confused because in terms of
| risk-of-destroying-humanity, nuclear weapons seem higher
| risk than DALL-E.
| PoignardAzur wrote:
| Nuclear weapons alone aren't an existential risk. There
| are far fewer nuclear weapons in the world than there are
| major cities, among other things.
|
| Dall-E isn't an x-risk, but an advanced AI might be
| (though _a lot_ of people have their opinion on that
| part).
| tablespoon wrote:
| >> Wouldn't nuclear weapons or even plastics be a black
| ball already?
|
| > No, the black ball is a technology that, once invented,
| humanity cannot survive. Nuclear weapons have been invented
| and humanity is surviving. Same with plastics.
|
| I don't think that definition is a good one. Technological
| civilization [1] has survived nuclear weapons for ~80
| years, but there's no guarantee it will survive them for
| another 80 years, let alone forever. It seems like these
| "black balls" should be thought of like time bombs; there
| are at least two variables: how much destruction it will
| cause when it goes off _AND_ the delay time before that
| happens. We shouldn't confuse a dangerous technology with
| a long delay time for a safe technology. My intuition tells
| me that there will probably be nuclear war at some point
| over the next 1,000+ years.
|
| [1] I don't think nuclear weapons can make humanity
| extinct, so long as there are still little poorly-connected
| subsistence communities in remote areas. However, if The
| Market manages to extend its tentacles into every human
| community, we're probably fucked.
| ALittleLight wrote:
| The term comes from Nick Bostrom's article on The
| Vulnerable World [1]. In it he defines a black ball like
| this "a black ball: a technology that invariably or by
| default destroys the civilization that invents it".
| Nuclear weapons don't invariably destroy civilization
| because we could just imagine that we keep using them as
| is - that's not impossible. Also, Bostrom considers
| nuclear weapons explicitly and calls them a gray ball.
|
| 1 - https://nickbostrom.com/papers/vulnerable.pdf
| krono wrote:
| Either we equalise chaos or we reduce chaos, there exist no
| other options for entropy incarnate.
| sinenomine wrote:
| What if the black ball was a red herring all along and the
| usual suspect tech-CEO's hand(s) rushing to control the said
| crystal ball are the real hazard?
| dustingetz wrote:
| doesn't build on my mac studio due to a dependency whose mac
| version is two major versions behind
| upupandup wrote:
| My friend wants to know when she can use this to generate porn,
| are we close?
| planetsprite wrote:
| You'll need to train your own model, though I'm sure someone
| will manage to crowdsource it; there's a very obvious economic
| incentive.
| Mountain_Skies wrote:
| Is there a way to make money selling the model to people who
| want to use it to make porn? If so, it will trickle down
| relatively quickly. If not, it'll still eventually trickle down
| but will take longer.
| TigeriusKirk wrote:
| Surely you could sell custom prompt runs for porn for a great
| deal more than OpenAI is charging for generalist custom
| prompts.
|
| Making money at it should be easy, and places like PornHub
| wouldn't care about any outrage. The real challenge would be
| limiting criminal and civil liability, at least to my not-in-
| the-business thinking.
| upupandup wrote:
| This is already a thing in the kpop fake porn industry. I
| don't know how Patreon/OnlyFans are allowing this to happen, I
| mean it's a travesty that highly suggestive lyrics and
| stripper dance moves in scantily clad kpop idols are being
| used for sexual gratification.
| TaylorAlexander wrote:
| Once training can be done on a beefy home rig folks will be
| all over it.
| _blop wrote:
| This article hints that Stable Diffusion can at least generate
| normal looking nude women:
| https://techcrunch.com/2022/08/12/a-startup-wants-to-democra...
|
| There are attempts to gather porn images and train or fine-tune
| existing networks on it, here's a recent attempt by an art
| student mentioned in the article above (NSFW!!):
| https://www.vice.com/en/article/m7ggqq/this-furry-porn-ai-ge...
| isoprophlex wrote:
| Jesus H Christ those are some seriously cursed hindquarters
| SV_BubbleTime wrote:
| I was a better person this morning for not knowing that
| furries had the term "hindquarters". I mean, that's fine
| for other people, you do you, but for me, I was better this
| morning.
| isoprophlex wrote:
| You'll have to train it on your own data. As others have
| mentioned, the training data for dall-e, stable diffusion etc
| has been cleaned prior to training.
|
| However, if it is possible to re-start the training process
| from the weights of a non-sexually aware model, this finetuning
| might not take all that long!
| GaggiX wrote:
| It is possible to generate nudes, but not pornographic ones.
| TulliusCicero wrote:
| Props for just coming out and saying it
| Voloskaya wrote:
| The data used to train those models is specifically filtered to
| remove sexual content, so the model can't generate porn because
| it has no idea what it looks like, beyond a few samples that
| made it past the filter.
|
| So no, your "friend" can't use it for that.
| [deleted]
| upupandup wrote:
| Why is it that sexual content is so frowned upon in this
| space? If it's a content publishing platform I would
| understand that advertisers don't want that, but this is
| literally dictating people what is bad and good. I just don't
| understand this Puritan outrage with text-to-image porn
| generation.
| blowski wrote:
| I'd guess that, for general purpose companies, it's an area
| full of legal ambiguity and potential for media outrage, so
| just not worth the risk. However, given the evidence of
| human history, it's certain that someone with an appetite
| for exploiting this niche will develop exactly that kind of
| tool.
| spywaregorilla wrote:
| Because it's a lot more annoying for your innocuous content
| to be rendered as porn when the ai happens to interpret it
| that way than it is for you to be unable to render your
| pervy desires intentionally.
|
| A porn model should really be its own thing.
| Voloskaya wrote:
| Because you can't control what the model is going to output
| in response to a query. The model is trained to respond in
| a way that is aligned but there is no guarantee.
|
| Since we certainly don't want to show generated image of
| porn or violence to someone that didn't specifically ask
| for that, the easiest way to ensure that's not going to
| happen is to just not train on that kind of data in the
| first place. The worst that can happen with a model trained
| on "safe" images is that the image is irrelevant or makes
| no sense, meaning you could deploy systems with no human
| curator on the other end, and nothing bad is going to
| happen. You lose that ability as soon as you integrate
| porn.
|
| Also with techniques like in-painting, the potential for
| misuse of a model trained on porn/violence would be pretty
| terrifying.
|
| So the benefits of training on porn seem very small
| compared to the inconvenience. I don't think it has anything
| to do with puritanism; it's just that if I am the one
| putting dollars and time into training such a model, I am
| certainly not going to take on the added complexity
| and implications of dealing with porn just to make a few
| people realize their fetishes at the risk of my entire
| model being undeployable because it's outputting too much
| porn or violence.
| upupandup wrote:
| > porn/violence would be pretty terrifying.
|
| uh have you seen American/European mainstream
| pornography? it's already pretty violent (ex. face
| slapping, choking, kicking, extreme bdsm).
|
| I just don't see why this stuff is allowed and protected
| by the law (if its not recorded and published its
| illegal) and then we are suddenly concerned about what
| text can do.
|
| Just one of the many double standards I see in Western
| society.
| alach11 wrote:
| One example risk is someone using computer-generated
| content to extort money, demand ransom, etc. The cheaper
| and easier this becomes, the more likely it is to be
| weaponized at scale.
| upupandup wrote:
| but wouldn't the ability to auto-generate blackmail
| material mean the value of blackmail would fall? Just
| from a supply and demand perspective, it makes sense to
| me why a deepfaked _kompromat_ would put serious discount
| on such material especially if everybody knows it was
| generated by an AI.
|
| Someone like Trump would just shrug and say the pee tapes
| are deepfaked. I don't think it's possible for AI to
| bypass forensics either. So again this narrative that
| "deepfake blackmail" would be dangerous makes no sense.
| Voloskaya wrote:
| > uh have you seen American/European mainstream
| pornography? it's already pretty violent
|
| That's not at all what I am talking about. What I am
| saying is that such a model would give everyone the
| ability to create extremely realistic fake images of
| someone else within a sexual/violent context, in one
| click, thanks to inpainting. This can become a
| hate/blackmail machine very fast.
|
| Even though Dalle-2 is not trained on violence/porn it
| still forbids inpainting pictures with realistic faces
| that have been uploaded by users to prevent abuse, so now
| imagine the potential with a model trained on
| porn/violence.
|
| Someone is eventually going to do it, but back to your
| initial question about why it's still not done yet, I
| believe it's because most people would rather not be that
| someone.
| gs17 wrote:
| I imagine a large part of it is that it could generate
| photorealistic child porn (also "deepfake" porn of real
| people) and there's not really a good way to prevent it
| entirely while also allowing generalized sexual content
| AFAIK. There's probably some debate on how big a problem
| this really is, but no one wants their system to be the one
| with news stories about how it's popular with pedophiles.
| It was the issue they had with AI Dungeon.
| pixl97 wrote:
| Correct me if I'm wrong, but in many countries even
| simulated child porn is illegal. A model spitting that
| out could be legally problematic.
| tnzk wrote:
| Do they remove certain political or religious ideas which
| are considered illegal somewhere as well?
| djbebs wrote:
| Because the law makes it very difficult to provide such
| services in the spirit of preventing the exploitation of
| minors.
|
| Make no mistake, this is indirectly a legal hurdle.
| GistNoesis wrote:
| I did a show HN about this
| https://news.ycombinator.com/item?id=31900095 a month ago, to
| experiment with the technology. The training was done in a
| week-end only, with 2 old gpus (1080ti).
|
| Currently waiting to scale up to improve quality, mainly for
| economic reasons, not quite sure I could recoup the training
| costs yet. Even more so if I go with cloud training.
|
| NVidia will release the 4090 in September, and Ethereum may do
| "the merge", which will make GPUs useless for mining, so prices
| could drop and I can update my home cluster with affordable
| 3090s. (But electricity prices are also up.)
|
| Also there are new algorithms every month, like stable
| diffusion, that would obsolete your previous training.
|
| The video generation cost is probably still too expensive
| compared to just paying a cam girl in a low wage country. But
| it will probably go down soon.
|
| This is also sensitive data, plagued with copyright
| issues, so it's quite troublesome to legally share training
| datasets to share costs.
|
| It also has its own challenges with respect to custom dataset
| creation with text description, so it's probably a better idea
| to adapt the algorithm to the currently available data to keep
| the costs low.
|
| Finally once someone releases a model, in the next month there
| will be at least 3 clones.
|
| There is also the problem of finding an adult-friendly payment
| processor.
|
| And the multitude of potential legal issues.
|
| But it's probably inevitable.
| colordrops wrote:
| Yet another "open" model that isn't open. We shall see if they
| actually do release to the public. We keep seeing promises from
| various orgs but it never pans out.
| TulliusCicero wrote:
| Their plan seems less hand-wavy; they're being explicit with
| "first we release it like this, then like that, then freely to
| everyone".
|
| You're right that they could always change their minds and that
| would suck, but so far they seem to be up front.
| dang wrote:
| Recent and related:
|
| _Stable Diffusion launch announcement_ -
| https://news.ycombinator.com/item?id=32414811 - Aug 2022 (37
| comments)
| 999900000999 wrote:
| Has anyone made a pixel art generator that can create
| animation sprites?
| gxqoz wrote:
| You can use DALL-E and other models to make pixel art ("as
| pixel art"), although it can both be overkill and hard to get
| consistent results that you'd put into animation. I'm guessing
| that starting from more of a video model and then converting to
| pixel art could be better. Although it's also non-trivial to
| turn "realistic" video into convincing animation.
| 999900000999 wrote:
| I'd pay good money for a specialized machine learning algorithm
| that can take a pixel art character, and then generate all
| the animated sprites for it.
|
| I actually tried to get Dalle to do this, and it made like
| three good sprites; the rest were just broken. But it was
| so strange, because you could see it was still organized as a
| sprite sheet, it's just the sprites were useless.
|
| I think the practical applications of this technology will be
| hyper specialized models for specific purposes.
| 0xdead1eaf wrote:
| Check out NUWA-Infinity[0][1], submitted to arxiv jul 20, 2022.
| It captures artistic style very well (though can't speak to the
| quality of the pixel art it would generate) and can do image to
| video.
|
| [0] https://nuwa-infinity.microsoft.com/#/ [1]
| https://arxiv.org/abs/2207.09814
| tckerr wrote:
| _w1kke_ wrote:
| @KaliYuga did - she got hired by StabilityAI just a few days
| ago. Here is a link to the Pixel Art Diffusion notebook:
|
| https://colab.research.google.com/github/KaliYuga-ai/Pixel-A...
| stuckinhell wrote:
| This is pretty amazing. Anyone have any tips on building a PC for
| machine learning with a RAID device?
| cellis wrote:
| Look into building an ethereum mining machine... it can double
| as an ML workstation. That's what I did.
| hedora wrote:
| If you just want to try it out, consider using a remote CAD
| workstation from a company like paperspace.
|
| (No affiliation.)
| jessfyi wrote:
| Hasn't been updated since 2020, but Tim Dettmers' guide [0] is
| pretty much the gold standard for optimizing what to buy for
| which area of DL/ML you're interested in. The pricing has
| changed thanks to GPU prices coming back down to earth a bit,
| but what to look out for/how much RAM you need for which task
| hasn't. Check out the "TL;DR advice" section then scroll back
| up for detailed info on _why_ and common misconceptions. For
| tips on a RAID/NAS setup alongside it, just head to the
| datahoarders subreddit and their FAQ.
|
| [0] https://timdettmers.com/2020/09/07/which-gpu-for-deep-
| learni...
| fswd wrote:
| Unfortunately it's a commercial license and the model isn't
| available to the public so it isn't very useful.
| AgentME wrote:
| Isn't that just temporary until the public release? Or is the
| article misleading by calling it open source?
| andybak wrote:
| It's going to be MIT from what I have heard. On phone atm so
| can't provide sources.
| fswd wrote:
| https://stability.ai/research-access-form
|
| So far, I haven't got a response but it's a Monday
|
| Here's the license
|
| https://github.com/CompVis/stable-
| diffusion/blob/main/LICENS...
| andybak wrote:
| That's just a restricted interim release. The proper public
| release isn't ready yet. No timescale but sounds like
| days/weeks rather than months/years.
| fswd wrote:
| like OpenAI?
| andybak wrote:
| Not sure I follow. OpenAI are not claiming they are going
| to release their model at all. The team behind Stable
| Diffusion have so far kept every promise they've made.
|
| (And if you're insinuating something, just come out and
| say it so people can engage appropriately)
| ruuda wrote:
| The site shows a notification in German that I need to enable
| JavaScript to use the site, after the first paragraph. But then
| after that is the full article, including images, which is almost
| perfectly readable, except it's at 5% opacity (or maybe the
| JavaScript popup is 95% opacity overlaid on the article), which
| makes it impossible to read again. :'(
| [deleted]
| thorum wrote:
| If you want to see more examples of what this AI is capable of,
| check out the subreddit:
|
| https://reddit.com/r/stablediffusion
| unethical_ban wrote:
| Wait, so the closed source generator known as DALL-E is owned by
| a company called OpenAI?
| alephxyz wrote:
| It's a bit of a dead horse at this point but yes. See the
| previous discussion:
| https://news.ycombinator.com/item?id=28416997
| fariszr wrote:
| As Elon said "OpenAI should be more Open IMO"
| glenneroo wrote:
| Curiously it seemed to lock down even more after they
| "partnered" with Microsoft.
| keepquestioning wrote:
| We are heading into uncharted territory :(
| belltaco wrote:
| Article says it needs 5.1GB of Graphics RAM.
|
| Does any one know how much data download and disk storage does it
| need?
| luismmolina wrote:
| If you read directly from the site, the requirements for the
| graphics card are 10 GB of VRAM as a minimum. Because it runs
| locally you don't need to download anything apart from the
| initial model; this applies to the disk space too.
| _blop wrote:
| The v1.3 model weighs in at 4.3 GB. There's an additional
| download of 1.6 GB of other models due to usage of
| huggingface's transformers (only once on startup). And the
| conda env takes another 6 GBs due to pytorch and cuda.
|
| Larger images will require (much) more than 5.1 GB. In my case,
| a target resolution of 768x384 (landscape) with a batch size of
| 1 will max out my 12GB card, an RTX3080Ti.
| mdorazio wrote:
| I think this is a good time to ask if anyone is working on
| parallelizing machine learning compute anymore? For at-home
| computation like this it seems like it would be a lot better
| to allow people to stack a few cheaper GPUs rather than
| having to pony up thousands of dollars for ML-oriented beast
| cards to be able to do things like generate large images.
| andybak wrote:
| AI upscaling will solve everything ;)
|
| I've generated some remarkably good-looking print quality
| images by upscaling 512x512 sources
| bambax wrote:
| What do you use for upscaling? Standard software like
| Photoshop or Affinity, etc. or more dedicated software?
| Any recommendations for options, etc.?
| andybak wrote:
| Recently impressed by
| https://replicate.com/nightmareai/latent-sr but otherwise
| - Cupscale
| andybak wrote:
| Here's a particularly impressive result I got from the
| former (considering it's not especially optimized for
| "vector art" enlargement): https://twitter.com/andybak/st
| atus/1558737805546749953?s=20&...
| glenneroo wrote:
| For videos in particular, if you don't mind shelling out
| cash, the current go-to (at least according to various AI
| discord servers I'm on) for AI animation nerds is
| the Topaz upscaler. There are free alternatives but
| I've yet to see any of them work as well as Topaz, though
| I'm sure that will change soon. For interpolating frames
| Flowframes is "free" (new features if you join the
| Patreon) and is IMO very good.
|
| I've seen a number of 80s/90s VHS recordings of concerts
| being uploaded to YouTube in 4K (using Topaz) and they
| look like they were recorded that way, truly amazing. I
| do hear it can be a bit of work though getting the
| settings right.
| vanadium1st wrote:
| Stable Diffusion is mind-blowingly good at some things. If you
| are looking for modern artistic illustrations (like the stuff
| that you would find on the front page of Artstation) - it's state
| of the art, better in my opinion than Dalle-2 and Midjourney.
|
| But, the interesting thing is that while it is so good in
| producing detailed artworks and matching the styles of popular
| artists, it's surprisingly weak at other things, like
| interpreting complex original prompts. We've all seen the meme
| pictures made in Craiyon (previously Dalle-mini) of photoshop-
| collage-like visual jokes. Stable Diffusion with all its
| sophistication is much worse at those and struggles to
| interpret a lot of prompts that the free and public Craiyon is
| great with. The compositions are worse, it misses a lot of
| requested objects or even misses the idea entirely.
|
| Also as good as it is at complex artistic illustrations, it is as
| bad at minimalistic and simple ones, like logos and icons. I am a
| logo designer and I am already using AI a lot to produce sketches
| and ideas for commercial logos, and right now the free and
| publicly available Craiyon is head and shoulders better at that
| than Stable Diffusion.
|
| Maybe in the future we will have a universal winner AI that is
| the best at any style of pictures that you can imagine. But right
| now we have an interesting competition where different AIs have
| surprising strengths and weaknesses, and there's a lot of reason
| to try them all.
| russdill wrote:
| Just think where we'll be two more papers down the line
| elil17 wrote:
| For those unaware, this is a catchphrase of Dr. Károly Zsolnai-Fehér
| from the absolutely wonderful YouTube channel "Two Minute
| Papers" which focuses on advances in computer graphics and
| AI.
| pizza wrote:
| > Now _squeeeze_ those papers!
| PoignardAzur wrote:
| Random rant: it feels like over time Two Minute Papers has
| started to lean more and more into its catchphrases and
| gimmicks, while the density of interesting content keeps
| decreasing.
|
| The whole "we're all fellow scholars here" bit feels like
| I'm watching a kid's show about science popularization,
| patting me on the head for being here.
|
| "Look how smart you are, we're doing science!"
|
| I dunno. I like the channel for what it is (a popularization
| newsletter for cool ML developments) but sometimes the
| author feels really patronizing / full of himself.
| andybak wrote:
| I have a penchant for wanting to make technically "bad" or
| heavily stylized photos - and Stable Diffusion is pretty poor
| at those. There's very little good bokeh or tilt shift stuff
| and CCTV/Trailcam doesn't come out too well.
|
| In fact Dall-E isn't as impressive for some styles as "older"
| models (Jax/Latent Diffusion etc)
| babypuncher wrote:
| That is really a shame, because all I really want is a version
| of Craiyon that I can modify and run on my own hardware.
|
| The amount of enjoyment I have derived from playing with
| Craiyon over the last two months is ridiculous.
| culi wrote:
| Have you checked out MidJourney? Makes Craiyon look like
| crayons :P
| glenneroo wrote:
| Craiyon is free, whereas Midjourney is not. If you want MJ
| level quality, check out Disco Diffusion or go straight to
| Visions of Chaos, which runs just about every AI diffusion
| script in existence. The dev is very active and adds new
| features every couple days, such as recently the ability to
| train your own diffusion models, which I've been doing the
| last 3 days nonstop on my little 3060 Ti (8GB VRAM, which
| is barely sufficient to run at mostly default settings).
| culi wrote:
| MidJourney does give you 25 minutes of free compute time,
| which is enough to try it at least ~40 times.
|
| I've checked out Disco Diffusion but hadn't heard of
| Visions of Chaos, thanks. The biggest shortcoming to DD
| is there's simply not yet a sufficiently trained model to
| produce stuff to the level of MidJourney or Craiyon.
| GaggiX wrote:
| It's surprisingly weak at interpreting complex original prompts
| because the model is really small; the text encoder is just
| 183M parameters. Craiyon is much larger.
___________________________________________________________________
(page generated 2022-08-15 23:01 UTC)