[HN Gopher] I used Stable Diffusion and Dreambooth to create an ...
___________________________________________________________________
I used Stable Diffusion and Dreambooth to create an art portrait of
my dog
Author : jakedahn
Score : 295 points
Date : 2023-04-16 18:29 UTC (4 hours ago)
(HTM) web link (www.shruggingface.com)
(TXT) w3m dump (www.shruggingface.com)
| cogitoergofutuo wrote:
| This is really interesting. I do wish the author had included the
| cost to train the model on Replicate, though.
| wincy wrote:
| He mentions the Colab notebook for Dreambooth; that only takes ten
| minutes or so to train using an A100 (the premium GPU), you can
| have it shut off after it finishes, and it saves to Google Drive.
| Super easy.
| jakedahn wrote:
| Yeah!
|
| Here's the colab notebook, in case anyone is interested:
| https://github.com/TheLastBen/fast-stable-diffusion
|
| I've trained a few smaller models using their Dreambooth
| notebook, but I think for 4000 training steps, an A100 will
| usually take 30-40 min. I believe Replicate also uses A100s for
| their Dreambooth training jobs.
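|
| Once training finishes, sampling from the tuned weights takes
| only a few lines with the diffusers library. A minimal sketch -
| the output path, rare token, and prompt here are made up for
| illustration:
|
|     import torch
|     from diffusers import StableDiffusionPipeline
|
|     # Load the Dreambooth-tuned checkpoint saved by the notebook.
|     pipe = StableDiffusionPipeline.from_pretrained(
|         "./dreambooth-output", torch_dtype=torch.float16
|     ).to("cuda")
|
|     # "sks dog" stands in for whatever rare token was used in training.
|     image = pipe("a portrait of sks dog, intricate vector art").images[0]
|     image.save("portrait.png")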
| [deleted]
| sinman wrote:
| I did something loosely related. As a present for my girlfriend's
| birthday, I made her a "90s website" with AI portraits of her
| dog: https://simoninman.github.io/
|
| It wasn't actually particularly hard - I used a Colab notebook on
| the free tier to fine-tune the model, and even got ChatGPT to
| write some of the prompts.
| jakedahn wrote:
| hah, these are pretty cool! Well done!
| AuryGlenz wrote:
| In my (limited) experience, dogs seem to be easier than people
| for fine-tuning - especially if your end result is going to be
| artsy. Faces of people you know well being off in slight ways
| really throws you off, but with dogs there's a bit more leeway.
| amelius wrote:
| But why pick a dog as an example?
|
| Humans are much worse at telling dogs apart than at telling other
| humans apart (except perhaps the owner of the particular dog).
|
| So for all we know, the AI didn't generate a portrait of this
| particular dog but instead a generic picture of this breed of
| dog.
| chipgap98 wrote:
| Because you invent a new word when you train Dreambooth and
| teach it that your subject is an example of that word. The fact
| that the word you've created returns photos similar to the
| subject is a sign that it worked.
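|
| Concretely, the prompts look something like this (a sketch;
| "sks" is just the rare token people commonly use):
|
|     # Dreambooth pairs a rare token with a class noun during training:
|     instance_prompt = "a photo of sks dog"  # your specific subject
|     class_prompt = "a photo of a dog"       # prior-preservation class
|
|     # After fine-tuning, the rare token retrieves the subject, e.g.:
|     #   pipe("a renaissance oil painting of sks dog")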
| amelius wrote:
| I suppose that dreambooth is pretrained on a large dataset
| that includes many different dogs.
|
| My point is that it is difficult to judge (for us) that the
| returned photos are actually similar to the subject.
| ModernMech wrote:
| The paper shows dogs with very distinctive fur coloring.
| Particularly the corgi with a white stripe between its eyes.
| I think the paper would be completely fraudulent if this
| dog were also featured heavily in the training set. So the
| point is the white stripe corgi isn't in the set, and with
| a few examples, the model could then generate brand new
| images of corgis with a similar fur pattern. Maybe all it
| can do is fur patterns but it's a start.
| jakedahn wrote:
| Mostly because I thought of it more as an art project than a
| technical accuracy project. However, the honest answer to your
| question is that I have a ridiculous number of photos of my
| dog on my phone. Getting training data is hard work.
|
| But this is totally true: I found that maybe 30% of the images
| I generated did not look like my dog at all. However, the rest
| do a good job of capturing his eyes and the facial expressions
| he actually makes. I thought that the image I chose to work
| from captured the look of his eyes super well.
|
| But yeah, nobody but me would really appreciate that.
| AuryGlenz wrote:
| I linked this elsewhere but here are Pokemon image generations
| of my (mutt) dog: https://imgur.com/a/11OxoSA
|
| She's pretty unique looking and it comes through even with
| heavy styling.
| asadlionpk wrote:
| If anyone wants to try Dreambooth online, I made a free website
| for this: https://trainengine.ai
| itronitron wrote:
| [flagged]
| steve_adams_86 wrote:
| If they like it, then it's not garbage for them. Mission
| accomplished. What you think of art on a stranger's wall isn't
| really the point -- it's more so about the technology behind
| it.
|
| I suppose you could be indirectly commenting about how you
| think the technology does a bad job generating art, but there
| are better ways to say it.
| spikej wrote:
| They like it. And it was a good excuse to work with new tech.
| Why poo poo on it?
| itronitron wrote:
| I like my comment, and it was a good excuse to work with new
| tech, why poo poo on my comment?
| mdp2021 wrote:
| Because your comment was pretty objectively inappropriate
| and unproductive - gratuitous. Or did you mean something
| productive that we should have guessed?
| Fricken wrote:
| Leaving poo poo on things is a popular pastime for many dog
| people.
| mdp2021 wrote:
| 'is a popular pastime for many [] people'
|
| Fixed That For You.
|
| Interestingly for the ethologist, they have habitats: for
| example, the bottom comments in the stacks on YouTube...
| mdp2021 wrote:
| Good, so in order to produce good AI-aided graphics the
| producers will have to become _critics_, art experts, with
| the important side effect of personal elevation and the
| collective gain of society. "Wins" on all sides.
|
| Update: three minutes later, it seems that somebody did not get
| the irony.
| lxe wrote:
| What a useful and nuanced critique. Thanks!
| beezlewax wrote:
| Highly subjective comment. Art is not something that is either
| "good" or "not good". It can hold value to the creator
| intrinsically. Like a kids crayon scribbles.
| simonw wrote:
| I love how much work went into this.
|
| There's a great deal of pushback against AI art from the wider
| online art community at the moment, a lot of which is motivated
| by a sense of unfairness: if you're not going to put in the time
| and effort, why do you deserve to create such high quality
| imagery?
|
| (I do not share this opinion myself, but it's something I've seen
| a lot)
|
| This is another great counter-example showing how much work it
| takes to get the best, deliberate results out of these tools.
| quadcore wrote:
| _a lot of which is motivated by a sense of unfairness_
|
| Say you generate a picture with Midjourney - who is/are the
| closest artist(s) you can find for that picture?
|
| Not the AI, not the prompter - so the closest artists you can
| find for that picture are the ones who made the pictures in the
| training set. So generating a picture is outright copyright
| infringement. Nothing to do with unfairness in the sense of
| "artists get out-competed". Artists don't get out-competed -
| they are stolen from.
| ModernMech wrote:
| A typical Midjourney workflow involves constantly reprompting
| and fine-tuning based on examples and input images. When you
| arrive at a given image in Midjourney, it's often impossible
| to recreate it even with the same seed. You'll need the input
| image as well, and the input image is often the result of a
| long creative process.
|
| Why is it you discount the creative input of the user? Are
| they not doing work by guiding the agent? Don't their choices
| of prompt, input image, and the refinement of subsequent
| generated images represent a creative process?
| quadcore wrote:
| I agree with you on the technicality - if we say the
| prompter is an artist, then the picture belongs to him.
| quadcore wrote:
| From what I read on the internet, people assume AI-generated
| art is a difficult question legally speaking. Some literally
| assume artists complain only because they are out-competed.
|
| I disagree - I think that AI generative art is an easy case of
| copyright infringement and an easy win for a bunch of good
| lawyers.
|
| That's because you can't find an artist for a generated picture
| other than the ones in the training set. If you can't find a
| new artist, then the picture belongs to the old ones, so to
| speak. I really don't see what's difficult about that case. I
| think the internet assumes a bit too quickly that it's a
| difficult question and a grey area, when maybe it just isn't.
|
| It's noteworthy that Adobe did things differently than the
| others, and the way they did things goes in the direction I'm
| describing here. Maybe it's just confirmation bias.
| stavros wrote:
| I agree. This is a clear-cut case of copyright infringement,
| as is all art. After all, people painting images have only
| seen paintings other people painted.
| GuB-42 wrote:
| > That's because you can't find an artist for a generated
| picture other than the ones in the training set. If you can't
| find a new artist, then the picture belongs to the old ones,
| so to speak.
|
| It doesn't belong to the "old ones"; it is at best a
| derivative work. And even writing a prompt, as trivial as it
| might seem, makes you an artist. There are modern artists
| exhibiting random shit as art, and you may or may not like
| it, but they are legally artists, and it is their work.
|
| The question is about fair use. That is, are you allowed to
| use pictures in the dataset without permission? It is a
| tricky question. On one extreme, you won't be able to do
| anything without infringing some kind of copyright. Used the
| same color as I did? I will sue you. On the other extreme,
| you essentially abolish intellectual property. Copying
| another artist's style in your own work is usually fair use,
| and that's essentially what generative AI does, so I guess
| that's how it will go, but it will most likely depend on how
| judges and legislators see the thing, and different countries
| will probably have different ideas.
| circuit10 wrote:
| It's not as simple as that, though, because the algorithm
| does learn by itself and mostly just uses the training data
| to score itself against; it doesn't directly copy it, as some
| people seem to think. It can end up learning to copy things
| if it sees them enough times, though.
|
| "you can't find an artist for a generated picture other than
| the ones in the training set. If you can't find a new artist,
| then the picture belongs to the old ones, so to speak"
|
| I don't think that's valid on its own as a way to completely
| discount considering how directly it's using the data. As an
| extreme example, what if I averaged all the colours in the
| training data together and used the resulting colour as the
| seed for some randomly generated fractal or something? You
| could apply the same arguments - there is no artist except
| the original ones in the training set - and yet I don't think
| any reasonable person would say that the result obviously
| belongs to every single copyright owner from the training set
| ModernMech wrote:
| But this person's dog isn't in the training set, so why
| should some artist be credited for a picture they never drew?
| Not a single person has drawn his dog before, now there is a
| drawing of his dog, and you want to credit someone who had no
| input to the creative process here?
| quadcore wrote:
| If you can find a new artist then I think the picture
| belongs to him.
| austinjp wrote:
| "Input into the creative process" is surely broader than
| simply "painted the portrait". Artists most certainly never
| consented to have their works used as training data. To
| this extent, they might be justifiably pissed off.
|
| Artists and designers have furthered their careers (and
| gained notoriety) by 'ripping off' others since the dawn of
| time. This used to require technical artistic ability; now
| less so. The barrier to entry is... not necessarily lower
| now, but different.
| mdp2021 wrote:
| > _an artist for a generated picture_
|
| Normally - outside the specific context of AI-generated art -
| there is not a relation "work(1) - past author", but "work -
| large amount of past experience". ((1) "work": in the sense
| of product, output, etc.)
|
| If the generative AI is badly programmed, it will copy the
| style of Smith. If properly programmed, it will "take into
| account" the style of Smith. There is a difference between
| learning and copying. Your tool can copy - if you do it
| properly, it can learn.
|
| All artists work in a way "post consideration of a finite
| number of past artists in their training set".
| brucethemoose2 wrote:
| TBH it would be much easier with more streamlined tooling,
| especially if doing it locally with lora/lycoris.
|
| It's kinda like using ffmpeg or vapoursynth for video editing
| instead of a video editing GUI.
|
| That being said, the training parameter/data tuning is
| definitely an art, as is the prompting.
| asddubs wrote:
| Most of the criticism I've seen is that it's all trained on
| uncompensated, stolen artwork - much like how Copilot is trained
| on GPL code, disregarding its license terms.
| minimaxir wrote:
| The general argument (IANAL) is that it's Fair Use, in the
| same vein as Google Images or Internet Archive scraping and
| storing text/images. Especially since the outputs of
| generated images are not 1:1 to their source inputs, so it
| could be argued that it's a unique derivative work. The
| current lawsuits against Stability AI are testing that,
| although I am skeptical they'll succeed (one of the lawsuits
| argues that Stable Diffusion is just "lossy compression"
| which is factually and technically wrong).
|
| There is an irony, however, that many of the AI art haters
| tend to draw fanart of IP they don't own. And if Fair Use
| protections are weakened, their livelihood would be hurt far
| more than those of AI artists.
|
| The Copilot case/lawsuit IMO is stronger because the
| associated code output is a) provably verbatim and b) often
| has explicit licensing and therefore intent on its usage.
| bendmorris wrote:
| >it could be argued that it's a unique derivative work
|
| Creating a derivative work of a copyrighted image requires
| permission from the copyright holder (i.e., a license)
| which many of these services do not have. So the real
| question is whether AI-generated "art" counts as a
| derivative work of the inputs, and we just don't know yet.
|
| >b) often has explicit licensing and therefore intent on
| its usage
|
| It doesn't matter. In the absence of a license, the default
| is "you can't use this." It's not "do whatever you want
| with it." Licenses grant (limited) permission to use;
| without one you have no permission (except fair use, etc.
| which are very specifically defined.)
| adamm255 wrote:
| If a person trained themselves on the same resources, and
| picked up a brush or a camera and created some stunning
| art in a similar vein, would we look at that as a
| derivative work? Very interesting discussion. Art of all
| forms is inspired by those who came before.
|
| Inspired/trained... I think these could be seen as the
| same.
| bendmorris wrote:
| Training a human and training a model may use the same
| verb but are very different.
|
| If the person directly copied another work, that's a
| derivative work and requires a license. But if a person
| learned an abstract concept by studying art and later
| created art, it's not derivative.
|
| Computers can't learn abstract concepts. What they can do
| is break down existing images and then numerically
| combine them to produce something else. The inputs are
| directly used in the outputs. It's literally derivative,
| whether or not the courts decide it's legally so.
| [deleted]
| Ukv wrote:
| > Computers can't learn abstract concepts
|
| Goalposts can be moved on whether it has "truly learned"
| the abstract concept, but at the very least neural
| networks have the ability to work with concepts to the
| extent that you can ask to make an image more "chaotic",
| "mysterious", "peaceful", "stylized", etc. and get
| meaningfully different results.
|
| When a model like Stable Diffusion has 4.1GB of weights
| and was trained on 5 billion images, the primary impact
| of one particular training image may be very slightly
| adjusting what the model associates with "dramatic".
|
| > If the person directly copied another work, that's a
| derivative work and requires a license
|
| Not if it falls under Fair Use. Here's a fairly extreme
| example for just how much you can get away with while
| still (eventually) being ruled Fair Use:
| https://www.artnews.com/art-in-america/features/landmark-cop...
| - though I wouldn't recommend copying as much as Richard
| Prince did.
|
| > The inputs are directly used in the outputs
|
| Not "directly" - during generation, normal prompt to
| image models don't have access to existing images and
| cannot search the Internet.
| asddubs wrote:
| I don't think we should hold technology to the same
| standards as humans. I'm also allowed to memorize what
| someone said, but that doesn't mean I'm allowed to record
| someone without their knowledge (depending on the
| location)
| simonw wrote:
| "Creating a derivative work of a copyrighted image
| requires permission from the copyright holder"
|
| That's why "fair use" is the key concept here. Under US
| copyright law "fair use" does not require a license. The
| argument is that AI generated imagery qualifies as "fair
| use" - that's what's about to be tested in the courts.
|
| https://arstechnica.com/tech-policy/2023/04/stable-diffusion...
| is the best explanation I've seen of the legal situation as
| it stands.
| [deleted]
| userbinator wrote:
| AI is just showing us a fact that many are unwilling to
| admit: _everything_ is a derivative work. Much like humans
| will memorise and regurgitate what they've seen.
| simonw wrote:
| The "trained on stolen artwork" critique is reasonable - I
| helped with one of the first big investigations into how that
| training data worked when Stable Diffusion first came out:
| https://simonwillison.net/2022/Sep/5/laion-aesthetics-weekno...
|
| It's interesting to ask people who are concerned about the
| training data what they think of Adobe Firefly, which is
| strictly trained on correctly licensed data.
|
| I'm under the impression that DALL-E itself used licensed
| data as well.
|
| I find some people are comfortable with that, but others will
| switch to different concerns - which indicates to me that
| they're actually more offended by the idea of AI-generated
| art than the specific implementation details of how it was
| trained.
| bugglebeetle wrote:
| I think the more correct argument is that Stable Diffusion
| effectively did a Napster to force artists into shit
| licensing deals with large players who can handle the
| rights management. It's unlikely that artists would've ever
| agreed to them otherwise, but since the alternative now is
| to have your work duplicated by a pirate model or legally
| gray service, what are you going to do? This seems borne
| out by the fact that Stability AI themselves are now
| retreating behind Amazon for protection.
| adamm255 wrote:
| When I did Photography at college, a lot of the work was
| looking at other works of art. I spent a lot of time in
| Google Images, diving through books from the Art section
| and going to galleries. Lots of photo copying was involved!
|
| I then did works in the style of what I'd researched. I
| trained myself on works I didn't own, and then produced my
| own.
|
| I kind of see the AI training as similar work, just done
| programmatically vs physically.
|
| Certainly a very interesting topic.
|
| I can't get my head around how far we've come on this in
| the last 6-12 months. From pretty awful outputs to works
| winning Photography awards. And prints of a dog called
| Queso you'd have paid a lot of money to an illustrator for.
| rgbrgb wrote:
| I think it's more analogous to if you had tweaked one of
| those famous works directly in Photoshop and then turned it
| in. The model training likely results in near replicas of
| some of the training data encoded in the model. You might
| have a near replica of a famous photograph encoded in
| your head, but to make a similar photograph you would
| recreate it with your own tools and it would probably
| come out pretty different. The AI can just output the
| same pixels.
|
| That's not to say there aren't other ways you might use
| the direct image (e.g. collage or sampling in music) but
| you'll likely be careful with how it's used, how much you
| tweak it, and with attribution. I think the weird problem
| we're butting up against is that AFAIK you can't figure
| out post-facto what the "influence" is from the model
| output aside from looking at the input (which does
| commonly use names of artists).
|
| I work on an AI image generator, so I really do think the
| tech is useful and cool, but I also think it's
| disingenuous (or more generously misinformed) to compare
| it to an artist studying great works or taking
| inspiration from others. These are computers inputting
| and outputting bits. Another human analog would be
| memorizing a politician's speech and using chunks of it
| in your own speech. We'd easily call that plagiarism, but
| if instead every 3 words were exactly the same? Hard to
| say... it's both more and less plagiarism.
|
| Just how much do you need to process a sampled work
| before you need to get permission of the original artist?
| It seems to be in music that if the copyright holder can
| prove you sampled them, even if it's unrecognizable, then
| you're going to be on the hook for some royalties.
| simonw wrote:
| "The model training likely results in near replicas of
| some of the training data encoded in the model."
|
| I don't think that's true.
|
| My understanding is that any image generated by Stable
| Diffusion has been influenced by every single parameter
| of the model - so literally EVERY image in the training
| data has an impact on the final image.
|
| How much of an impact is the thing that's influenced by
| the prompt.
|
| One way to think about it: the Stable Diffusion model can
| be as small as 1.9GB (Web Stable Diffusion). It's trained
| on 2.3 billion images. That works out to about 6.6 bits of
| data per image in the training set.
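|
| The arithmetic, in Python, for anyone who wants to check it:
|
|     model_bytes = 1.9e9        # 1.9GB model
|     training_images = 2.3e9    # 2.3 billion training images
|     bits_per_image = model_bytes * 8 / training_images
|     print(bits_per_image)      # ~6.6 bits per training image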
| Jevon23 wrote:
| >It's interesting to ask people who are concerned about the
| training data what they think of Adobe Firefly, which is
| strictly trained on correctly licensed data.
|
| If they truly got an appropriate license agreement for
| every image in the training set then I have no issues with
| that.
|
| >I'm under the impression that DALL-E itself used licensed
| data as well.
|
| DALL-E clearly used images they did not have a license for.
| Early on it was able to output convincing images of Pikachu
| and Homer Simpson. OpenAI certainly didn't get licensing
| rights for those characters.
| einpoklum wrote:
| Stolen artwork? Why, I'm shocked! Shocked and chagrined!
| Where, pray tell, does OpenAI keep that vast warehouse full
| of stolen paintings? And have you alerted Interpol?
| minimaxir wrote:
| Unfortunately it's become a meme among AI art haters that AI
| art is "just inputting text into a text box", despite the fact
| that this is far from the truth, particularly if you want to
| get specific results, as this blog post demonstrates.
|
| Some modern AI art workflows require _more_ effort than
| actually illustrating using conventional media. And this blog
| post doesn't even get into ControlNet.
| squidsoup wrote:
| Only if you exclude the countless hours an illustrator has
| spent developing their craft.
| yieldcrv wrote:
| Being sympathetic to that requires pretending that the user
| would have _ever_ commissioned an artist for that idea at
| all. Both the transaction and the idea would simply never
| have happened. It was _never_ valuable enough or important
| enough to commission a human, hope you picked the correct
| human, and wait week after week for revision after revision.
|
| People who want to hone a niche discipline _for themselves_
| can still do that. Just be honest about doing it for
| yourself.
| libraryatnight wrote:
| Using AI as a tool to create art takes nothing away from
| anyone who spent time learning a skill or craft that they
| use in their own pursuit of expression.
|
| People will be arguing about whether or not art made with
| AI is art, and artists will just be using it or not. I
| remember an interview about electronic music where Bjork
| addressed concerns that if you use a computer to make
| music, it has no soul, and she said if the person using the
| machine to make the music puts soul into it, it will have a
| soul.
|
| I remember David Bowie in the mid 90s saying if he was
| young in that decade he might not have been a musician,
| because in the 60s being a musician seemed subversive and
| at the time of the interview the internet was carrying the
| flag of subversion.
|
| Anyway, it's interesting to watch these conversations. I'd
| never claim to know what art is or try to tell someone, but
| it seems to me that, because of the controversy, artists
| are already drawn to AI, further exciting the conversation.
| Commercial artists seem the most threatened: animators,
| designers, etc. I understand why, but I don't think arguing
| that AI isn't "art" is going to help their cause any more
| than protesting that digital painting wasn't art, electronic
| music wasn't art, and, much earlier, that photography wasn't
| art.
|
| All the time these conversations are happening, the art's
| getting made and we're barreling towards the next 'not art'
| movement.
| capableweb wrote:
| > Some modern AI art workflows often require more effort than
| actually illustrating using conventional media. And this blog
| post doesn't even get into ControlNet.
|
| Indeed. Another criticism - one where I can definitely
| somewhat see the idea behind it - is that the barrier to
| entry is very different from, for example, drawing. To draw,
| you need a pen and paper, and you can basically start. To
| start with Stable Diffusion et al, you need either A) paid
| access to a service, B) money to purchase moderately powerful
| hardware, or C) money to rent moderately powerful hardware.
| One way or another, if you want to practice AI-generated art,
| you need more money than what a pen and paper cost.
| MayeulC wrote:
| > is that the barrier to entry is very different from for
| example drawing
|
| That got me thinking. I agree, but from another
| perspective: the skillset is different. Traditionally, the
| approach to art was very bottom-up. Start with a pen and
| basic contouring techniques. Understanding more advanced
| techniques (perspective, shadows, etc.) requires a lot of
| work.
|
| "AI" art generally does away with basic techniques. The
| emphasis is more on composing, styling. A top-down
| approach. "AI" artists may be able to iterate quicker by
| seeing "almost-finished" versions quickly (though a skilled
| artist can most likely imagine their work pretty well).
|
| But most of all, the tools and required skills are very
| different. You don't need to know a lot about machine
| learning, but it certainly helps. Probably pretty far from
| the skillset of most current artists. And people generally
| fear what they don't understand. And if I were an artist,
| I'd be at least a bit concerned about (i) it undercutting
| the value of my art, and (ii) having to learn this alien way
| of doing things to remain competitive (by way of selection,
| artists probably enjoy their current tools).
|
| Anyway, I imagine photography was similarly upsetting in a
| lot of ways. It also didn't happen overnight. I also
| suspect we are going to see similar improvements to output
| quality as in the early days of photography.
|
| Another similarity is with digital music (and
| recording/remixing before that). I wonder if we're going to
| see new genres emerge as a result (the equivalent of
| techno/electro).
| Paul-Craft wrote:
| Your comment in particular captures it, but I can imagine
| a lot of the same sort of comments on this post being
| made about film cameras when they came out, then again
| about digital cameras.
| prpl wrote:
| Digital cameras made burst photos go from $.25+ a frame
| at 5FPS to effectively free, with rates at 30+FPS now.
| That was transformative, but it also led to all sorts of
| lamentations about lack of skill.
| rprospero wrote:
| I remember my university photography club trying to get
| digital cameras banned from campus because "art only
| happens in the darkroom".
| dragonwriter wrote:
| > To draw, you need a pen and a paper, and you can
| basically start. To start with Stable Diffusion et al, you
| need either A) paid access to a service, B) money to
| purchase moderately powerful hardware or C) money to rent
| moderately powerful hardware
|
| A 4GB NVidia GPU (sufficient to run Stable Diffusion with
| the A1111 UI) is hardly "moderately powerful hardware",
| and, beyond that, Stable Horde (AI Horde) exists.
|
| OTOH, a computer and internet connection are more expensive
| than a pencil, even if nearly ubiquitous.
| rob74 wrote:
| Yeah, well then, please draw an image of my dog in the
| style of van Gogh, using pen and paper. I would say that
| for most of us, the more cost-effective way to get high
| quality artwork will still be Stable Diffusion...
| realusername wrote:
| Stable Diffusion doesn't really need powerful hardware; any
| graphics card will do, it will just take a bit longer. There
| are even ports on smartphones nowadays.
| simonw wrote:
| There are plenty of traditional art mediums that require
| significant financial outlays to get started: oil painting,
| ceramics, glass blowing etc.
|
| There are plenty of free online tools for using all kinds
| of AI image generation techniques, and they don't require
| powerful hardware, just something that can browse websites
| or run Discord.
| adamm255 wrote:
| Plus training, lessons and inspiration. And talent.
|
| It's like with dreams. They can be terribly intricate and
| detailed, but ask me to draw something creative and I'm
| out.
| smallerfish wrote:
| Stable Diffusion works fine on a CPU - on an AMD Ryzen
| 5700, approx 90s per image (and I believe comparable or
| faster on my old i7-6700). If you want to kick off a batch
| in the background while you work on something else, that's
| plenty fast. (I use:
| https://github.com/brycedrennan/imaginAIry).
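|
| If you'd rather script it directly, a minimal CPU run with the
| diffusers library looks roughly like this (a generic sketch, not
| imaginAIry's API; timing will vary by machine):
|
|     from diffusers import StableDiffusionPipeline
|
|     # float32 on CPU; expect on the order of a minute or two per image
|     pipe = StableDiffusionPipeline.from_pretrained(
|         "runwayml/stable-diffusion-v1-5"
|     ).to("cpu")
|     image = pipe("a portrait of a dog, oil painting",
|                  num_inference_steps=25).images[0]
|     image.save("out.png")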
| minimaxir wrote:
| The cost has gone _way_ down in the last couple months.
|
| With a super-cheap T4 GPU (free in Google Colab), PyTorch
| 2.0, and the latest diffusers package, you can now generate
| batches of 9-10 images in about the same time it took to
| generate 4 images when Stable Diffusion was first released.
| This drastically speeds up the cherry-picking and iteration
| processes:
| https://pytorch.org/blog/accelerated-diffusers-pt-20/
|
| Google Cloud Platform also now has preview access to L4
| GPUs, which are 1.5x the cost of a T4 GPU but have 3x the
| throughput for Stable Diffusion workflows (maybe more, given
| the PyTorch 2.0 improvements for newer architectures),
| although I haven't tested it:
| https://cloud.google.com/blog/products/compute/introducing-g...
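|
| A rough sketch of that batched generation with diffusers (exact
| throughput depends on your GPU and package versions):
|
|     import torch
|     from diffusers import StableDiffusionPipeline
|
|     pipe = StableDiffusionPipeline.from_pretrained(
|         "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
|     ).to("cuda")
|
|     # With PyTorch 2.0, diffusers should pick up the fused attention
|     # kernels automatically; generate 9 candidates in one call.
|     images = pipe("a portrait of a dog, vector art",
|                   num_images_per_prompt=9).images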
| tough wrote:
| We're minmaxing those costs. Thanks for the data!
| tester457 wrote:
| It's a meme because 99% of AI art creators don't go that
| deep; they only prompt.
|
| Even if they did have a more complex workflow, most of them
| are still based on copyrighted training data, so there will
| be many lawsuits.
| basisword wrote:
| > if you're not going to put in the time and effort, why do you
| deserve to create such high quality imagery?
|
| This isn't high quality imagery. Don't get me wrong, the tech
| is cool and I love the work that went into making this
| picture. But this isn't something I would ever hang on my wall.
| There's probably a market for it, but I get the strong
| impression it's the "live, laugh, love" market. The people that
| buy pictures for their wall in the supermarket. The kind of
| people who pay individual artists money to paint bespoke images
| of their pet are not going to frame AI art. I don't think the
| artists need to worry.
| yellow_postit wrote:
| I would expect it's only a matter of time until those
| "traditional" artists also adopt these tools into their
| workflows - similar to the initial pushback against the
| "digital darkroom", which is now the mainstay of photography.
|
| Non-AI-aided art, like manually developed film, will trend
| towards a niche.
| theaiquestion wrote:
| > This isn't high quality imagery. Don't get me wrong, the
| tech is cool and I love the work that went into making this
| picture. But this isn't something I would ever hang on my
| wall.
|
| Well, yeah, but that doesn't change the OP commenter's point
| that it still takes a lot of work to get high quality art.
|
| > I don't think the artists need to worry.
|
| I disagree here, but only on the basis of what type of art it
| is. Stock art/photography and a lot of media design work are
| likely at risk, because we can now create "good enough" art at
| the click of a button for almost no cost. I agree that the
| "hang on the wall"-level artists aren't at risk just yet, but
| everything from filler art to the, uh, "anime/furry"
| commissions is definitely at risk right now, for all but the
| highest quality artists.
| mdp2021 wrote:
| The shruggingface submission is very interesting and very
| instructive.
|
| Nonetheless, it would be odd and a weak argument to point
| criticism towards not spending adequate <<time and effort>> (as
| if it made sense to renounce tools and work through unnecessary
| fatigue and wasted time). More proper criticism could be in
| the direction of "you can produce pleasing graphics but you may
| not know what you are doing".
|
| This said, I'd say that Stable Diffusion is a milestone of a
| tool, incredible to have (though difficult to control). I'd
| also say that the results of the latest Midjourney (though
| quite resistant to control) are at "speechless" level. (Noting
| in case some had not yet checked.)
| Paul-Craft wrote:
| > More proper criticism could be in the direction of "you can
| produce pleasing graphics but you may not know what you are
| doing".
|
| I don't get this. If one "can produce pleasing graphics," how
| does that not equal knowing what they're doing? I only see
| this as being true in the sense of "Sure, you can get places
| quickly in a car, but you don't really know how it works."
| mdp2021 wrote:
| > _how does that not equal knowing what they 're doing_
|
| The goal may not be to produce something pleasant. The
| artist will want some degree of artistic value; the
| communicator will want a high degree of effectiveness etc.
| The professional will implicitly decide a large number of
| details, in a supposedly consistent idea of the full aim.
| The non professional armed with some generative AI tool may
| on the contrary leave a lot to randomness - and obtain a
| "pleasant" result, but without real involvement, without
| being the real author nor, largely, the actual director.
| indigodaddy wrote:
| Pretty cool stuff. Personally though, not a huge fan of his "the
| one" choice. Some of the other images in his assortment were much
| better imo. To each their own, of course!
| steve_adams_86 wrote:
| I agree, but I find it pretty cool that they were able to
| generate and pick from what they wanted. This seems like one of
| the real strengths of generative AI -- people can tune outputs
| they otherwise couldn't create (unable to paint, draw, play
| guitar, etc).
|
| People can debate if it's actually good that people can create
| art without being artists, but again, I think it's great that
| the author had the freedom to create what they had in mind
| without much outside influence. This has been a goal for
| computers in general for so long, and it seems like we're
| actually arriving there with some mediums.
| lxe wrote:
| This is a great writeup on some of the nuances and gotchas you
| have to watch out for when fine-tuning with Dreambooth, and on
| the generative creative process in general.
| cinntaile wrote:
| It's unfortunate a lot of the nice artsy detail disappeared when
| he had to recreate part of the head, but I guess that is
| inevitable. Great work and interesting writeup.
| yieldcrv wrote:
| Results at the top of your article/project please
| spaceman_2020 wrote:
| I would highly recommend using Photoroom's background removal
| tool. Does a far, far better job than Photoshop.
| EGreg wrote:
| What are the tools we can run on a Linux machine?
|
| EDIT: four downvotes and zero answers how to run it on a Linux
| machine...
| minimaxir wrote:
| You were likely downvoted because you asked how to use it for
| NFTs, which you just edited out.
| cogitoergofutuo wrote:
| The only piece of software mentioned in the article that
| doesn't run on Linux is Draw Things.
| [deleted]
| [deleted]
| liuliu wrote:
| There might be a few things missing from how Draw Things was
| used in this article: no mask blur, and not selecting the
| inpainting model for inpainting work.
|
| Tomorrow's release should contain both mask blur and inpainting
| ControlNet, which might help these use cases.
| jakedahn wrote:
| Yeah, it was likely just user error. I actually really love
| Draw Things, because I can run it locally on my Mac and quickly
| experiment without having to sling HTTP requests or spin up
| GPUs.
|
| I did the actual work back on March 11th, so I was likely on an
| older build; but I was seeing issues where inpainting was just
| replacing my selection/mask with a white background. I had the
| inpainting model loaded, but couldn't figure it out.
|
| I'm planning to continue playing with Draw Things locally, and
| exploring the inpainting stuff. For such an iterative process I
| feel like a local client would make for the best experience.
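|
| For anyone scripting this instead, the equivalent inpainting
| pass with diffusers uses the dedicated inpainting checkpoint.
| A sketch with placeholder file names:
|
|     import torch
|     from diffusers import StableDiffusionInpaintPipeline
|     from PIL import Image
|
|     pipe = StableDiffusionInpaintPipeline.from_pretrained(
|         "runwayml/stable-diffusion-inpainting",
|         torch_dtype=torch.float16,
|     ).to("cuda")
|
|     init = Image.open("portrait.png").convert("RGB")
|     mask = Image.open("mask.png").convert("RGB")  # white = repaint
|     out = pipe(prompt="the dog's head, matching vector art style",
|                image=init, mask_image=mask).images[0]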
| liuliu wrote:
| There's no user error, just UX issues :)
|
| That being said, you probably used the paintbrush rather than
| the eraser? There would be more help on the Discord server
| though! https://discord.gg/5gcBeGU58f
| skor wrote:
| I liked the original more than the final version. The
| vector-style drawing was much more futuristic and more
| interesting.
|
| Seems like lots of work went into that and I hope the author
| enjoyed the process and enjoys the final result.
| bigbillheck wrote:
| Personally I paid a friend $200 to create an art portrait of my
| dog.
| AuryGlenz wrote:
| I've done so much with a fine-tuned model of my dog.
|
| I previously made coloring pages for my daughter of our dog as an
| astronaut, wild west sheriff, etc. They're the first pages she
| ever "colored," which was pretty special for us. Currently I'm
| working on making her into every type of Pokemon, just for fun.
| mdp2021 wrote:
| Using which tools, specifically?
| AuryGlenz wrote:
| Stable Diffusion, generically.
|
| StableTuner to fine tune the model - I can't recall the name
| of the model I trained on top of, but it was one of the top
| "broad" 1.5 based models on Civitai. Automatic1111 to do the
| actual generating. I used an anime line art LoRA (at a low
| weight) along with an offset noise LoRA for the coloring book
| pages as otherwise SD makes images be perfectly exposed. For
| something like that you obviously want a lot more white than
| black.
|
| EveryDream2 would be another good tuning solution.
| Unfortunately that end of things is far from easy. There are
| a lot of parameters to change and it's all a bit of a mess. I
| had an almost impossible time doing it with pictures of my
| niece, my wife is hit or miss, her sister worked really well
| for some reason, and our dog was also pretty easy.
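|
| For reference, applying a LoRA at a low weight with a recent
| diffusers release looks something like this (a sketch; the LoRA
| path and prompt are placeholders - I did mine in Automatic1111):
|
|     # Assumes `pipe` is an already-loaded StableDiffusionPipeline.
|     pipe.load_lora_weights("./anime-lineart-lora")
|
|     # cross_attention_kwargs scales the LoRA's influence down.
|     image = pipe("lineart coloring page of sks dog as an astronaut",
|                  cross_attention_kwargs={"scale": 0.3}).images[0]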
| go_discover wrote:
| Do you need an M1 MacBook to do this? I have a 2015 MacBook
| Pro...
| AuryGlenz wrote:
| I uploaded a couple of the Pokemon generations really quick as
| examples. I still need to go through and do quick fixes for
| double tails (the tails on Pokemon are _not_ where they are on
| regular animals, apparently), watermarks, etc. and do a quick
| Img2Img on them.
|
| https://imgur.com/a/11OxoSA
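|
| A minimal img2img pass with diffusers looks roughly like this
| (file names and strength are illustrative):
|
|     import torch
|     from diffusers import StableDiffusionImg2ImgPipeline
|     from PIL import Image
|
|     pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
|         "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
|     ).to("cuda")
|
|     draft = Image.open("pokemon_draft.png").convert("RGB")
|     # Low strength keeps the composition and cleans up the details.
|     fixed = pipe(prompt="clean pokemon-style illustration of a dog",
|                  image=draft, strength=0.35).images[0]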
| jakedahn wrote:
| These are great!
| AuryGlenz wrote:
| Thanks. They aren't necessarily the best ones - I just
| uploaded some quickly. Like I said, they still need final
| touches too. I probably should have worked on the prompt a
| bit more before I went all in too.
|
| For anyone else doing it, the ability to do something like
| [vaporeon:cinderdog:.5] so it starts with a specific
| Pokemon and transitions into the dog later was great for
| some types.
|
| One of the fun things about this sort of thing is happy
| accidents. One of the fire types generated as two side by
| side - a puppy and an evolution.
| minimaxir wrote:
| For generating Pokemon, I recommend using this model along
| with a textual inversion of your pet:
| https://huggingface.co/lambdalabs/sd-pokemon-diffusers
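|
| A sketch of combining that model with a textual-inversion
| embedding of your pet in diffusers (the embedding path and
| token are placeholders):
|
|     import torch
|     from diffusers import StableDiffusionPipeline
|
|     pipe = StableDiffusionPipeline.from_pretrained(
|         "lambdalabs/sd-pokemon-diffusers", torch_dtype=torch.float16
|     ).to("cuda")
|
|     # Load a textual-inversion embedding trained on your pet.
|     pipe.load_textual_inversion("./my-pet-embedding", token="<my-dog>")
|     image = pipe("a pokemon drawing of <my-dog>").images[0]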
___________________________________________________________________
(page generated 2023-04-16 23:00 UTC)