[HN Gopher] Stable Diffusion Textual Inversion
___________________________________________________________________
Stable Diffusion Textual Inversion
Author : antman
Score : 94 points
Date : 2022-08-29 21:08 UTC (1 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| bottlepalm wrote:
| Wow, this is pretty cool. Instead of turning a picture back into
| text, turn it into a unique concept expressed as variable S* that
| can be used in later prompts.
|
| It's like how humans create new words for new ideas: use AI to
| process a visual scene and generate a unique 'word' for it that
| can be used in future prompts.
|
| What would a 'dictionary' of these variables enable? AI with its
| own language with orders of magnitude more words. Will a language
| be created that interfaces between all these image generation
| systems? Feels like just the beginning here.
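A minimal sketch of what "learning a word" means here, assuming a toy loss in place of the real denoising objective (every name below is illustrative):

```python
import numpy as np

# Toy sketch of textual inversion (all names illustrative): the
# frozen model's vocabulary stays untouched while we gradient-
# descend on a single new embedding vector, the pseudo-word S*.
# The real objective is the diffusion model's denoising loss over
# ~5 training images; this toy replaces it with a plain squared
# distance to a fixed target embedding.

rng = np.random.default_rng(0)
dim = 8

target = rng.normal(size=dim)  # stands in for the training images
s_star = rng.normal(size=dim)  # the new token S*, randomly initialized

def loss(v):
    return float(np.sum((v - target) ** 2))

lr = 0.1
for _ in range(200):
    grad = 2 * (s_star - target)  # analytic gradient of the toy loss
    s_star -= lr * grad

print(round(loss(s_star), 6))  # 0.0: S* now encodes the target concept
```

In the actual method everything else (text encoder, U-Net) stays frozen and only this one vector is trained, which is why the result can be dropped into ordinary prompts.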
| vinkelhake wrote:
| RIP promptbase.com - they had a good run.
| kelseyfrog wrote:
| They won't even need to renew the domain name when it expires on
| 2023-02-28
| daenz wrote:
| This is a big deal! This adds a super power to communication,
| similar to how a photo is worth 1000 words. An inversion is worth
| 1000 diffusions!
| hbn wrote:
| I saw talk the other day how these ML art models aren't really
| suited to doing something like illustrating a picture book
| because it can synthesize a character once but wouldn't be able
| to reliably recreate that character in other situations.
|
| Didn't take long for someone to resolve that issue!
| [deleted]
| zone411 wrote:
| It's not quite at that level yet. The paper introducing it
| recommends using only 5 images as the fine-tuning set so the
| results are not yet very accurate.
| zone411 wrote:
| It should be noted that the official repo now also supports
| Stable Diffusion: https://github.com/rinongal/textual_inversion.
| ionwake wrote:
| Anyone else starting to feel uncomfortable with the rate of
| progress?
| nmca wrote:
| Yes. https://80000hours.org/problem-profiles/artificial-
| intellige...
| kelseyfrog wrote:
| And in the blink of an eye, the career potential of all aspiring
| "Sr. Prompt Engineer"s vanished into the whirlpool of automatable
| tasks.
|
| On a more serious note, this opens up the door to exploring fixed
| points of txt2img -> img2txt and img2txt -> txt2img. It may open
| the door to more model interpretability.
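As a toy illustration of that fixed-point idea, iterate img2txt(txt2img(...)) until the caption stops changing. Both functions below are made-up stand-ins, not real models:

```python
import re

# Hypothetical sketch of probing txt2img -> img2txt fixed points.
# The "image" is just an integer id and the captioner is
# deterministic. A fixed point is a caption that survives a full
# round trip unchanged.

def txt2img(prompt):
    # stub generator: captions "image <n>" reproduce image n;
    # any other prompt collapses onto a default image
    m = re.fullmatch(r"image (\d+)", prompt)
    return int(m.group(1)) if m else 0

def img2txt(img):
    # stub captioner
    return f"image {img}"

prompt = "a red bicycle"
for _ in range(5):
    nxt = img2txt(txt2img(prompt))
    if nxt == prompt:
        break
    prompt = nxt

print(prompt)  # "image 0": a round-trip fixed point
```

With real models the interesting question is which captions (and which images) are stable under the round trip, and what that stability says about what the models have internalized.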
| keepquestioning wrote:
| ELI5 - why has there been a cavalcade of Stable Diffusion spam
| on HN recently? What does it all mean?
| fxtentacle wrote:
| You can now fire artists/designers and replace them with AI.
| Obviously, that's cheaper.
| dougmwne wrote:
| Someone will surely come by soon and tell us, "well
| actually... artists and graphic designers are
| irreplaceable."
|
| But for real, plenty of people are going to start rolling
| their own art and skipping the artist. Not Coca-Cola, but
| small to medium businesses doing a brochure or PowerPoint?
| Sure!
| visarga wrote:
| I think there's going to be plenty of work in stacking
| multiple AI prompts or manual retouching to fix rough
| spots. It automates a task, not a job. Some people won't
| use it at all and other people will use it only for
| reference - in the end doing everything by hand, as
| usual, because they have more control and because AI art
| has a specific smell to it and people will associate it
| with cheap.
|
| But it's not just for art and design, it has uses in
| brainstorming, planning, and just to visualise your ideas
| and extend your imagination. It's a bicycle for the mind.
| People will eat it up, old copyrights and jobs be damned.
| It's a cyborg moment when we extend our minds with AI and
| it feels great. By the end of the decade we'll have
| mature models for all modalities. We'll extend our minds
| in many ways, and applications will be countless. There's
| going to be a lot of work created around it.
| djmips wrote:
| For sure it automates some work. For example, my sometime
| hobby of making silly photoshops looks like it will now
| be a whole lot easier... Visual memes can just be a
| sentence now. For more serious work I wonder... But it
| does give pause about what it means for other forms of
| work.
| Eji1700 wrote:
| It'll be an interesting line to be sure.
|
| Right now the tech still requires some nuance to be able
| to slap it all together into what I think most people
| would want.
|
| While I expect the interface and the like to get a lot
| better, all good tutorials of this tech so far show many
| iterations over many different parts of an image to get
| something "cohesive". Blending those little mini
| iterations together is VASTLY easier than just making the
| whole thing, but not just plug and play for something
| professional.
|
| Still there will be a huge dent in how long it takes to
| make certain styles of work and that will lower demand
| considerably, and there's a large market of artists who
| thrive on casual commissions which this might replace.
| CuriouslyC wrote:
| The stable diffusion model just got officially released
| recently, and in the last week a lot of easy-to-install
| repositories have been forked off the main one, so it's very
| accessible for people to do this at home. Additionally, the
| model is very impressive and it's a lot of fun to use it.
| keepquestioning wrote:
| How does it compare to DALL-E?
| CuriouslyC wrote:
| Worse at image cohesion and prompt matching, but
| competitive in terms of final image quality in the better
| cases.
| hbn wrote:
| It's an impressive new technology, and there's nothing else
| out there like it in terms of the model being publicly
| available and able to be run on consumer GPUs.
| dougmwne wrote:
| First, it was recently released, so there's novelty. Second,
| the code and model weights were also released so it is open
| and extensible, which this community loves. Third, these
| high quality image generation models are mind blowing to most
| and it's not hard to imagine how transformative it will be to
| the arts and design space.
|
| If it has any greater meaning, we might all be a little
| nervous that it'll come for our jobs next, or some piece of
| them. First it came for the logo designers, but I was not a
| logo designer, and so on.
| hwers wrote:
| We could always do im2tex by just CLIP-embedding the image.
| The idea that you could hide/sell prompts is silly. (Having
| human-interpretable im2tex is cool tho.)
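The "CLIP embedding" route mentioned above amounts to nearest-neighbour retrieval in a shared image/text space. A toy sketch, with made-up vectors standing in for CLIP's encoders:

```python
import numpy as np

# Hypothetical im2tex-by-embedding: embed the image and a pool of
# candidate prompts into one space, then return the prompt whose
# embedding is most similar. The vectors below are made up; a
# real system would use CLIP's image and text encoders.

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

candidates = {
    "a photo of a cat": np.array([1.0, 0.1, 0.0]),
    "a photo of a dog": np.array([0.0, 1.0, 0.1]),
    "an oil painting":  np.array([0.1, 0.0, 1.0]),
}
image_embedding = np.array([0.9, 0.2, 0.1])  # pretend image-encoder output

best = max(candidates, key=lambda t: cosine(image_embedding, candidates[t]))
print(best)  # a photo of a cat
```

This only recovers prompts from a candidate pool, which is exactly why it is human-interpretable but not a way to protect a "secret" prompt.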
| bravura wrote:
| Is there a colab or easy to use demo of this?
| zone411 wrote:
| It requires a couple hours of actual training, so the barrier
| to entry is higher.
| frebord wrote:
| How many years until we can generate a feature length film from a
| script?
| bitwize wrote:
| I want to see the Batman film where the Joker gives Batman a
| coupon for new parents but it is expired. That should really be
| a real film in theatres.
| djmips wrote:
| You 'might' enjoy: Teen Titans fixing the timeline.
| goldenkey wrote:
| I loled.
| anigbrowl wrote:
| 5
|
| You could do storyboards from a shooting script* now, but
| generalizing to synthesizing character and camera movement as
| well as object physics is a ways off.
|
| * A version of the script used mainly by director and
| cinematographer with details of each different angle to be used
| covering the scene.
| bottlepalm wrote:
| It looks like this is trending towards making our
| dreams/thoughts reality, in that what we imagine can
| easily be turned into media - music, books, movies, etc. Pair
| this up with VR 'the metaverse' and you literally do get the
| ability to turn thoughts into personalized explorable
| realities.. what happens after that?
|
| * Do we get lost in it?
|
| * Does today's 'professional' fiction become a lot less
| lucrative when we can create our own?
|
| * Is there a way to leverage this technology to improve the
| human condition somehow?
| Loveaway wrote:
| this probably already all happened before mate
| afro88 wrote:
| I think it will encourage novel ideas in all forms of art. In
| other words, genuinely new styles and expression will be
| scarce, because there weren't thousands of examples of them to
| train a model on yet.
|
| We will also adjust to AI generated art like we have to other
| creative technologies, and the novelty will wear off. We will
| become good at identifying AI generated art and think of it
| as cheap.
|
| Still, extremely exciting.
| xdfgh1112 wrote:
| I can create and explore realities using my imagination
| alone, though. I personally don't think having it become
| actual 2d or 3d art will have a lasting impact. It might be
| fun for a while, but it will get old.
| bottlepalm wrote:
| It's kind of like your imagination on steroids: the
| system creates worlds using your imagination as the seed and
| augments it with the summation of all the human creations used
| to train the network. Give Stable Diffusion a sentence, for
| example, and it will create something way beyond what you
| could have imagined and/or created on your own.
| Eji1700 wrote:
| I suspect at least 10+ depending on your definition.
|
| Tools like this will absolutely be used by professionals to cut
| out portions of the workload, but there's still a large gap
| between something like this and actually making a coherent,
| cohesive, consistent, paced, well framed and lit story from
| text alone.
| globalvisualmem wrote:
| There is also recent work by Google called DreamBooth, though
| similar to Imagen/Parti Google refuses to release any model or
| code.
|
| https://dreambooth.github.io/
| cube2222 wrote:
| It's impressive how all of this is quickly picking up steam
| thanks to the Stable Diffusion model being open source with
| pretrained weights available. It's like every week there's
| another breakthrough or two.
|
| I think the main issue here is the computational cost, as - if I
| understand correctly - you basically have to do training for each
| concept you want to learn. Are pretrained embeddings available
| anywhere for common words?
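One reason sharing pretrained embeddings is plausible: each learned concept is just one embedding vector, so the shareable artifact is tiny even though producing it is expensive. A sketch of the artifact size, with an assumed 768-dimensional embedding (the real width depends on the model):

```python
import os
import tempfile

import numpy as np

# Hypothetical sketch: a textual-inversion "concept" is a single
# embedding vector, so the file to distribute is a few KB, even
# though learning it takes GPU-hours of training.

dim = 768  # assumed text-embedding width; model-dependent
s_star = np.random.default_rng(0).normal(size=dim).astype(np.float32)

path = os.path.join(tempfile.mkdtemp(), "my-concept.npy")
np.save(path, s_star)  # the entire shareable "concept"

print(os.path.getsize(path) < 10_000)   # True: a few KB
print(np.allclose(np.load(path), s_star))  # True
```

So a community "dictionary" would just be a repository of such small files keyed by concept name, loadable into any copy of the same base model.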
___________________________________________________________________
(page generated 2022-08-29 23:00 UTC)