[HN Gopher] TheseToonsDoNotExist: StyleGAN2-ADA trained on CG cartoon faces
___________________________________________________________________
TheseToonsDoNotExist: StyleGAN2-ADA trained on CG cartoon faces
Author : codetrotter
Score : 107 points
Date : 2021-09-25 14:33 UTC (8 hours ago)
(HTM) web link (www.thesetoonsdonotexist.com)
(TXT) w3m dump (www.thesetoonsdonotexist.com)
| andreygrehov wrote:
| Another way to achieve a similar result is to load a face from
| ThisPersonDoesNotExist.com [1] and then pipe it through
| Toonify.Photos [2]. Give it a label from ThisWordDoesNotExist.com
| [3] and there you go - you've got a character :)
|
| Edit: wire it up with Stripe and sell characters to Pixar? Ha!
|
| [1] https://news.ycombinator.com/item?id=19144280
|
| [2] https://news.ycombinator.com/item?id=24494377
|
| [3] https://news.ycombinator.com/item?id=23169962
| hn_throwaway_99 wrote:
| Most of the toons I got generally look like normal characters but
| with a mild genetic defect: a lazy eye, a potentially concerning
| lump on a cheek, etc.
|
| Edit: actually, yeah, after looking at more examples nearly every
| one has some amount of cross-eye/focus disorder where both eyes
| aren't pointing in the same direction.
| techrat wrote:
| Like with thispersondoesnotexist, the toons also tend to have
| severe problems with the ears. Nearly every one of them has an
| ear that blends into the background or is wildly mismatched in
| size with the other.
| arketyp wrote:
| StyleGAN interpolations of the latent space are mind-blowing, but
| they are fundamentally superficial, which you can at times tell
| and more often get bored of. Instead, I would like to see a
| transformer network trained on ontogenetic/phylogenetic
| development/evolution material, which could then generate new
| creatures. The representations could be abstract, identified by
| key topological properties for instance, which could then be
| used for arbitrary artistic renderings. Of course, in the end,
| perhaps the truest abstract representation would be genes and
| proteins.
| high_byte wrote:
| lol. good idea but genes only get you so far. environment is
| where you get those proteins, so that means simulating the
| world. ;)
|
| just to get lots of retarded-baby monkey-fish-frogs.
| qayxc wrote:
| Unfortunately mainstream ML research is obsessed with end-to-
| end solutions, so a lot of interesting ideas fall by the
| wayside.
|
| A rule-based system combined with a Transformer and CV-based
| postprocessing to filter the most plausible and interesting
| results would be awesome.
| backspace_ wrote:
| It's like they took Disney, Pixar, and DreamWorks and just mashed
| them all together.
| dqpb wrote:
| This would be much more useful if it generated a model.
| ravenstine wrote:
| I don't see why that would be particularly difficult to
| accomplish. A dataset made up of 3D assets would actually give
| an algorithm more information to work with. The question is
| whether it could generate a _usable_ model and not an Eldritch
| horror of disconnected triangles and non-manifold geometry.
|
| The easiest win would probably be to have an algorithm pick
| between predetermined types of assets (heads, appendages,
| clothes, etc.), reshape them without actually adding new
| geometry, and then do essentially what the linked page does
| with skins and shaders.
| barbecue_sauce wrote:
| At that point, you might as well just make a shape-key based
| character creator.
| codegladiator wrote:
| pixar
| stared wrote:
| I would love to see character names and synopses (e.g. with GPT)
| - see https://www.thiswaifudoesnotexist.net/.
| aasasd wrote:
| Yeah, the 'DreamWorks face' is strong with this one:
| https://filmschoolrejects.com/wp-content/uploads/2017/03/dre...
|
| Though here it comes out more as a kind of confused smirk. Which
| is of course also ubiquitous in cartoons for some reason.
| hi5dev wrote:
| Wow, the app powering this requires at least one high-end NVIDIA
| GPU with 12GB of GPU memory, with 8 GPUs recommended. All to
| generate some cartoon faces using AI. At least for that website.
| Dang.
| gwern wrote:
| I think that's just the training requirements. For simple TXDNE
| sites, you don't need so much as a single GPU as you can
| pregenerate them all.
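|
| Something along these lines is all the serving path needs (an
| untested sketch, assuming the stylegan2-ada-pytorch README API
| and a trained network pickle; paths and counts are
| placeholders):
|
|     # Pregenerate faces once on a GPU box, then serve the
|     # static PNGs from any cheap host - no GPU at runtime.
|     import os
|     import pickle
|     import torch
|     import PIL.Image
|
|     with open("network-snapshot.pkl", "rb") as f:
|         G = pickle.load(f)["G_ema"].cuda()  # averaged generator
|
|     os.makedirs("toons", exist_ok=True)
|     torch.set_grad_enabled(False)  # inference only
|     for i in range(10000):
|         z = torch.randn([1, G.z_dim], device="cuda")
|         img = G(z, None)  # NCHW float32 in [-1, +1]
|         img = ((img.clamp(-1, 1) + 1) * 127.5).to(torch.uint8)
|         img = img[0].permute(1, 2, 0).cpu().numpy()
|         PIL.Image.fromarray(img).save(f"toons/{i:05d}.png")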
| hi5dev wrote:
| Makes sense.
|
| I was going to set it up to mess around with until I saw the
| requirements.
| [deleted]
| diskzero wrote:
| What would be really interesting, and possibly impossible to do,
| would be to get all the character designs that were rejected
| during the development process at the various studios. As has
| been pointed out in other comments, the GAN is generating some
| pretty boring variations of faces from source material that has
| been focus grouped and designed by committee to death before
| being released to the public.
|
| I have seen some really cool and wacky designs in the hallways of
| DreamWorks, Blue Sky, Pixar, etc. during my time in the industry.
| I would love to get all of those designs into a training set as
| well.
| aaaaaaaaaaab wrote:
| Yuck... So this Pixar 3D crap is what's called "cartoons"
| nowadays.
| peter-m80 wrote:
| Is Pixar 3D crap?
| aaaaaaaaaaab wrote:
| Compared to real hand-drawn cartoons? Yes. Soulless, uniform,
| factory-produced crap.
| krapp wrote:
| >factory-produced crap.
|
| Wait until you find out how hand-drawn animation is made.
| vlunkr wrote:
| A quick Google shows that hand drawn animation has been
| outsourced since the 60s.
| krapp wrote:
| Yes. Mostly to high-volume animation sweatshops, aka
| "soulless, uniform factory production."
| steve_adams_86 wrote:
| From a technical and artistic perspective, I can't think of
| much to criticize. I think of us as fortunate to have such
| incredible accomplishments in storytelling and
| presentation.
|
| There are modern 3D movies I particularly dislike and many
| I feel lukewarm about. I just don't agree that the medium is
| innately lacking.
| ChrisClark wrote:
| Now get off my lawn!
| codetrotter wrote:
| Did someone say factory-produced? I love the classic animations
| too, but you are aware of this, right?:
| https://youtu.be/hjmaOj3_sKk
|
| :^)
| mongol wrote:
| Looks like it was mainly Robin Hood that was taking the
| cheap way out. And The Aristocats? I think that was during a
| low point of Disney animation.
| codetrotter wrote:
| Here's a comprehensive overview, including both the
| full-length movies that you saw in that YouTube video and a
| lot of shorter animations that were reused.
|
| https://disney.fandom.com/wiki/List_of_recycled_animation_in...
|
| Definitely something they did quite a bit more than just
| a couple of times.
|
| Not saying there's anything "wrong" with that per se btw.
| Just found it relevant to the discussion about comparing
| modern animation to factory output.
| colordrops wrote:
| Yeah, it's all the same look. Not much experimentation or
| creativity at all when it comes to character design, at least
| for human characters.
| ergot_vacation wrote:
| Well, really "Pixar." Pixar at least knew how leverage their
| weird style; everyone else ran with it and made it god awful.
| But my thought exactly. Why even do this with something that's
| so offensively awful to look at?
|
| That gripe aside, if you're just training on a bunch of
| headshots and generating new ones, it's been done, over and
| over at this point. Want to impress? Figure out how to generate
| a full sequence of coherent animation frames.
| joshspankit wrote:
| "Do all current movie 'toons' have one eyebrow up?"
|
| I was suddenly struck by this question, and think there might be
| something to it. Clearly it was a standard feature of the
| training set.
| slavik81 wrote:
| It seems to be generating a lot of characters with "DreamWorks
| Face".
| https://tvtropes.org/pmwiki/pmwiki.php/Main/DreamworksFace
| joshspankit wrote:
| Solid reference. Thank you
| tomtimtall wrote:
| This displays much more clearly the problem with all these "this
| X does not exist" sites: even though they don't show the source
| material, it's abundantly clear that these are just simple
| stitches of the training toons.
|
| Like, you get Elsa with different hair, or the Up grandpa in a
| suit.
|
| Once you compare them to the closest examples from the training
| data, it becomes a lot less impressive than the implied "this
| face came entirely out of the imagination of an AI model". Turns
| out the model just imagined someone in the training set with the
| hair of someone else. Quite boring.
| the8472 wrote:
| > it's abundantly clear that these are just simple stitches of
| the training toons.
|
| That's not how GANs work in general. With a limited training
| set the end result may appear _as if_ they were stitched-
| together samples, but that's not what happens under the hood.
|
| Interpolation videos make it obvious that it encodes visual
| concepts, can freely manipulate them and even crank up
| parameters beyond anything found in the training set, thus
| giving exaggerated results.
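|
| A minimal version of the interpolation trick (untested, again
| assuming the stylegan2-ada-pytorch README API):
|
|     # Walk the latent space between two random codes; every
|     # intermediate point still decodes to a coherent face,
|     # which is hard to square with "stitched training images".
|     import pickle
|     import torch
|
|     with open("network-snapshot.pkl", "rb") as f:
|         G = pickle.load(f)["G_ema"].cuda()
|
|     z0 = torch.randn([1, G.z_dim], device="cuda")
|     z1 = torch.randn([1, G.z_dim], device="cuda")
|
|     frames = []
|     with torch.no_grad():
|         for t in torch.linspace(0, 1, steps=60):
|             z = (1 - t) * z0 + t * z1  # lerp in z-space
|             frames.append(G(z, None))  # one frame per step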
| exit wrote:
| I think the results shown in this paper contradict your
| assertion:
|
| https://openaccess.thecvf.com/content_ICCV_2019/papers/Abdal...
|
| Given an arbitrary face, we can find its embedding in the
| latent space of the model. This suggests the model has the
| potential to generalise to real but unseen examples.
|
| On the other hand, I suspect you might be observing a bias in
| the structuring of the latent space.
|
| thispersondoesnotexist.com likely samples the latent space with
| a Gaussian or uniform distribution, and while the latent space
| may contain the full spectrum of possibilities, the density of
| semantically meaningful embeddings may be concentrated around
| the distribution of the training set rather than being uniform
| or Gaussian.
|
| I'm stretching my understanding of the topic in trying to
| convey this.
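|
| The embedding procedure boils down to roughly this (a rough,
| untested sketch: the paper optimises in the extended W+ space
| with a perceptual VGG loss, which I've simplified to plain MSE,
| and "target" stands in for a real photo scaled to the
| generator's resolution):
|
|     import pickle
|     import torch
|     import torch.nn.functional as F
|
|     with open("network-snapshot.pkl", "rb") as f:
|         G = pickle.load(f)["G_ema"].cuda()
|
|     res = G.img_resolution
|     target = torch.rand([1, 3, res, res], device="cuda") * 2 - 1
|
|     # Start from the w of a random z, then gradient-descend w
|     # until the synthesis network reproduces the target image.
|     z = torch.randn([1, G.z_dim], device="cuda")
|     w = G.mapping(z, None).detach().requires_grad_(True)
|     opt = torch.optim.Adam([w], lr=0.01)
|
|     for step in range(500):
|         loss = F.mse_loss(G.synthesis(w), target)
|         opt.zero_grad()
|         loss.backward()
|         opt.step()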
| gfodor wrote:
| How do you know this is the case for all of them?
| qayxc wrote:
| A good indicator is that when I clicked on the "more..."
| button, I _instantly_ recognised copied features (eyes, face
| shape, nose, hair) from The Incredibles and Frozen in 3 out
| of the 4 samples, just mashed together.
|
| This shouldn't be the case unless you start actively looking
| for it.
|
| It's just much easier to recognise with these cartoon
| characters than with realistic faces, as there's naturally
| much less variety in the training material, and the features
| are simplified to the point of being easily recognisable.
| version_five wrote:
| As others have said, that's not the way a GAN should work.
| Regurgitating the training set is basically a failure mode that
| is actively avoided when the models are built and trained.
|
| Looking at these images, and not being familiar with how the
| underlying CG training set is made, I wonder if the original
| series itself has some comparatively small set of latent
| features - dimensions you could adjust when drawing the faces -
| that the model is just learning, so that newly generated faces
| are effectively the same thing as if one had changed whatever
| setting you tweak when working with the underlying tool.
| Tenoke wrote:
| I see what you mean, but this is definitely not universal to
| StyleGAN; it depends on factors such as the size of the training
| set (I'm guessing it was smaller here) and training parameters.
| godelski wrote:
| Honestly this seems to be common for GANs in general, though
| I don't think most people have looked through CelebA. But if
| you are lazier, you can scroll through thispersondoesnotexist
| and you'll find essentially celebrities with the characteristics
| the OP is describing. What's more, you actually see better
| quality images the closer to a celebrity they look (you see the
| same thing in the toon version here). I do think ADA is
| typically worse than vanilla StyleGAN2, but that's the tradeoff
| you get with a smaller sample size (worse because people train
| it on smaller datasets, so more memorization).
| Tenoke wrote:
| I believe thispersondoesnotexist is also trained on FFHQ,
| not just CelebA, though.
| Pxtl wrote:
| It's really easy to spot the training references in these - lots
| of Wreck-It Ralph and Big Hero 6 characters with slightly
| reshaped faces and hair.
|
| Edit: whenever it gets inspired by Mr Incredible the result ends
| up looking like Conan O'Brien.
| sam-2727 wrote:
| This is interesting -- I can almost recognize the various Pixar
| characters these are influenced by. For instance, one has the
| distinct jaw of the old man from "Up".
| nhinck wrote:
| I always did wonder how much StyleGAN was compositing existing
| features rather than generating wholly distinct features.
|
| With real human faces it's almost impossible to tell but with
| these you can definitely pick a character per feature.
| mysterydip wrote:
| Raises the question: at what point does ownership change? Can I
| trace Batman but change his hair color? What about just his
| face? Or just his mouth? Can I cut and paste a bunch of
| superheroes together and claim it as my own? Etc.
| zokier wrote:
| I guess that's the big debate around GitHub Copilot and open
| source code, especially GPL (but also others).
| high_byte wrote:
| I think that would depend on the license of each image in the
| dataset that influenced the output image. Maybe not, but it
| would make sense for it to work that way.
| routerl wrote:
| This will certainly be the crux of intellectual property
| lawsuits in the future.
|
| It's striking that some of these examples have distinct
| features from specific, identifiable datasets: we can
| occasionally recognize specific characters (the old man from
| Pixar's UP is getting mentioned a lot), but it also
| reproduces more general aesthetic patterns. Even when I can't
| recognize the source data, I can distinctly see in some of
| these faces "the Pixar look", and in others "the DreamWorks
| look".
|
| Were I an IP lawyer, I would start thinking of arguments
| along the lines of "this technology simply obfuscates the
| source of plagiarisms". I would also start to think about
| trying to force anyone who uses this technology to disclose
| the sources of their training data, since a model trained
| largely on "the Pixar look" could be benefiting from Pixar's
| character design processes without having to hire any of
| Pixar's artists.
|
| And, if I were philosophically inclined, I would also start
| thinking about how this is any different from hiring a random
| artist and instructing them to "design characters that look
| like Pixar characters".
|
| I suspect that one key difference is that the human artist's
| success can't easily be measured, but the GAN's success can
| very easily be measured.
| amelius wrote:
| At some point big producers like Pixar will probably use
| something like StyleGAN to extend their copyright coverage.
| E.g. generate as many variations as they can, which then all
| fall under their own copyright.
|
| So in the end this technology might not be as "liberating" as
| people think it is.
| godelski wrote:
| Wouldn't it make more sense to use a density-based model
| and then describe some hull centered around your original
| creation?
| echelon wrote:
| Not if a tech firm gets there first. Imagine the productive
| and legal power being in the algorithm.
|
| In a way, this is a much better setup for artists and
| creatives. There isn't some giant licensing firm
| controlling your work. You simply buy or rent the best
| tools to make your work.
|
| That said, it'll only be good for creatives and consumers
| if there is sufficient competition. And open source
| equivalents that still enable creation.
| itronitron wrote:
| The rights surrounding caricatures may provide some insight
| here. I know some celebrities are particularly vigilant about
| keeping unapproved photos of their faces out of circulation
| but they probably wouldn't have the same success with a hand-
| drawn likeness or caricature.
| [deleted]
| ghoomketu wrote:
| Really interesting. Speaking as a total noob, if somebody wanted
| to make a thispersondoesnotexist type of program, e.g. this site:
|
| 1) Approximately how much would it cost to rent servers to train
| such models?
|
| 2) Can this be done on a home computer running a $1k NVIDIA card
| in a reasonable time?
|
| 3) Can I use free tools like Google Colab for this purpose?
|
| I've always been interested in learning more about this field but
| haven't really bothered because I feel it would cost an arm and a
| leg just to experiment. Can somebody please shed some light on
| this?
| Tenoke wrote:
| For yourself you can use Colab. For serving it as a site, a
| single $1k GPU is fine, but it depends on traffic.
| qayxc wrote:
| 1) That depends entirely on the model in question (size,
| complexity) and the amount of training material.
|
| Using a standard dataset like CelebA [1] and an "HQ" model
| (512x512) like StyleGAN2, training requires at least 1 GPU with
| 12GiB of VRAM and takes about a week on a single V100.
|
| Depending on your provider of choice, this will cost anywhere
| from ~$514 (AWS), ~$420 (Google) to $210 (Lambda Labs, RTX 6000
| - should be in the same ballpark).
|
| If your training process is interruptible and can be resumed at
| any time (most training scripts support this), costs will drop
| _dramatically_ for AWS and Google (think $50 to $200).
|
| 2) Yes. A used ~$200 Tesla K80 will do. Alternatively any
| NVIDIA card with at least 8 GiB of VRAM is capable of doing the
| job, but lower batch sizes and increased training time are to
| be expected. If you can use a dedicated machine with an RTX
| 3060 or a brand new A4000 (if you're willing to pay the
| premium), close to a week of training time can be achieved.
|
| 3) Yes*
|
| *your work will be freely available to everyone and your
| training process is limited to 12h or so per day.
|
| All in all I wouldn't recommend training a StyleGAN model from
| scratch anyway. Finetuning a pretrained model using your own
| dataset can be done much more quickly (think hours to a day or
| two) and on consumer-level hardware (I train my models on an
| old desktop with a GTX 1070).
|
| [1] http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
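|
| For reference, those cloud figures are just ~one week of
| single-GPU hours at what I believe were the on-demand list
| prices at the time (treat the hourly rates as assumptions):
|
|     # Back-of-envelope cost check for ~1 week of training
|     hours = 7 * 24
|     rates = {"AWS p3.2xlarge (V100)": 3.06,
|              "Google Cloud (V100)": 2.48,
|              "Lambda Labs (RTX 6000)": 1.25}  # USD/hour
|     for provider, rate in rates.items():
|         print(f"{provider}: ~${hours * rate:.0f}")
|     # -> ~$514, ~$417, ~$210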
| codetrotter wrote:
| > Finetuning a pretrained model using your own dataset can be
| done much more quickly (think hours to a day or two) and on
| consumer-level hardware (I train my models on an old desktop
| with a GTX 1070).
|
| This is interesting! Do you have some links about doing that?
|
| My desktop computer has a GTX 1060 with 6 GB of VRAM. But
| hopefully I can use it for something like this.
|
| I've only used Google Colab in the past, and only tried stuff
| with prompting existing models.
|
| Would love to experiment a bit with fine-tuning models on my
| own datasets to get some kind of unique stuff.
| godelski wrote:
| It is the same transfer learning you would do with any
| model. StyleGAN (and StyleGAN2-ADA) provide pretrained
| weights for you. Just start there and train on the new
| dataset. The ADA github even has tools to format your
| dataset correctly.
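|
| Roughly like this, using the repo's own scripts (an untested
| sketch; the flags and the "ffhq256" resume alias are from my
| memory of the stylegan2-ada-pytorch README, so double-check
| them):
|
|     # Pack images into the expected zip, then resume training
|     # from pretrained FFHQ weights instead of from scratch.
|     import subprocess
|
|     subprocess.run(["python", "dataset_tool.py",
|                     "--source=./my_images",
|                     "--dest=./datasets/my_faces.zip",
|                     "--width=256", "--height=256"], check=True)
|
|     subprocess.run(["python", "train.py",
|                     "--outdir=./training-runs",
|                     "--data=./datasets/my_faces.zip",
|                     "--gpus=1", "--resume=ffhq256",
|                     "--kimg=1000"], check=True)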
| [deleted]
| 01100011 wrote:
| Now train it to generate cartoons similar to a human face or a
| dog and take in the $$$.
| kortex wrote:
| Pixarize-my-pets? Sign me up!
| corysama wrote:
| https://linktr.ee/voilaaiartist The mobile app does a good job
| with people. I'd be surprised if it works on dogs.
| kortex wrote:
| I like how these faces exhibit many of the usual face-GAN quirks:
| asymmetric ears, eyes with slight nystagmus, a background that's
| some abstract blurry surreal mosaic. At least they get the hair
| pretty darn right? Probably because the hair is more regular to
| begin with.
| peanut_worm wrote:
| Most of these look like warped versions of existing characters.
| Maybe the dataset is too small or something.
| qayxc wrote:
| > Maybe the dataset is too small or something.
|
| Yeah, I think that might be the problem here. While there's
| plenty of material for real human faces, there are only so many
| high-quality 3D cartoon characters.
| ogurechny wrote:
| Way too boring, and has, like, 5 reference faces. I expected a
| crazy mix of all possible cartoon styles similar to
| thisanimedoesnotexist.
| gwern wrote:
| TADNE is trained on n ~2.8m source images (augmented to ~4m),
| covering k ~ tens of thousands of different characters. OP is
| trained on... I'm not sure because the site is devoid of any
| information, but I would guess it's closer to 10k images from a
| few hundred characters at most. So the diversity will be
| drastically reduced, although the use of -ADA should mean that
| it doesn't overfit as catastrophically as one would expect from
| previous GANs to such small n/k.
___________________________________________________________________
(page generated 2021-09-25 23:01 UTC)