Invasive Diffusion: How one unwilling illustrator found herself turned into an AI model

Posted November 1, 2022 by Andy Baio

Last weekend, Hollie Mengert woke up to an email pointing her to a Reddit thread, the first of several messages from friends and fans, informing the Los Angeles-based illustrator and character designer that she was now an AI model.

The day before, a Redditor named MysteryInc152 posted on the Stable Diffusion subreddit, "2D illustration Styles are scarce on Stable Diffusion, so I created a DreamBooth model inspired by Hollie Mengert's work."

Using 32 of her illustrations, MysteryInc152 fine-tuned Stable Diffusion to recreate Hollie Mengert's style. He then released the checkpoint under an open license for anyone to use. The model uses her name as the identifier for prompts: "illustration of a princess in the forest, holliemengert artstyle," for example.

[Image: Artwork by Hollie Mengert (left) vs. images generated with Stable Diffusion DreamBooth in her style (right)]

The post sparked a debate in the comments about the ethics of fine-tuning an AI on the work of a specific living artist, even as new fine-tuned models are posted daily. The most-upvoted comment asked, "Whether it's legal or not, how do you think this artist feels now that thousands of people can now copy her style of works almost exactly?"

Great question! How did Hollie Mengert feel about her art being used in this way, and what did MysteryInc152 think about the explosive reaction to it? I spoke to both of them to find out -- but first, I wanted to understand more about how DreamBooth is changing generative image AI.

---------------------------------------------------------------------

Since its release in late August, I've written about the explosive creativity and complex ethical and legal debates unleashed by the open-source release of Stable Diffusion, explored the billions of images it was trained on, and talked about the data laundering that shields corporations like Stability AI from accountability.

By now, we've all heard stories of artists who have unwillingly found their work used to train generative AI models, the frustration of being turned into a popular prompt people use to mimic you, or how Stable Diffusion was being used to generate pornographic images of celebrities.

But since its release, Stable Diffusion could really only depict the artists, celebrities, and other notable people who were popular enough to be well-represented in the model's training data. Simply put, a diffusion model can't generate images with subjects and styles that it hasn't seen very much.

---------------------------------------------------------------------

When Stable Diffusion was first released, I tried to generate images of myself, but even though there are a bunch of photos of me online, there weren't enough for the model to understand what I looked like.

[Image: Real photos of me (left) vs. Stable Diffusion output for the prompt "portrait of andy baio" (right)]

That's true of even some famous actors and characters: while it can make a spot-on Mickey Mouse or Charlize Theron, it really struggles with Garfield and Danny DeVito. It knows that Garfield is an orange cartoon cat, and it knows Danny DeVito's general features and body shape, but not well enough to recognizably render either of them.
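For a sense of the mechanics: generating images from a prompt like that takes only a few lines of code. Here's a minimal sketch using Hugging Face's diffusers library -- the model ID and prompt are illustrative, not necessarily the exact setup I used:

import torch
from diffusers import StableDiffusionPipeline

# Load a Stable Diffusion checkpoint in diffusers format.
# This is the base model, with no fine-tuning applied.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

# With the base model, a prompt like this only works if the subject was
# well-represented in the training data -- hence my mangled portraits.
image = pipe(
    "portrait of andy baio",
    num_inference_steps=50,
    guidance_scale=7.5,
).images[0]
image.save("portrait.png")

A DreamBooth checkpoint converted to the same format loads identically; the only difference is pointing from_pretrained at the fine-tuned weights and including the rare identifier token (like "holliemengert artstyle") in the prompt.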
On August 26, Google AI announced DreamBooth, a technique for introducing new subjects to a pretrained text-to-image diffusion model, training it with as few as 3-5 images of a person, object, or style.

"Today, along with my collaborators at @GoogleAI, we announce DreamBooth! It allows a user to generate a subject of choice (pet, object, etc.) in myriad contexts and with text-guided semantic variations! The options are endless." -- Nataniel Ruiz (@natanielruizg), August 26, 2022

Google's researchers didn't release any code, citing the potential "societal impact" risk that "malicious parties might try to use such images to mislead viewers." Nonetheless, 11 days later, an AWS AI engineer released the first public implementation of DreamBooth using Stable Diffusion, open-source and available to everyone. Since then, there have been several dramatic optimizations in speed, usability, and memory requirements, making it quick, easy, and cheap to fine-tune the model on new subjects.

---------------------------------------------------------------------

Yesterday, I used a simple YouTube tutorial and a popular Google Colab notebook to fine-tune Stable Diffusion on 30 cropped 512x512 photos of me. (A pared-down sketch of what those notebooks automate appears later in this piece.) The entire process, start to finish, took about 20 minutes and cost me about $0.40. (You can do it for free, but it takes 2-3 times as long, so I paid for a faster Colab Pro GPU.)

The result felt like I opened a door to the multiverse, like remaking that scene from Everything Everywhere All at Once, but with me instead of Michelle Yeoh.

[Image: Sample generations of me as a viking, anime, stained glass, vaporwave, Pixar character, Dali/Magritte painting, Greek statue, muppet, and Captain America]

Frankly, it was shocking how little effort it took, how cheap it was, and how immediately fun the results were to play with. Unsurprisingly, a bunch of startups have popped up to make it even easier to DreamBooth yourself, including Astria, Avatar AI, and ProfilePicture.ai.

But, of course, there's nothing stopping you from using DreamBooth on someone, or something, else.

---------------------------------------------------------------------

I talked to Hollie Mengert about her experience last week. "My initial reaction was that it felt invasive that my name was on this tool, I didn't know anything about it and wasn't asked about it," she said. "If I had been asked if they could do this, I wouldn't have said yes."

She couldn't have granted permission to use all the images, even if she wanted to. "I noticed a lot of images that were fed to the AI were things that I did for clients like Disney and Penguin Random House. They paid me to make those images for them and they now own those images. I never post those images without their permission, and nobody else should be able to use them without their permission either. So even if he had asked me and said, can I use these? I couldn't have told him yes to those."

She had concerns that the fine-tuned model was associated with her name, in part because it didn't really represent what makes her work unique. "What I pride myself on as an artist are authentic expressions, appealing design, and relatable characters. And I feel like that is something that I see AI, in general, struggle with most of all," Hollie said.
[Image: Four of Hollie's illustrations used to train the AI model (left) and sample AI output (right)]

"I feel like AI can kind of mimic brush textures and rendering, and pick up on some colors and shapes, but that's not necessarily what makes you really hireable as an illustrator or designer. If you think about it, the rendering, brushstrokes, and colors are the most surface-level area of art. I think what people will ultimately connect to in art is a lovable, relatable character. And I'm seeing AI struggling with that."

"As far as the characters, I didn't see myself in it. I didn't personally see the AI making decisions that I would make, so I did feel distance from the results. Some of that frustrated me because it feels like it isn't actually mimicking my style, and yet my name is still part of the tool."

She wondered if the model's creator simply didn't think of her as a person. "I kind of feel like when they created the tool, they were thinking of me as more of a brand or something, rather than a person who worked on their art and tried to hone things, and that certain things that I illustrate are a reflection of my life and experiences that I've had. Because I don't think if a person was thinking about it that way that they would have done it. I think it's much easier to just convince yourself that you're training it to be like an art style, but there's like a person behind that art style."

"For me, personally, it feels like someone's taking work that I've done, you know, things that I've learned -- I've been a working artist since I graduated art school in 2011 -- and is using it to create art that I didn't consent to and didn't give permission for," she said. "I think the biggest thing for me is just that my name is attached to it. Because it's one thing to be like, this is a stylized image creator. Then if people make something weird with it, something that doesn't look like me, then I have some distance from it. But to have my name on it is ultimately very uncomfortable and invasive for me."

---------------------------------------------------------------------

I reached out to MysteryInc152 on Reddit to see if they'd be willing to talk about their work, and we set up a call. MysteryInc152 is Ogbogu Kalu, a young Nigerian engineer living and working in Halifax, Canada. Ogbogu is a fan of fantasy novels and football, comics and animation, and now, generative AI.

His initial hope was to make a series of comic books, but he knew that doing it on his own would take years, even if he had the writing and drawing skills. When he first discovered Midjourney, he got excited, realizing it could work well for his project -- and then Stable Diffusion dropped. Unlike Midjourney, Stable Diffusion was entirely free, open-source, and supported powerful creative tools like img2img, inpainting, and outpainting. It was nearly perfect, but achieving a consistent 2D comic book style was still a struggle. He first tried hypernetwork style training, without much success, but DreamBooth finally gave him the results he was looking for.

Before publishing his model, Ogbogu wasn't familiar with Hollie Mengert's work at all. He was helping another Stable Diffusion user on Reddit who was struggling to fine-tune a model on Hollie's work and getting lackluster results. He refined the image training set, got to work, and published the results the following day. He told me the training process took about 2.5 hours on a GPU at Vast.ai, and cost less than $2.
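For the technically curious, here's the pared-down sketch promised above of what these DreamBooth training notebooks and scripts do under the hood, loosely following Hugging Face's diffusers example script. It omits prior preservation and other refinements, and the identifier token, image folder, and hyperparameters are placeholders:

from pathlib import Path

import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import transforms
from transformers import CLIPTextModel, CLIPTokenizer
from diffusers import AutoencoderKL, DDPMScheduler, UNet2DConditionModel

model_id = "runwayml/stable-diffusion-v1-5"
instance_prompt = "illustration in holliemengert artstyle"  # rare identifier token
device = "cuda"

tokenizer = CLIPTokenizer.from_pretrained(model_id, subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained(model_id, subfolder="text_encoder").to(device)
vae = AutoencoderKL.from_pretrained(model_id, subfolder="vae").to(device)
unet = UNet2DConditionModel.from_pretrained(model_id, subfolder="unet").to(device)
noise_scheduler = DDPMScheduler.from_pretrained(model_id, subfolder="scheduler")

# Only the UNet is fine-tuned; the VAE and text encoder stay frozen.
vae.requires_grad_(False)
text_encoder.requires_grad_(False)
optimizer = torch.optim.AdamW(unet.parameters(), lr=5e-6)

preprocess = transforms.Compose([
    transforms.Resize(512),
    transforms.CenterCrop(512),
    transforms.ToTensor(),
    transforms.Normalize([0.5], [0.5]),
])
images = [preprocess(Image.open(p).convert("RGB"))
          for p in sorted(Path("./training_images").glob("*.png"))]

# The prompt embedding is fixed, so compute it once up front.
ids = tokenizer(instance_prompt, padding="max_length",
                max_length=tokenizer.model_max_length,
                truncation=True, return_tensors="pt").input_ids.to(device)
with torch.no_grad():
    text_embeddings = text_encoder(ids)[0]

unet.train()
for step in range(800):
    pixels = images[step % len(images)].unsqueeze(0).to(device)
    # Encode the image into latent space (0.18215 is SD's latent scaling factor).
    latents = vae.encode(pixels).latent_dist.sample() * 0.18215
    # Add noise at a random timestep, then train the UNet to predict that noise.
    noise = torch.randn_like(latents)
    t = torch.randint(0, noise_scheduler.config.num_train_timesteps,
                      (1,), device=device)
    noisy_latents = noise_scheduler.add_noise(latents, noise, t)
    noise_pred = unet(noisy_latents, t, text_embeddings).sample
    loss = F.mse_loss(noise_pred, noise)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

The full scripts add prior preservation (training against generic "class" images so the model doesn't forget what illustrations in general look like), mixed precision, and gradient checkpointing to fit consumer GPUs, but the core really is this small -- which is why a couple dollars of rented GPU time is enough.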
Reading the Reddit thread, his stance on the ethics seemed to border on fatalism: the technology is inevitable, everyone using it is equally culpable, and any moral line is completely arbitrary. In the thread, he debated with those pointing out a difference between using Stable Diffusion as-is and fine-tuning an AI on a single living artist:

"There is no argument based on morality. That's just an arbitrary line drawn on the sand. I don't really care if you think this is right or wrong. You either use Stable Diffusion and contribute to the destruction of the current industry or you don't. People who think they can use [Stable Diffusion] but are the 'good guys' because of some funny imaginary line they've drawn are deceiving themselves. There is no functional difference."

On our call, I asked him what he thought about the debate. His take was very practical: he thinks it's legal to train and use, likely to be determined fair use in court, and you can't copyright a style. Even though you can recreate subjects and styles with high fidelity, the original images themselves aren't stored in the Stable Diffusion model: over 100 terabytes of images were used to create a model of only about 4 GB, a roughly 25,000-to-1 ratio. He also thinks it's inevitable: Adobe is adding generative AI tools to Photoshop, and Microsoft is adding an image generator to its design suite. "The technology is here, like we've seen countless times throughout history."

Toward the end of our conversation, I asked, "If it's fair use, it doesn't really matter in the eye of the law what the artist thinks. But do you think, having done this yourself and released a model, if they don't find it flattering, should the artist have any say in how their work is used?"

He paused for a few seconds. "Yeah, that's... that's a different... I guess it all depends. This case is rather different in the sense that it directly uses the work of the artists themselves to replace them."

Ultimately, he thinks many of the objections to it are a misunderstanding of how it works: it's not a form of collage, it's creating new images and clearly transformative, more like "trying to recall a vivid memory from your past."

"I personally think it's transformative," he concluded. "If it is, then I guess artists won't really have a say in how these models get written or not."

---------------------------------------------------------------------

As I was playing around with the model trained on myself, I started thinking about how cheap and easy it was to make. In the short term, we're going to see fine-tuned models for anything you can imagine: there are nearly 100 models in the Concepts Library on Hugging Face so far, and trending in the last week alone on Reddit, models based on classic Disney animated films, modern Disney animated films, Tron: Legacy, Cyberpunk: Edgerunners, K-pop singers, and Kurzgesagt videos.

[Image: Images generated using the "Classic Animation" DreamBooth model trained on Disney animated films]

Aside from the IP issues, it's absolutely going to be used by bad actors: models fine-tuned on images of exes, co-workers, and, of course, popular targets of online harassment campaigns. Combining those with any of the emerging NSFW models trained on large corpuses of porn is a disturbing inevitability.

It's a complicated issue. DreamBooth, like most generative AI, has incredible creative potential, as well as the potential for harm. Missing in most of these conversations is any discussion of consent: are you treating people the way you would want to be treated?
---------------------------------------------------------------------

The day after we spoke, Ogbogu Kalu reached out to me through Reddit to see how things went with Hollie. I said she wasn't happy about it, that it felt invasive and she had concerns about it being associated with her name. If asked for permission, she would have said no, but she also didn't own the rights to several of the images and couldn't have given permission even if she wanted to.

"I figured. That's fair enough," he responded. "I did think about using her name as a token or not, but I figured since it was a single artist, that would be best. Didn't want it to seem like I was training on an artist and obscuring their influence, if that makes sense. Can't change that now unfortunately but I can make it clear she's not involved."

Two minutes later, he renamed the Hugging Face model from hollie-mengert-artstyle to the more generic Illustration-Diffusion, and added a line to the README: "Hollie is not affiliated with this."

Two days later, he released a new model trained on 40 images by concept and comic book artist James Daly III.

[Image: Art by James Daly III (left) vs. images generated by Stable Diffusion fine-tuned on his work]

Comments

Adpah says: November 1, 2022 at 12:30pm

The "It's not a collage" argument:

"[...] he thinks many of the objections to it are a misunderstanding of how it works: it's not a form of collage, it's creating new images and clearly transformative, more like 'trying to recall a vivid memory from your past.'"

I am so, so very tired of these cheap and lazy "arguments" that try to build their alleged "checkmate" moment on the idea of "people just don't get AI. It's not a collage. The data is not stored. It's parameters." -- misleading people who are less tech-savvy into believing them, when this point does nothing to actually support their position.

The point is: It doesn't matter. Whether or not AI patches actual pixels from actual images together into a collage has no bearing on whether using other people's data as training and validation data to build an AI model is legal, at least from a commercial perspective.

It's true, AI does not store the images. It analyzes them, derives patterns, rules, relationships etc., and stores its analysis results as abstract mathematical parameters. That does not change the fact that you would not be able to build your tool without the data. It is completely irrelevant that the data is discarded. It is still a crucial component in your process of creating your software. Without it, it would literally not be possible. This means you are dependent on the data to build your tool. But if you depend on the property of other people, then you need to get their permission to use it for commercial purposes and reimburse them if they demand it.

It's not "transformative". It is not "reliving a vivid memory from the past". This has to be one of the most misleading and incorrect "assessments" I have read on this issue.

Data is literally the heart of AI. Good data is one of the biggest treasures for AI creation. It is immensely valuable. That is why big tech companies are obsessed with collecting it and pay millions to acquire it. Sometimes they even buy an entire vacuum company (Amazon buying Roomba) just to get access to its treasure trove of data.
It bears absolutely zero logic, then, that in the case of art AI the very people who create these valuable components should have no rights to their property, no say in how their property is used, and no right to compensation. ("If it is [transformative], then I guess artists won't really have a say in how these models get written or not.")

(Based on his comments, I do not take him to only want to use AI art programs -- which use copyrighted data without permission or compensation -- for non-commercial purposes. He very explicitly talks about tools like Stable Diffusion destroying the industry, i.e., a commercial entity. He also said he wanted to make a comic series, so unless it's a completely non-profit comic without any intention to ever generate commercial revenue, he sounds like he is in favor of people creating content that can be commercialized with such art AI tools. Additionally, his emphasis on the "transformative" nature of these tools makes limited sense if it is exclusively applied to non-commercial content, as the debate around transformative work is generally tied to commercial copyright and whether something is infringing it or transformative enough to be legally in the clear. Also, some programs like Midjourney offer payment plans, directly benefitting commercially from the data they used to train their tool with, and people defend these tools with the very same arguments.)

Fair Use

The fair use argument is also unconvincing in my eyes. At least where I live, fair use has little to do with whether actual pictures are used or not. You can already use actual images -- pixels -- noncommercially under fair use in certain situations. So saying it's fair use because the images are not actually stored makes no sense, because it completely misses the nature of what fair use constitutes. Images not being stored and something being fair use are not actually related. The deciding factor should again be whether the outcome of whatever you do with that data is commercial or not.

"You can't copy style."

Lastly, the "you can't copy style" argument is another appeal to a completely different topic that does not actually describe the problem at hand. It's a distraction. First, what he did is not just "copying a style" or "being inspired" in the way a human artist is. He literally took the property (actual image data) of other people to build a tool, a software, that needs said actual data as a component to be created. He can't use mere abstract concepts and ideas represented as thoughts in his brain to train and validate -- to build -- his model. Again, he needs actual, "tangible" property. He is dependent on other people's output, their work results, in his own model building process.

(Also, if he wants to create a depiction in the style of Mickey Mouse (not Mickey Mouse himself), he will have to feed his AI model actual images of Mickey Mouse, image material created and licensed by Disney. He is still using "tangible" property (actual data) that is copyright protected in the creation of his model to imitate the Mickey Mouse style.)

I know uncritical art AI supporters love to say "it's inspiration!", but unfortunately humans and AI are still different entities. Humans are not AI models that are built by another agent, and AI models are not sentient, self-contained minds. I am aware that the argument of sentience is often dismissed with a scoff -- perhaps because it's not technical, but rather philosophical, and so difficult to define neatly, unlike technology and software.
Despite the name "neural networks" for certain data processing approaches in machine learning, the processes in such networks and in the sentient mind(!) of a person are not the same. (Note that I say "mind" instead of "brain" -- though I would also argue that the actual brain and artificial neural networks are not (yet) to be regarded as interchangeable either.)

Sentient beings "process" data first and foremost for the self-contained purpose of existing. Unfortunately, we need the input of our eyes and ears (and touch and smell etc.) to navigate this world, to survive. We cannot shut down our perceptive processes ("collecting data" with our eyes, ears etc.) and higher-order cognitive processes ("analyzing that data" to classify it as usable information and to derive meaningful conclusions from it) just because we are accidentally perceiving ("collecting the data" of) copyrighted material. It is literally not possible for us.

It IS possible for an AI tool, because it is just that: a tool. It is used in specific situations and only fulfills its function when employed by a subject. It is therefore not dependent on constant data collection and processing just to exist as a self-contained sentient being without any other purpose. We are subjects with no purpose. And as soon as we use our perception and cognition ("data collection and data processing") for a purpose that could infringe on the rights (e.g., property) of other people, we DO get problems. If my inspiration comes too close to another person's original, it is very well possible for me to face repercussions and reprimand. People who cite inspiration act as if we could just use "inspiration" as a constant get-out-of-jail-free card prior to AI, when that's not the case.

Second, "you can't copy style" is itself already questionable. If I were to create commercial art based on one very distinct IP, I'm not sure I wouldn't run into legal problems -- e.g., if I created something that looks like it could be 1:1 from Kim Possible or Spongebob Squarepants, or which copies Mickey Mouse's style 1:1, or if I were to draw creatures whose style is indistinguishable from Pokemon. These are all franchises that don't just have a "common" style like Superhero Comic style or Disney Princess style, but whose styles are perceived as an integral part of the franchise itself. I can't speak with certainty, of course, but I would not be surprised if a big corporation could successfully argue that a very unique style was part of a distinct IP and therefore copyright protected. (I'm not saying I'd agree with that particular decision, but it doesn't seem inconceivable to me.)

In the end, artists are told that they are "unreasonably afraid" or "stuck in the past" because they don't want to be robbed of their own data for others to build and profit off of tools that are advancing so fast that it is not unreasonable to assume they will soon be able to replace, in certain domains, the very artists whose works they are trained on. Demanding fairness is not being stuck in the past or being anti-tech. I'm sure most artists would feel very differently about art AI if it had been created under different circumstances: without living in a society that cares very little about art and artists and that is hell-bent on optimizing and automating everything economically with no regard for its impact on humans.
Sebastian says: November 1, 2022 at 1:17pm

This is a very long comment, but at its core your argument is that because the original property is used as training input, the artist must grant permission and be paid. If you, as a human, look at artwork and then create art inspired by it, is that not the same? Information flows from the artist's brain onto their canvas, into your eyes, into your brain, onto your canvas. One can argue that machine learning models are fundamentally different from humans and therefore different rules apply, but the information flow is comparable. Rational arguments can be made for your position and against it; it is not as clear cut as you argue.

Adpah says: November 1, 2022 at 1:34pm

@Sebastian "If you, as a human, look at artwork and then create art inspired by it, is that not the same? One can argue that machine learning models are fundamentally different from humans and therefore different rules apply, but the information flow is comparable."

I have already laid out my argument why it is not the same:

"First, what he did is not just 'copying a style' or 'being inspired' in the way a human artist is. He literally took the !property! (actual image data) of other people to !build! a tool, a software, that needs said !actual data! as a !component! to be created. He can't use mere !abstract! concepts and ideas represented as thoughts in his brain to train and validate -- to !build! -- his model. Again, he needs actual, 'tangible' property. He is dependent on other people's output, their work results, in his own model building process. (If he wants to create a depiction in the !style! of Mickey Mouse (not Mickey Mouse himself), he will have to feed his AI model actual images of Mickey Mouse, image material created and licensed by Disney. He is still using 'tangible' property (actual data) that is copyright protected in the creation of his model to imitate the Mickey Mouse style.)"

And:

"Sentient beings 'process' data first and foremost for the !self-contained! purpose of existing. Unfortunately, we need the input of our eyes and ears (and touch and smell etc.) to navigate this world, to !survive!. We !cannot! shut down our perceptive processes ('collecting data' with our eyes, ears etc.) and higher-order cognitive processes ('analyzing that data' to classify it as usable information and to derive meaningful conclusions from it) just because we are accidentally perceiving ('collecting the data' of) copyrighted material. It is literally not possible for us. It !IS! possible for an AI tool, because it is just that: a !tool!. It is !used! in specific situations and only fulfills its function when employed by a subject. It is therefore not dependent on constant data collection and processing just to !exist! as a self-contained sentient being !without any other purpose!. We are subjects with no purpose. But !as soon as we use our perception and cognition! ('data collection and data processing') !for a purpose! that could infringe on the rights (e.g., property) of other people, we DO get problems. If my inspiration comes too close to another person's original, it is very well possible for me to face repercussions and reprimand. People who cite inspiration act as if we could just use 'inspiration' as a constant get-out-of-jail-free card prior to AI, when that's not the case."

big red says: November 1, 2022 at 1:35pm

Sebastian's reply is a classic one, but Adpah already pointed out that human minds and AI are not the same.
Training data is as fundamental to these programs as the code which reads the training data. People are willing to respect the copyright on (and pay for) that code -- why aren't they willing to do the same for the training data itself?

Adpah says: November 1, 2022 at 1:46pm

@Sebastian "This is a very long comment but at its core your argument is that because the original property is used as training input the artist must grant permission and be paid."

Yes, just as you would need to in any other scenario where an image is used. Image data is not an abstract concept. It is a concrete entity or output that somebody has ownership over. And in case you want to argue the "the image data is discarded" point again, as to why this situation should be judged differently: I have already addressed that in the original comment, so I'm not going to repeat it.

Data is one of the most important components of AI. AI stands and falls with it. AI is a tool, and in a commercial context, a product. That is why data used to build it should be treated as a component of product development (commercially speaking), and not through a misplaced analogy to sentient human inspiration (which, as mentioned above, also has its boundaries and is not a get-out-of-jail-free card for any "inspired" human creation). To argue that the creators of this crucial component of AI development should have no say over it, no claim to their own work and output, in the context of commercially applied (art) AI model building -- that everybody in the development chain can profit and decide except those who provide the actual data -- is not rational, it's just vile.

David Sutherland says: November 1, 2022 at 1:47pm

All seems very arbitrary. Going back to the original purpose of copyright, "To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries," I tend to lean towards a morality that supports the freedom to reproduce, to mimic and copy. But surely, others will want to stop one another from such mimicry when there's a protectionist racket to keep dollars in one pocket and out of another. It seems somewhat arbitrary to say your tool can't duplicate aspects of a drawing but a human is fine to do so.

Matteo says: November 1, 2022 at 2:43pm

As an artist and a habitual pessimist, it's hard not to see the bleakness of these AIs. I know I'm being dramatic with the following opinion: all the creative jobs and hobbies of artist, musician, writer, etc., are going to be outsourced and given to these programs in the near future. What little space there is for making a living doing these things will shrink. Technology should be freeing us from tedious work, but instead it's cutting us off from the freedom to create.