Post AdhVFdM424cLLVX8YS by mnl@hachyderm.io
(DIR) Post #AdhVFXKmPw1cfmt1ii by sofia@chaos.social
2024-01-10T10:27:43Z
0 likes, 0 repeats
i think creating generative #AI models based on public domain data would probably be a worthwhile experiment. but i think it's even more interesting philosophically for what it says about #copyright. first of all, there is a lot of data you can get with fairly little effort, starting with the existing pool of public domain data. next, a small fleet of camera drones can easily gather terabytes of pictures of nature and historical sites.
(DIR) Post #AdhVFZjfUN7c7YXvn6 by sofia@chaos.social
2024-01-10T10:30:18Z
0 likes, 0 repeats
next you can just pay artists to liberate their art or create new public domain art. same for journalism and science. how willing people are to do that is uncertain, some will be strictly opposed, to me it would be a dream come true. a unique collaborative effort, stacking shoulder on shoulder to build a giant. a giant that may expand the capabilities of billions of people (including me) in ways i can't even start to predict. and i might even get paid for it 🤑?
(DIR) Post #AdhVFbMhQjwBAtocGe by mnl@hachyderm.io
2024-01-10T10:35:44Z
0 likes, 0 repeats
@sofia yes. Calling for a lockdown of training data based on copyright law would be a terrible thing for opensource models, imo. There's also a lot of research that goes into generated datasets, and just curating data leads to much better training results.
(DIR) Post #AdhVFc50lzNnOKY0Ku by sofia@chaos.social
2024-01-10T10:35:47Z
0 likes, 0 repeats
but what does copyright even allow us to create safely? huge amounts of today's artistic works are, by themselves, copyright violations. fanart, fan fiction, gameplay footage, unlicensed covers, mashups. the whole "Everything is a Remix" (watch it if you haven't) shebang. much of art is at the mercy of the big copyright holders, even if its creators have never worked for them.
(DIR) Post #AdhVFdM424cLLVX8YS by mnl@hachyderm.io
2024-01-10T10:38:07Z
0 likes, 0 repeats
@sofia like i'm perfectly fine ethically with training my own model on the entirety of all papers in libgen... As long as I don't create the "replace a scientist machine". And while I take openai's stance with a pretty big grain of salt, they aren't (compared to midjourney, say) pushing for the "let's replace people" model, and they provide tools to remove yourself from the training corpus. FWIW, lip service does a fair amount here and is not in the headlines.
(DIR) Post #AdhVFdlEWUkabZJEmm by sofia@chaos.social
2024-01-10T10:41:32Z
0 likes, 0 repeats
but there's still fair use and quotation rights, right? surely you are free to quote a few seconds from a movie and make it part of your work. and you are free to liberate your work. but just how hard would it be to stitch small quotations together into larger ones? that seems a level of generalization that current AI should be largely capable of. so will the quotations of others restrict your own right to quote?
(DIR) Post #AdhVFezRx7iUPwy6aG by mnl@hachyderm.io
2024-01-10T10:39:47Z
1 likes, 0 repeats
@sofia i'm really taken aback by the copyright stan-isation of the countercultural debate here, as if the nytimes was a defender of the rights of individual journalists. No, if they were to train their own model on their journalists' output they would go ahead, just like adobe and shutterstock and co. openai paying nytimes licensing fees won't change anything for the better.
(DIR) Post #AdhVFgn7Fx2U0HDHdY by sofia@chaos.social
2024-01-10T10:44:32Z
0 likes, 0 repeats
as long as copyright is upheld, i think it will boil down to this: you are not allowed to "significantly" compete with the work of the copyright holder. what is "significant" varies wildly and arbitrarily, especially between different media and genres. your expression is restricted, so that the monopoly is protected. that is what copyright has always been about.
(DIR) Post #AdhVFib8XSe3bhckF6 by sofia@chaos.social
2024-01-10T10:48:58Z
0 likes, 0 repeats
but surely you can avoid quoting copyrighted works? but how willing are you to give up your ability to describe them? 10 years ago, when i first dove deeper into these ideas, this seemed far fetched. programs that turn descriptions of just about anything into pictures or videos, that would have sounded to me like AGI (turns out it's not 😅). but at this point this seems uncontroversial: it is perfectly possible that descriptions translate into copyright violations.
(DIR) Post #AdhVFkcd0t1hsuKxqS by sofia@chaos.social
2024-01-10T10:50:21Z
0 likes, 0 repeats
in the end, what matters is not strings of bits, not analogue waveforms or particular silver crystals on a film roll. what matters is resemblance, which happens in the brain. the resemblance can be created by file sharing, by re-encoding, with diffusion models or your own memory. it is the resemblance that is monopolized. copyright is the enclosure of thoughts, nothing less.
(DIR) Post #AdhVJeCQplrd7jwSbw by lain@lain.com
2024-01-10T10:57:11.684344Z
0 likes, 0 repeats
@mnl @sofia > like i'm perfectly fine ethically with training my own model on the entirety of all papers in libgen... As long as I don't create the "replace a scientist machine". why the last part? seems that having a machine be able to do proper research would be amazing.
(DIR) Post #AdhW10ZlVzguyVX3AW by mnl@hachyderm.io
2024-01-10T10:59:23Z
0 likes, 0 repeats
@lain @sofia yeah i meant more like "let's fire research assistants and replace them with AI". Maybe not the best analogy here.
(DIR) Post #AdhW11PWPZ5jZ1kNqC by mnl@hachyderm.io
2024-01-10T11:00:49Z
0 likes, 0 repeats
@lain @sofia So many of the use-cases really depend on the context. For example, I build many "automate this job entirely" tools, say, for writing SEO titles. Because I work for a family business, that means that Joe can now, instead of grinding through 5000 SEO titles, do more product photography. But I could also build a "fire all content writers" machine with the same prompt.
(DIR) Post #AdhW12BjWJejyYIszI by lain@lain.com
2024-01-10T11:05:18.704185Z
0 likes, 0 repeats
@mnl @sofia if you can actually generate content that's good enough with a machine, then i see no reason why a human should do it. Seems to me the same as using an electric dishwasher instead of hiring people to do it by hand.
(DIR) Post #AdhWLPYmKnXiFlncky by mnl@hachyderm.io
2024-01-10T11:06:11Z
0 likes, 0 repeats
@lain @sofia what about selling people on the concept that the machine could do it better, when it kind of can't? Or if it was the case that a human + the machine could do way better.
(DIR) Post #AdhWLQUCtHTp7sfUGm by lain@lain.com
2024-01-10T11:09:06.992845Z
0 likes, 0 repeats
@mnl @sofia i suspect this is mostly the case right now. I do use AI assistants for programming and they still aren't really that great, but they do help a lot with boring tasks and make me more productive.