[HN Gopher] What does copyright say about generative models?
___________________________________________________________________
What does copyright say about generative models?
Author : BerislavLopac
Score : 31 points
Date : 2022-12-15 09:00 UTC (14 hours ago)
(HTM) web link (www.oreilly.com)
(TXT) w3m dump (www.oreilly.com)
| theGnuMe wrote:
| Isn't this just the riff sampling thing? So depending on the
| output you could be infringing.
|
| Like how Vanilla Ice "stole" Queen's bass line.
|
| https://en.wikipedia.org/wiki/Sampling_(music)
| dcow wrote:
| Which is ridiculous. You can't copyright 8 notes of music. This
| is the festering disease we need to rid ourselves of.
| lcnPylGDnU4H9OF wrote:
| Well, now I'm disappointed. I was going to point to this TED
| talk (https://www.ted.com/talks/damien_riehl_copyrighting_all
| _the_...) but it appears to have been removed as the video no
| longer loads for me. However, this (https://www.ted.com/talks
| /damien_riehl_why_all_melodies_shou...) appears to be an
| abridged version of the same talk.
|
| The disappointing bit: I say it's an abridged version because
| I distinctly remember him talking about how he actually
| claimed copyright for _all_ 8-note melodies under a
| permissive license, which I can't find in the one that
| actually loads. He "brute forced" every 8-note melody with a
| program, saved them to a disk and claimed copyright under (I
| believe) the MIT license. (I can see legal issues with doing
| that, of course, so it's not hard to imagine possibly why the
| video was replaced with a different one.)
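|
| A minimal sketch of the brute-force idea described above, not the
| program from the talk: it enumerates every fixed-length melody
| over a small pitch set. The pitch names, the melody length, and
| the function names are assumptions chosen for illustration.

```python
from itertools import islice, product

# Hypothetical pitch set: one octave of C major (an assumption, not from the talk).
PITCHES = ("C4", "D4", "E4", "F4", "G4", "A4", "B4", "C5")


def all_melodies(length, pitches=PITCHES):
    """Lazily yield every melody of `length` notes drawn from `pitches`."""
    return product(pitches, repeat=length)


def melody_count(length, pitches=PITCHES):
    """Size of the search space: len(pitches) ** length."""
    return len(pitches) ** length


if __name__ == "__main__":
    # Report how many 8-note melodies exist over this pitch set.
    print(melody_count(8))
    # Show only the first few melodies instead of materializing them all.
    for melody in islice(all_melodies(8), 3):
        print(" ".join(melody))
```

| The search space grows exponentially with melody length, which
| is why the sketch yields melodies lazily instead of building a
| list.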
| an_ko wrote:
| Damien Riehl and Noah Rubin have all the melodies on a hard
| drive. (More info in this Adam Neely video:
| https://www.youtube.com/watch?v=sfXn_ecH5Rw and the TED talk
| linked in a nearby comment.) So depending on a court's kinda
| arbitrary definition of creativity on any given day, all of
| them may or may not be copyrighted already.
|
| An infringement being accidental also doesn't seem to stop
| copyright holders from successfully suing people. I think
| this means you could technically be successfully sued for
| dropping some bricks on a piano, since no matter where they
| land, the action constitutes a public performance of a
| copyrighted work. Fun stuff.
| dcow wrote:
| Which is most certainly not the original intention of
| copyright, if we are to even give the concept credence. I'm
| of the opinion that the entire system of western copyright
| _law_ is absolutely broken. You definitely _can't_
| infringe on copyright by dropping bricks on a piano, in the
| world with a flourishing creative arts scene, anyway.
| Something went terribly wrong.
|
| PS: I also second the Adam Neely video. I was actually
| wondering if someone would link it (:
| kevingadd wrote:
| What's the threshold you _can_ copyright? 9 notes? 16? Why is
| that the threshold and not 8?
|
| I may not agree that it should be 8, but I don't see any
| rigorous or well-reasoned model here to explain why "you can
| copyright 8 notes" is a festering disease when it seems like
| people are not reasoning through why something should or
| shouldn't be copyrightable.
| geoelectric wrote:
| Robin Thicke wasn't exactly a sympathy-inspiring celebrity,
| but the Blurred Lines case was another egregious example.
|
| As someone who grew up in the 80s and 90s, I also really wish
| we'd gotten the explosion of almost completely sample-based
| music we were starting to see with bands like the Beastie
| Boys, the KLF, and Pop Will Eat Itself. The Biz Markie
| sampling case basically shitcanned an entire nascent
| subgenre.
| dcow wrote:
| Copyright as we know it is, simply put, entirely broken.
|
| There is no world imaginable where the concept of _owning_ ideas
| should be protected. Imagine if chefs had to license recipes...
|
| The arts communities need to take cues from the scientific
| community where _citations_ are the cultural norm. And as a
| society we need to figure out how to protect artists by
| protecting and celebrating the _expression_ of ideas. Not the
| ideas themselves.
| thewataccount wrote:
| > And as a society we need to figure out how to protect artists
| by protecting and celebrating the expression of ideas.
|
| GitHub Copilot would (until they manually changed it) spit out
| the famous fast inverse square root function word for word,
| comments and everything, which is more than just the idea (why
| would you want the comments with them swearing?)
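|
| To make the idea/expression distinction concrete, here is a
| sketch that re-expresses the fast inverse square root idea in
| Python rather than reproducing the original C text. The constant
| 0x5F3759DF is the one widely attributed to the Quake III Arena
| source; the function name and the single Newton step are choices
| made for this illustration.

```python
import struct


def fast_inv_sqrt(x):
    """Approximate 1 / sqrt(x) for a positive float via the bit-level trick."""
    # Reinterpret the 32-bit float's bits as an unsigned integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # Shift-and-subtract on the raw bits yields a rough first guess.
    i = 0x5F3759DF - (i >> 1)
    # Reinterpret the integer bits as a 32-bit float again.
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson step refines the guess.
    return y * (1.5 - 0.5 * x * y * y)


if __name__ == "__main__":
    print(fast_inv_sqrt(4.0))  # an approximation of 0.5
```

| The algorithm is the same; the wording, language, and comments
| are not.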
| dcow wrote:
| I'll admit, writing something down is where it gets very
| murky.
|
| On one side, you have music and musicians, where the written
| form is somewhat of a meaningless tool used to produce creative
| expression, which is the performance of sound on an
| instrument.
|
| On the other hand you have authors (and recently, software
| engineers), where the creative expression as most understand
| it, is in the choice of arrangement of words (or statements).
|
| You can't really resolve the two. And so I lean towards a
| stance that it is fraught to try and build a flourishing
| creative society on the idea that simply writing an idea down
| (or saving a file of data) means you own it and can legally
| bring a case against someone else who happens to write the
| same idea down, be it music or a book.
|
| And then there's https://libraryofbabel.info, which contains
| all possible combinations of characters that ever has been,
| or ever will be, written. Should that be copyrightable?
|
| Anyway, I am of the opinion that Copilot _does_ infringe on
| copyright _as we know it_. But I'm also of the opinion that
| simply giving individuals universal claim to any written text
| they produce is problematic. And to answer your question,
| maybe I _do_ want the comments. Maybe they communicate
| something that the code cannot, swearing be damned.
|
| I'd prefer a culture of citations for written/stored/saved
| works. And a culture of celebrating performance of the arts,
| not storage of them.
| EMIRELADERO wrote:
| > There is no world imaginable where the concept of owning
| ideas should be protected.
|
| Copyright does not apply to ideas, it applies to specific
| expressions/implementations of those ideas, which is what it
| seeks to incentivize. The fact that many companies
| (particularly Disney) sued people who only copied the ideas and
| not the expressions shows a failure of the legal system, not
| copyright.
| dcow wrote:
| Why, then, is one artist even able to bring a case against
| another artist for composing a song with a similar riff as
| one they published? It seems it's not so simple. Perhaps I
| should have been more specific to say that _copyright law_ as
| it exists in society today is rather broken so as not to
| disparage the original intent of copyright as a concept. But,
| I feel like we're just talking semantics, really. The point
| remains that the implementation is broken, copyright _as we
| know it_.
| kevingadd wrote:
| Because riffs aren't ideas, they're expressions of ideas.
| This seems pretty obvious? The concept of a riff is not
| copyrighted, but a specific one apparently can be.
| Obviously copyright overreach is a problem today, but
| you're attacking the wrong problem. Musical works can be
| plagiarized by other musicians and it _does happen_,
| sometimes by accident (due to modern technology and
| attribution problems), sometimes on purpose.
|
| The developer of a notable mobile game had to pull some
| music from their title a couple years ago because the
| composer they hired blatantly plagiarized some other works,
| for example.
| [deleted]
| [deleted]
| Guthur wrote:
| It should say that the notion of intellectual property is
| nonsensical.
| Rochus wrote:
| > _What was originally intended to protect artists has turned
| into a rent-seeking game in which artists who can afford lawyers
| monetize the creativity of artists who can't._
|
| It's rather a rent-seeking industry; the vast majority of artists
| benefit only marginally from copyright; an original intent, to
| give composers an income who would otherwise be out of the
| monetary loop, is long forgotten; instead, early on it was all
| about protecting the publishers' business by restricting copies;
| ironically, composers (or musicians in general) today earn best,
| on average, when they work as a clerk for a copyright collecting
| society; I don't think patent and copyright law can be fixed;
| they can only make it even more complicated and unwieldy, so that
| it gets even further away from composers and authors, and instead
| plays into the hands of trolls or monopolists.
|
| > _fixing copyright law to accommodate works used to train AI
| systems, and developing AI systems that respect the rights of the
| people who made the works on which their models were trained_
|
| And then also charge each student of art for studying original
| works with the intent to create new works based on what they have
| learned to make a living? This idea can be extended in any
| direction and quickly leads to a system that is rather against an
| open society where people benefit from each other.
| ROTMetro wrote:
| The system sure seems to have created a ton of music. I know
| many people who have benefited from being able to actually earn
| money from their work and would strongly disagree with you. I
| respect your opinion, but that is all you have written here,
| your opinion, not some great truth about copyright.
| Imnimo wrote:
| I'm not sure I buy this line of reasoning about "inputs" rather
| than "outputs". If there is some prohibition about using an image
| as an input, regardless of whether any vestige of that image
| exists in an output, doesn't that equally prohibit using an image
| to train a neural network that just says whether an image is a cat
| or a dog? Or how about a network that just tries to denoise a
| photo from your camera?
| akira2501 wrote:
| > Copilot itself is a commercial product that is built on a body of
| training data, even though it is completely different from that
| data. It's clearly "transformative."
|
| Is it, though?
|
| The article wrestles with the notion of the gap between "idea"
| and "expression." To me, I wonder if this is the same gap. The
| training data is equivalent to the "idea," and the output of
| using that training data in a particular way is the "expression."
|
| In this view, the result of your training isn't transformative,
| and it might not even be something you can claim copyright over.
| What is it other than a particular arrangement of facts that have
| been fed into it? Merely adding weights in a highly dimensional
| space does not seem "transformative."
|
| This article feels like it's wrestling with the wrong side of the
| problem.
| avereveard wrote:
| You already get protection for characters and stories so it's not
| like ai will destroy publishing and it's not like existing
| content is not protected already.
|
| Copyright doesn't protect skill, as it doesn't protect
| algorithms, because these are tools to create and not finished
| products.
| seydor wrote:
| why does society and technology have to constantly adapt to
| ancillary legal requirements, while the laws themselves rarely
| adapt (e.g. never expire)?
| nonrandomstring wrote:
| Copyright vs. AI
|
| I'll grab some popcorn. This could be one of the all time epic
| battles.
|
| Earlier someone posted, and then deleted, an Ask HN: "Are artists
| fighting AI Art repeating Metallica versus Napster?"
|
| This is actually an entertaining question, because it brings in
| the power of the entertainments business.
|
| Where is Napster today? Didn't Metallica win that one? It might
| not be a great comparison, but what if RIAA, MPAA, Sony and the
| game industry decide that "generative AI" occupies the same
| threat space as "piracy"?
|
| It was of course Metallica and an army of ten thousand lawyers
| and goonies from a vast, wealthy, moribund industry that actually
| _did_ manage to block the road of progress and frighten the genie
| back into the bottle.
|
| In fact, the power of the film and music industry to shape
| technology has been so immense, you have to wonder whether they
| could do it again over "AI".
|
| Right now I think the entertainments industry is shitting itself
| over LLM technology, but is split over whether it can gain enough
| control to allow it on its own terms, or mobilise to fight it.
| We haven't yet reached the stage of commodity proliferation. That
| will be the watershed.
| amelius wrote:
| > I'll grab some popcorn. This could be one of the all time
| epic battles.
|
| No, it will be boring, the outcome is clear: those with the
| deepest pockets will win this battle.
|
| Just look at Disney who had copyright law changed back in the
| previous century.
| geoelectric wrote:
| The artists didn't exactly win Metallica vs. Napster either.
|
| The end result was a compromise that still funneled some degree
| of money into the labels, with the artists getting an equally
| raw deal as before proportionally speaking--but now with a
| micro share of your $10/mo to Spotify or wherever instead of
| their share of $10+ for a single album.
|
| I'm not saying the prior business model was sustainable (at
| least ethically) but at the end of the day, "if you can't beat
| them, join them" is still one hell of a compromise to make.
| nonrandomstring wrote:
| Yes I think you're right. The technology was tamed and
| brought to heel. It starts out looking "disruptive". What
| will that look like when Big Media figure out how to take
| legal control of generative AI and become effective arbiters
| of all that can be cheaply, mechanically created?
| kmeisthax wrote:
| >But how much of a song or a painting can you reproduce?
|
| The reason why fair use is vague is specifically to confuse
| people who ask these kinds of questions. The Supreme Court needed
| a tool that artists could use to legally smack down people who
| republish fragments of other people's work, but didn't want to
| abolish the 1st Amendment in the process. So basically judges
| have the final say as to whether or not something is novel
| creativity or in debt to the original. Any hard-and-fast rule
| beyond "binding precedent applies" is effectively copyright
| abolition by degrees.
|
| >We lost most of Elizabethan theater because there was no
| copyright. [..] Without some kind of protection, authors had no
| interest in publishing at all, let alone publishing accurate
| texts.
|
| This is a dated example, if only because creative works leave a
| lot more evidence now than they used to. People today will act to
| preserve art _against the artist's own wishes_ and at great
| personal risk.
|
| >and it's easy to suspect that the actual payments will be
| similar to the royalties musicians get from streaming services:
| microcents per use
|
| Given the amount of data these systems need (read: more than
| humanity can provide) I'd say microcents is arguably too high.
| Remember that you can't actually derive a clear chain of value
| between one particular training set entry and one particular
| execution of the model. It's all chucked into a blender that runs
| on almost-linear algebra and calculus. At best you can detect if
| parts of the image resemble specific training set examples[0] and
| pay people slightly more if the model regurgitates training set
| data.
|
| Let's also keep in mind that a good chunk of the licensing system
| is based on being able to say no to specific users, or write very
| tailor-made licensing agreements for specific works or
| conditions. That's still going to be threatened, even if we can
| pay sub-Spotify-tier royalties every time a model trains itself
| on your work.
|
| >It is easy to imagine an AI system that has been trained on the
| (many) Open Source and Creative Commons licenses.
|
| Working on it: https://github.com/kmeisthax/PD-Diffusion
|
| The thing is, we _already have_ a good database of reusable,
| public-domain, no-attribution-necessary images; it's called
| Wikimedia Commons. I really can't fathom why OpenAI didn't start
| there, other than just an assumption that they were entitled to
| larger datasets or a feeling that they could get established
| before anyone sued.
|
| Even then, OpenAI already tried this with computer code and
| they're getting sued for it anyway, because they never bothered
| with attribution in the case of training set regurgitation.
|
| [0] This is possible because part of the prompt guidance process
| involves a thing called CLIP which can do both image and text
| classification in the same coordinate system.
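|
| A minimal sketch of the resemblance check described above,
| assuming each image has already been mapped to an embedding
| vector by a CLIP-like encoder (that step is not shown). The toy
| vectors, the threshold value, and the function names are
| hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_training_examples(output_vec, training_vecs, threshold=0.9):
    """Return (name, similarity) pairs at or above `threshold`, best first."""
    scored = [(name, cosine_similarity(output_vec, vec))
              for name, vec in training_vecs.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy embeddings standing in for real encoder output (hypothetical data).
    training = {
        "sunset.png": [0.9, 0.1, 0.0],
        "cat.png": [0.0, 1.0, 0.2],
    }
    generated = [0.88, 0.15, 0.02]
    print(closest_training_examples(generated, training))
```

| A check like this can only flag outputs that sit close to a
| known training example; it does not attribute value to the rest
| of the training set.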
| Nadya wrote:
| Just an FYI, but your link 404s. I assume it is a private repo.
___________________________________________________________________
(page generated 2022-12-15 23:01 UTC)