[HN Gopher] Tool to convert copyrighted music into fair use
       ___________________________________________________________________
        
       Tool to convert copyrighted music into fair use
        
       Author : alphabet9000
       Score  : 180 points
       Date   : 2021-07-10 20:15 UTC (2 hours ago)
        
 (HTM) web link (fairuseify.ml)
 (TXT) w3m dump (fairuseify.ml)
        
       | lyxell wrote:
       | I remember that the guys behind The Pirate Bay actually made a
       | service like this back in the day. You would submit a song and
       | get a mashup back of cuts from other songs where each cut would
       | be short enough to fall under fair use. I can't find any
       | references to it online anymore though. Maybe someone else
       | remembers what the service was called.
        
         | rikkipitt wrote:
         | I don't remember the site, but it reminds me of the Girl Talk
         | album called "All Day" -
         | https://en.wikipedia.org/wiki/All_Day_(Girl_Talk_album). It was
         | originally released as a free digital download.
         | 
         | > Greg Gillis composed the album using overlapping samples of
         | 372 songs by other artists.
         | 
         | This article goes into it a bit more: "Girl Talk, Fair Use, and
         | Three Hundred Twenty-Two Reasons for Copyright Reform" -
         | https://jipel.law.nyu.edu/ledger-vol-1-no-1-4-pearl/
        
       | [deleted]
        
       | Black101 wrote:
       | He should not have faked the machine learning, but I like the
       | idea.
        
       | cjohansson wrote:
       | Hilarious stuff
        
       | not2b wrote:
       | I was kinda hoping it would change the song enough to get it past
       | Youtube's copyright filters, but apparently not.
        
       | konstruction wrote:
       | Hilarious :-)
        
       | er4hn wrote:
       | Finally, Copilot for Music!
        
         | laurent92 wrote:
         | Strangely, a lookalike of a music hit is nothing like the
         | original, and it's worth analyzing!
         | 
         | - Music is a vehicle for a common experience. Everyone knows
         | the next notes of some Lady Gaga song. We feel like learning
         | the lyrics will make us able to sing together if we were in a
         | club, and share something with other clubbers. Any AI that
         | reproduced the voice and instruments would still not make you
         | feel like you are sharing a common moment with the rest of the
         | audience.
         | 
         | - Hits are hits because we hear them a thousand times. It's
         | been proven that people don't necessarily like it the first
         | time. It's the familiarity with the song that makes us like it
         | (or hate it when we've heard it too much).
         | 
         | - Even worse: We like some songs even more because we love the
         | author. Be it because they are politically involved, have a
         | cute face, have a nice life story, or seem to hide answers to
         | life in the lyrics of their work - But an AI producing the same
         | exact notes wouldn't trigger similar affection from us. It's
         | like hearing our kid singing: Very cute, but we wouldn't like
         | the same song by another kid. Audiences have a genuine
         | emotional attachment to the authors. It's especially visible
         | since the MCM revolution: Before MCM, music mattered; Now the
         | image matters way more, bands have a face, a graphic style, a
         | story to tell - and music could be as crap as possible, if we
         | like the band it can still have success. MCM changed music
         | forever, proving that AI can't replace that feeling.
         | 
         | Can it?
        
           | imwillofficial wrote:
           | I hear a lot of stated assumptions on how certain things
           | trigger emotional investment and others don't.
           | 
           | If you knew how manufactured the music industry was, and how
           | nothing of what you see of celebrities is true, it might as
           | well be AI plucking our heart strings, because it isn't
           | "real" in the sense that I think you mean, authentic human
           | connection over shared experience.
        
         | [deleted]
        
       | anderber wrote:
       | What are the criteria that this tool uses to determine something
       | to be fair use?
        
         | sumnole wrote:
         | It's a joke ragging on Github Copilot, which suggests to its
         | users code on github regardless of its copyright. The claim is
         | that any code written with Copilot does not infringe since it's
         | 'machine-generated code'. Github Copilot takes github code,
         | learns it and then feeds it to users based on prompts but you
         | can end up essentially copy and pasting an entire copyrighted
         | snippet. This satire site takes your uploaded mp3, 'learns it'
         | and hands you back the same mp3.
        
           | anderber wrote:
           | Ah, thank you for the explanation!
        
       | amelius wrote:
       | Nice try, but ... Co-pilot can be sued for copyright infringement
       | in specific cases. Therefore it doesn't mean you can get away
       | with copyright infringement if you copy Co-pilot's model.
        
       | habibur wrote:
       | I remember Web Font Player before dynamic fonts became available.
       | You could upload a copyrighted font. Say Microsoft or Apple's
       | font. It would trace that font and generate your "Web Font", and
       | then you could use it without any copyright issue, as that's not
       | the original font but rather a machine-learnt one.
       | 
       | Guess fonts are still like this.
        
       | MeinBlutIstBlau wrote:
       | So having tried it out the song sounds...exactly the same. So
       | does this just make it that when it's played these detection
       | systems can't pick it up since it's somewhat different? Or if I
       | make a commercial product, include this version of the song, I
       | can somehow afford lawyers to defend myself when the music
       | industry sues me for using what sounds like the same song, just
       | with the 1's and 0's ordered a little differently?
       | 
       | Edit: was out of the loop on the joke...
        
         | detaro wrote:
         | It's a joke about GitHub Copilot.
        
         | [deleted]
        
         | abetusk wrote:
         | I think that's the joke. It literally takes the exact same song
         | unaltered but it says it's "using machine learning", "fair use"
         | etc. to give the pretense of it being legitimate.
         | 
         | This is most likely a commentary on GitHub co-pilot and how the
         | authors of this joke think that GitHub co-pilot is violating
         | copyright and does not fall under fair-use.
         | 
         | I just confirmed the "processed" file has the same SHA256 sum
         | as the original.
         | 
         | EDIT: I incorrectly labelled it Google co-pilot instead of
         | GitHub co-pilot. Fixed.
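
The check described above (same SHA256 sum before and after "processing") can be sketched roughly as follows. This is only an illustration, not the commenter's actual procedure: the helper names `sha256_of` and `is_unaltered` are hypothetical, and the byte strings stand in for the uploaded and downloaded files.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(original: bytes, processed: bytes) -> bool:
    """True when both inputs hash to the same SHA-256 digest."""
    return sha256_of(original) == sha256_of(processed)


if __name__ == "__main__":
    # Stand-in bytes; in practice these would be read from the two mp3 files.
    uploaded = b"pretend this is an mp3"
    downloaded = b"pretend this is an mp3"
    print(is_unaltered(uploaded, downloaded))
```

Matching digests mean the two files are, for all practical purposes, byte-for-byte identical, which is what the comment reports.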
        
           | barbecue_sauce wrote:
           | Github Copilot.
        
           | [deleted]
        
         | laurent92 wrote:
         | The source code is setTimeout(..., random()). I'd say, even if
         | it takes long to build the neural network, it is very CPU
         | efficient.
        
       | sycren wrote:
       | By uploading the licensed music in the first place, are we not
       | breaching copyright law?
        
         | pornel wrote:
         | It's not uploading, it's making available for scraping.
        
         | Hamuko wrote:
         | Obviously machine learning is fair use, so it supersedes
         | copyright.
        
           | laurent92 wrote:
           | I sense a slight bitterness in the programming community
           | after Copilot ;) It's the rumbling sound of the imagination
           | of a thousand people throwing in the towel saying "What now".
        
             | Hamuko wrote:
             | Yeah, why would anyone be bitter about a giant corporation
             | creating a commercial code laundering machine that digests
             | a massive amount of copyrighted code and spits out "clean"
             | code free of all the burdens of its inputs?
        
       | wizzwizz4 wrote:
       | Wow. And it's entirely client-side, too! Impressive.
        
       | steelbrain wrote:
       | In case you don't get it, view the source :)
       | 
       | It's just a bunch of sleep(random()) calls and visual changes on
       | the viewport, and you download the exact file you uploaded.
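
A minimal sketch of the behaviour described above, assuming nothing beyond what the comment says: wait for a few random intervals, then return the input untouched. The real page is client-side JavaScript; this is a Python analogue, and the function name `fake_fairuseify` and its parameters are hypothetical.

```python
import random
import time


def fake_fairuseify(audio: bytes, steps: int = 3, max_delay: float = 0.01) -> bytes:
    """Pretend to process audio: sleep a random time per step, change nothing."""
    for _ in range(steps):
        # Each "processing stage" is just a random pause.
        time.sleep(random.uniform(0, max_delay))
    # The "result" is the very same bytes that were passed in.
    return audio


if __name__ == "__main__":
    song = b"\x49\x44\x33 stand-in mp3 bytes"
    print(fake_fairuseify(song) == song)
```

The delays exist purely for show, which is the point of the joke: no transformation of the data ever takes place.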
        
         | londons_explore wrote:
         | I tried to play the original and downloaded music a bunch of
         | times to try to figure out any differences...
        
         | cblconfederate wrote:
         | The page code was written by co-pilot
        
       | gavinray wrote:
       | Damn, I read this as "Tool, the band, is converting all of their
       | copyrighted music into fair-use music." and got excited.
       | 
       | But this is funny too I guess
        
       | IshKebab wrote:
       | How do you go to this much effort to make a point without even
       | reading about how copyright and fair use works? There have been
       | multiple comments on HN and Reddit explaining how it doesn't work
       | like this.
        
       | [deleted]
        
       | ChristianGeek wrote:
       | Great way for the owner of the site to build up a library of free
       | music!
        
       | IceHegel wrote:
       | audiophiles gotta try this! it makes the music soo much better
        
       | speedgoose wrote:
       | Have you tried Github co-pilot? It's not going to copy paste the
       | Linux source code, like Dungeon AI is not going to copy paste a
       | Tolkien book.
        
         | andersource wrote:
         | Most of the time, but it _can_, and I think that's the issue a
         | lot of people have with it.
         | 
         | https://twitter.com/mitsuhiko/status/1410886329924194309
         | 
         | https://news.ycombinator.com/item?id=27710287
        
           | zxzax wrote:
           | I don't understand why anyone has an issue with that. You
           | know what else can copy and paste code all the time? Humans.
           | But we have various ways of stopping employees who copy and
           | paste code from stackoverflow and github without checking the
           | license, so it's the same thing if you use one of these
           | tools. There's nothing new I can see here to be upset about.
           | 
           | This would be a lot more interesting if it showed the various
           | GPT-3 experiments at generating music and used that as a
           | point of comparison.
        
             | paulgb wrote:
             | > But we have various ways of stopping employees who copy
             | and paste code from stackoverflow and github without
             | checking the license
             | 
             | What would those be? I've worked at a number of
             | organizations that were (rightfully) paranoid about
             | accidentally incorporating GPL code, but even there I
             | wasn't aware of automated tooling to prevent it, it was
             | only enforced through developer vigilance.
        
               | zxzax wrote:
               | If you actually want a paid service, there are plagiarism
               | detectors like Fossa and Codequiry. Although in my
               | opinion, code review should be enough to catch any
               | "accidental" incidents of plagiarism, the differences in
               | writing style should make it very obvious when the
               | employee has copied something. That of course probably
               | won't apply if you suspect the employee is intentionally
               | changing it around to obfuscate the origin of the code,
               | but it seems that wouldn't be the case if they were just
               | committing the output straight from a neural net. But
               | automated scanners probably won't be able to catch those
               | well either -- the way to catch that would be to make
               | them do pair programming a lot.
        
               | reader_mode wrote:
               | >Although in my opinion, code review should be enough to
               | catch any "accidental" incidents of plagiarism, the
               | differences in writing style should make it very obvious
               | when the employee has copied something.
               | 
               | You must do some CSI level code reviews. Best I'm able to
               | do is figure out if code will work and if something can
               | be done obviously better. Stylistic calls (beyond lint
               | enforceable) are up to authors as far as I'm concerned.
               | 
               | And even then it's trivial to fix up naming schemes and
               | such to match the codebase - doubt that gets you out of
               | copyright issues.
        
           | UncleMeat wrote:
           | This truly is the engineer's disease. Hundreds of incredibly
           | strong opinions about the legal system derived almost
           | entirely from a few tweets and zero experience outside of
           | software engineering.
           | 
           | Copilot is neat. If you are concerned about it, talk to a
           | lawyer and get their opinion.
        
           | speedgoose wrote:
           | This fast inverse square root function is very well known,
           | with even a Wikipedia page, and it is more than 20 years old.
           | My country doesn't have software patents but it seems that
           | the standard duration of a software patent is 20 years, so
           | even if this function was patented, the patent would have
           | expired by now.
        
             | lilyball wrote:
             | Copyright and patent are different. Also, while you can't
             | copyright an algorithm, your specific source code that
             | implements it is copyrighted (assuming it's sufficiently
             | original).
             | 
             | In this case it's not implementing the algorithm, it's
             | copying a particular famous implementation, down to the
             | comments.
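
To illustrate the algorithm-versus-implementation distinction drawn above, here is an independent sketch of the fast inverse square root technique in Python. It is not the famous C source the thread refers to: the function name is hypothetical, and it uses the widely documented magic constant 0x5F3759DF followed by one Newton-Raphson step.

```python
import struct


def fast_inverse_sqrt(x: float) -> float:
    """Approximate 1/sqrt(x) for positive x using the bit-level trick."""
    # Reinterpret the 32-bit float's bits as an unsigned integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # Magic-constant initial guess, computed on the integer representation.
    i = 0x5F3759DF - (i >> 1)
    # Reinterpret the integer bits back as a 32-bit float.
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson iteration refines the estimate.
    return y * (1.5 - 0.5 * x * y * y)


if __name__ == "__main__":
    print(fast_inverse_sqrt(4.0))  # close to 0.5
```

The same idea can be written in many different ways; the copyright concern raised in the comment is about reproducing one specific expression of it, comments included.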
        
             | NautilusWave wrote:
             | Copyright is different from patent. Copyright is
             | (basically) forever.
        
             | dublin wrote:
             | There is no real reason for copyright terms to exceed
             | patent terms.
             | 
             | (And FWIW, patent terms should be inversely proportional to
             | the number of patents issued in that category the previous
             | year. This would automatically reduce terms in categories
             | where innovation is rapid, promoting competition and drive
             | to get to market, but preserve maximum protection for
             | inventions in mature categories with a slower pace of
             | innovation.)
        
       | nonbirithm wrote:
       | So when are neural nets trained on images or text going to be
       | confronted with the same copyright concerns? At the point that
       | GitHub has forced the issue into the spotlight with Copilot I
       | feel that it's only a matter of time before this reaches the
       | courts. Nobody seemed to care about copyright at the time people
       | were having fun creating AI dream collages or nonexistent anime
       | girls from a model trained on the Danbooru imageset. In the
       | latter case it's not clear that 100% of the original Pixiv and
       | Twitter creators gave their consent to have their work rehosted
       | on a different site in the first place, much less be involved in
       | ML experiments. That data was from 2018.
       | 
       | I'm almost tempted to believe that the people at GitHub knew this
       | was going to blow up as much as it did as some kind of a
       | challenge to the status quo of copyright and licensing, if only
       | so that everyone would start talking about the issue. Why did the
       | GitHub representative plainly state that Copilot was trained on
       | all of GitHub's codebase without seeming to care about the
       | pushback on Twitter and HN that was bound to happen as a result?
        
         | jiminymcmoogley wrote:
         | by the time the dinosaurs that dictate our laws begin to care
         | about it, copyright will no longer exist
        
       | tgv wrote:
       | Congratulations. That's got to be a 100% accurate algorithm.
        
       | hedora wrote:
       | Ooh. They have a DARPA grant! Applying now.
        
       | jjcon wrote:
       | When did the HN crowd become so defensive of copyright? I
       | understand the concerns on copilot but it's kinda weirding me
       | out.
        
         | throw0101a wrote:
         | > _When did the HN crowd become so defensive of copyright?_
         | 
         | Copyright is good in limited quantities. The current multi-
         | decade time horizon is probably what a lot of people are
         | against, and not the concept in general.
         | 
         | And limited time period seems to be consistent through history.
         | From the paper "Copyrights and Creativity: Evidence from
         | Italian Opera in the Napoleonic Age":
         | 
         | > _Comparing changes in the creation of new operas across
         | Italian states with and without copyrights, we show that the
         | adoption of basic copyrights encouraged the creation of new
         | work. Moreover, we find that copyrights changed the quality of
         | creative output by encouraging composers to produce more
         | popular and durable works. These results generalize to a
         | broader set of musical compositions and to librettos, as the
         | literary component to the score of operas. Based on these
         | findings, we conclude that the adoption of basic levels of
         | copyright protection - not exceeding the lifetime of the
         | composer - can help to raise both the quantity and the quality
         | of new creative works._
         | 
         | > _Importantly, we find that extensions in the length of
         | copyright beyond the composer's life did not encourage
         | creativity. Performance data reveal that few operas were played
         | after the first 20 years, which suggests that only the most
         | durable creative goods stand to gain from copyright
         | extensions._ [...]
         | 
         | * https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2505776
        
         | ReactiveJelly wrote:
         | Both the permissive and copyleft licenses are only enforceable
         | through copyright law.
         | 
         | I don't mind that copyright exists, I just wish it was better.
         | 
         | Also there's a power difference between individuals violating
         | the rights of a big company, and a big company violating the
         | rights of many individuals.
         | 
         | If Copilot isn't reined in, it feels like yet another case of
         | "The laws only apply to poor people".
        
         | carom wrote:
         | I guarantee this is not Microsoft's announcement that they are
         | forfeiting their copyrights. This is just them abusing the
         | spirit of ours.
        
         | hjek wrote:
         | Say your AGPL code is Copiloted into someone's new program and
         | they decide to release that under a non-free license; that's
         | the issue. We're defensive of _copyleft_.
        
         | aurelian15 wrote:
         | As weird as it may seem, you should not forget that free
         | software licenses are built upon the fabric of copyright.
         | Without copyright, free software could not exist in its current
         | form. For GPL-like "copyleft" licenses, there would be no way
         | to enforce that binary distributions of derived works are
         | accompanied by their source code. Similarly, in the context of
         | permissive BSD/MIT-style licenses, there would be no way to
         | enforce attribution.
         | 
         | So, given that FOSS---which a large portion of the HN crowd
         | depends on---cannot work without copyright (at least not in its
         | current form), the recent discussions may be less of a
         | surprise.
        
           | cortesoft wrote:
           | Maybe... although I personally think that the GPL and other
           | 'copy left' licenses aren't the reason open source has
           | prospered, nor do I think enforcing attribution really helps
           | the FOSS world that much.
           | 
           | People write and share code because it is useful to do that,
           | not because licenses require them to.
           | 
           | I think FOSS would do fine with no copyright, and in fact
           | more software might end up open source if we had ZERO
           | copyright... why not make your code open source and get back
           | contributions when your code would end up being shared
           | anyway?
        
             | dublin wrote:
             | There were other open source licenses at the party before
             | the GPL dropped its controversial "viral" turd in the
             | punchbowl - and many of them still exist nearly unmodified.
             | (e.g. BSD with attribution removed, etc.)
        
               | cortesoft wrote:
               | I know, but I am just questioning whether any license is
               | needed for open source to prosper.
               | 
               | I am positing that if licenses didn't exist, and anyone
               | could do anything they want with any bit of code they
               | see, open source would still prosper.
        
         | breck wrote:
         | "The heathen are sunk down in the pit that they made: in the
         | net which they hid is their own foot taken"
         | 
         | Copyright is a horrible system. Microsoft has been one of the
         | biggest proponents of that system. But now they've clearly
         | violated it. They should either join in abolishing it, or face
         | its consequences.
        
         | michaelmrose wrote:
         | Consider people's reaction to people selling bootleg DVDs vs
         | torrenting a movie. Although people may consider both morally
         | incorrect the corrupting profit motive results in the former
         | being seen far more negatively. In the current situation there
         | is also the matter that Microsoft is still perceived (rightly,
         | I think) very negatively and open source authors very
         | positively. Also in a David v Goliath situation nobody wants
         | to be seen rooting for the giant.
         | 
         | Personally I would be concerned about insert corp here
         | accidentally stealing code from an open source project then
         | years later going after the open source project for copyright
         | infringement regarding the code they in fact stole from the
         | open source project.
        
         | PaulKeeble wrote:
         | Because it's my (and many of our) code they have "learnt" from,
         | stripped the license from, and are intending to sell on. When we
         | listed code under MIT or GPL we meant those licenses; they
         | weren't random. Microsoft just seems to be completely
         | ignoring the reality of reproducing those works which are
         | covered by those licenses: they are making code private and
         | paid for that is open source. Not OK.
        
       | bobthebuilders wrote:
       | Using ddos-guard, does this sell my info to Russia?
        
         | imwillofficial wrote:
         | Isn't that service run out of a bunker in Norway or something?
         | I remember they were in the news for something recently.
        
       | sellyme wrote:
       | I've seen a lot of people ragging on Copilot for "copy+pasting"
       | code - does anyone have links to cases where it has done this
       | without the user intentionally trying to generate a specific
       | (extremely famous) code snippet?
       | 
       | I've seen tons of comments here and on Reddit that talk about
       | multiple instances of entire functions being copied verbatim, but
       | the only thing even remotely close to that I've seen is the fast
       | inverse square root, so I must have missed a few tweets or
       | something.
        
         | meibo wrote:
         | It only seems to if you give it no or very little "source"
         | input, like an empty file with a comment that says "// X
         | algorithm".
         | 
         | There's been a lot of bikeshedding on this, but GitHub
         | decidedly hasn't given enough information on how it works and
         | what the training dataset is, and the fair use question
         | definitely needs to be answered, maybe even in court - it's
         | just a matter of time.
        
           | Hamuko wrote:
           | > _what the training dataset is_
           | 
           | All non-private repositories on GitHub.
        
             | brutal_chaos_ wrote:
             | Is it though? I'd assume that too, but we don't really
             | know, do we? I mean to ask, what have Microsoft stated that
             | leads you to believe this? (Maybe there's a press release I
             | missed?)
        
               | teraflop wrote:
               | https://twitter.com/NoraDotCodes/status/1412741339771461635
        
         | IshKebab wrote:
         | Github did an analysis and found that it does do it, though
         | very rarely, and usually when it has little context (e.g. at
         | the start of a file). They're working on detecting those cases
         | though so it doesn't happen accidentally, so it is unlikely to
         | be a realistic problem.
        
         | dogecoinbase wrote:
         | It's happily spitting out licenses and copyright notices with
         | other people's names on them, it's pretty clearly half-baked.
        
       ___________________________________________________________________
       (page generated 2021-07-10 23:00 UTC)