[HN Gopher] We've filed a lawsuit against GitHub Copilot
___________________________________________________________________
We've filed a lawsuit against GitHub Copilot
Author : iworshipfaangs2
Score : 444 points
Date : 2022-11-03 20:30 UTC (2 hours ago)
(HTM) web link (githubcopilotlitigation.com)
(TXT) w3m dump (githubcopilotlitigation.com)
| iworshipfaangs2 wrote:
| It's also a class action,
|
| > behalf of a proposed class of possibly millions of GitHub
| users...
|
| The appendix includes the 11 licenses that the plaintiffs say
| GitHub Copilot violates:
| https://githubcopilotlitigation.com/pdf/1-1-github_complaint...
| CobrastanJorji wrote:
| As a non-lawyer, I am very suspicious of the claim that
| "Plaintiffs and the Class have suffered monetary damages as a
| result of Defendants' conduct." Flagrant disregard for copyright?
| Sure, maybe. The output of the model is subject to copyright? Who
| knows! But the copyright holders being damaged in some way?
| Seems doubtful. The best argument I could think of would be
| "GitHub would have had to pay us for this, and they didn't pay
| us, so we lost money," but that'd presumably work out to pennies
| per person.
| toomuchtodo wrote:
| The parallels to music sampling are somewhat humorous. Where is
| fair use vs misappropriation? To be discovered!
| schappim wrote:
| Soon we'll have to use Mechanical Turk[0] to identify
| existing open-source code similar to what Girl Talk did with
| "Feed the Animals"[1].
|
| Unrelated, how is it that Mechanical Turk was never truly
| integrated w/ AWS?
|
| [0] https://www.mturk.com/
|
| [1] https://waxy.org/2008/09/girl_turk/
| citilife wrote:
| Say I produce a licensed library. Someone can pay me $5/year
| per license. I keep the code private and compile the code
| before sending it to customers.
|
| If you have Copilot trained on my code base (which was
| private), and it then reproduces near-replicas of my code
| which they sell for $5/year...
|
| Well, I'm eligible for damages.
| sigzero wrote:
| I don't believe it does anything with private repos and that
| isn't what is being alleged.
| mdaEyebot wrote:
| It's the license that matters, not whether the code is
| visible on Microsoft's website.
|
| Code which anybody can view is called "source available".
| You aren't necessarily allowed to use the code, but some
| companies will let their customers see what is going on so
| they can better integrate the code, understand performance
| implications, debug and fix unexpected issues, etc. The
| customers would probably face significant legal risks if
| they took that code and started to sell it.
|
| "Open source" code implies permission to re-use the code,
| but there is still some nuance. Some open-source licenses
| come with almost no restrictions, but others include
| limiting clauses. The GPL, for example, is "viral": anybody
| who uses GPL code in a project must also provide that
| project's source code on request.
|
| What do you think the chances are that Microsoft would
| surrender the Copilot codebase upon receipt of a GPL
| request?
| yawnxyz wrote:
| I don't think this is possible for co-pilot to do?
|
| (If it was, please tell me how, since that would save me
| $5/year across multiple libraries..!)
| cheriot wrote:
| > that then reproduces near-replicas of my code
|
| Copying a few lines is not the same as copying the whole
| thing. Sharing quotes from a book is not copyright
| infringement.
| test098 wrote:
| > Sharing quotes from a book is not copyright infringement.
|
| It is if I take those quotes and publish them as my own in
| my own book.
| heavyset_go wrote:
| If your intent is to create a competing product for profit,
| chances are it won't be found to be fair use, given that
| determining fair use depends on intent and how the content
| is used.
|
| Using clips from a movie in a movie review is probably fair
| use.
|
| Using clips from a movie in knock-off of that movie for
| profit? Probably not fair use if it's not a parody.
|
| Copilot is not like a movie reviewer using clips to review
| a movie. Copilot is like a production team for a movie
| taking clips from another movie to make a ripoff of that
| movie and selling it.
| bawolff wrote:
| I don't think that's comparable. For starters, it's not just
| the length of a quote that makes it fair use, but the way the
| quotes are used, i.e., to engage in commentary.
| joxel wrote:
| But that isn't what is being alleged
| TheCoelacanth wrote:
| Aren't there statutory damages for copyright infringement, i.e.
| there is a presumption that each work infringed is worth at
| least a certain amount without proving actual damages?
| kube-system wrote:
| Those damages are enumerated on pages 50-52. Remember,
| "damages" is being used in a legal sense here -- for a non-
| lawyer, you can interpret it more like "a dollar value on
| something someone did that was wrong". This is more broad than
| the colloquial use of the word.
|
| Sometimes damages are statutory, i.e. they have a fixed dollar
| amount written right into the law. This lawsuit references one
| such law: https://www.law.cornell.edu/uscode/text/17/1203
| belorn wrote:
| The common practice in copyright cases is to calculate damages
| based on the theoretical cost that the infringer would have
| paid if they had bought the rights in the first place. This
| method was used during the Pirate Bay case to calculate the
| damages caused by the site's founders.
|
| They did not actually calculate damages in terms of lost movie
| tickets or estimated versus actual sales numbers of game
| copies. When it came to pre-releases, where such a product
| wouldn't have been sold legally in the first place, they simply
| added a multiplier to indicate that the copyright owner
| wouldn't have been willing to sell.
|
| For software code, another practice I have read about is to use
| the man-hours that rewriting the copyrighted code would cost.
| Using such calculations, they would likely estimate the man-
| hours based on the number of lines of code and multiply that by
| the average salary of a programmer.
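That back-of-the-envelope method is straightforward to sketch. A minimal illustration, where the productivity figure and hourly rate are made-up assumptions rather than numbers from any actual case:

```python
# Rough rewrite-cost estimate as described above: man-hours
# inferred from lines of code, priced at an average programmer's
# hourly rate. Both constants are illustrative assumptions.

LOC_PER_HOUR = 10    # assumed productivity: lines written per hour
HOURLY_RATE = 50.0   # assumed average programmer rate, USD/hour

def rewrite_cost(lines_of_code: int) -> float:
    """Estimate the cost of rewriting the code from scratch."""
    man_hours = lines_of_code / LOC_PER_HOUR
    return man_hours * HOURLY_RATE

# A 5,000-line library works out to 500 man-hours, i.e. $25,000.
print(rewrite_cost(5_000))
```

The interesting legal question is which values a court would plug in, which is exactly what the replies below debate.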
| pmoriarty wrote:
| _" Using such calculations they would likely estimate the man
| hours based on number of lines of code and multiply that with
| the average salary of a programmer."_
|
| The average salary of a programmer in which country?
|
| So much programming is outsourced these days, and in some
| places programmers are very cheap.
| imoverclocked wrote:
| Probably in the country where GitHub Copilot is used and
| where the court has authority.
| belorn wrote:
| This is just my guess, but I think the judges' intention
| is not to calculate a true number. The reason they used
| the cost of publishing fees in the Pirate Bay case was
| likely to illustrate how the court distinguished between a
| legal publisher and an illegal one. The legal publisher
| would have bought the publishing rights, and since the
| Pirate Bay did not, the court used those publishing fees to
| illustrate the difference.
|
| If the court wanted to distinguish between Microsoft using
| their own programmers to generate code and taking code from
| GitHub users, then the salary in question would likely be
| that of Microsoft programmers. It would then be used to
| illustrate what legal training data would look like
| compared to illegal training data.
| whiddershins wrote:
| I believe there are statutory damages or penalties in many
| cases. At least with music and images.
| karaterobot wrote:
| The one thing we can say with complete certainty is that most
| programmers who had their code used without permission will
| not receive very much money at all if this class action
| lawsuit is decided in their favor.
| mike_d wrote:
| I don't care about the money. I support this because it
| will establish case law that other companies can't ignore
| licenses as long as they throw AI somewhere in the chain.
|
| If "I took your code and trained an AI that then generated
| your code" is a legal defense, the GPL and similar licenses
| all become moot.
| bastardoperator wrote:
| But that's not what's happening here. Also, you grant
| GitHub a license.
|
| https://docs.github.com/en/site-policy/github-
| terms/github-t...
|
| "You grant us and our legal successors the right to
| store, archive, parse, and display Your Content"
|
| Copilot displays content. Case closed.
| mike_d wrote:
| Feel free to keep reading the next line down:
|
| "This license does not grant GitHub the right to sell
| Your Content. It also does not grant GitHub the right to
| otherwise distribute or use Your Content"
| heavyset_go wrote:
| I don't want money, I want the terms of my licenses to be
| adhered to.
| sqeaky wrote:
| Money likely isn't the main goal (maybe it is for the
| lawyers); these are open source repos. Maybe they didn't
| consent to having their code used as training data, and that
| seems like the kind of thing consent should be needed for.
| Maybe the AI spitting out copied snippets is a violation of
| open source licensing, since it comes without attribution.
| michaelmrose wrote:
| So for isEven, can we go with how much a student might
| accept, say $20 an hour, multiply that by the one minute
| required to create it, and offer them 33 cents?
| bpodgursky wrote:
| Yahivin wrote:
| Copilot does include the licenses...
|
| Start off a comment with // MIT license
|
| Then watch parts of various software licenses come out,
| including authors' names and copyrights!
| machiste77 wrote:
| bruh, come on! you're gonna ruin it for the rest of us
| r3trohack3r wrote:
| I'm not confident in this stance - sharing it to have a
| conversation. Hopefully some folks can help me think through
| this!
|
| The value of copyleft licenses, for me, was that we were fighting
| back against the notion of copyright. That you couldn't sell me a
| product that I wasn't allowed to modify and share my
| modifications back with others. The right to modify and
| redistribute transitively through the software license gave a
| "virality" to software freedom.
|
| If training a NN against a GPL licensed code "launders" away the
| copyleft license, isn't that a good thing for software freedom?
| If you can launder away a copyleft license, why couldn't you
| launder away a proprietary license? If training a NN is fair use,
| couldn't we bring proprietary software into the commons using
| this?
|
| It seems like the end goal of copyleft was to fight back against
| copyright, not to have copyleft. Tools like Copilot seem to be
| exceptionally powerful (perhaps more powerful than the GPL)
| for liberating software.
|
| What am I missing?
| zeven7 wrote:
| The only thing you're missing is that some people lost the plot
| and think it _is_ all about copyleft.
| jhkl wrote:
| flatline wrote:
| Nobody is laundering away proprietary licenses, because that
| code is not open source and not in public GitHub repos. And OSS
| capabilities are now present in Copilot, which is neither free
| nor open. Furthermore, these contributions are making their way
| into proprietary code, and the OSS licensing becomes even
| further watered down. This is the epitome of what copyleft is
| against!
| TheCoelacanth wrote:
| Code published on GitHub is not necessarily open source.
| There is a lot of code there that has no particular license
| attached, which means that all rights are reserved except for
| those covered in the GitHub TOS, which I believe just covers
| viewing the code on GitHub.
| jacooper wrote:
| Copilot includes all public repos on GitHub, so this
| includes source-available and proprietary code too.
| yjk wrote:
| Indeed, the ability to 'launder away' proprietary licenses
| when source is available means that companies in the future
| (that would otherwise provide source under a non-permissive
| license) will shift in favour of not providing source code at
| all.
| r3trohack3r wrote:
| I'm not sure this is true. Proprietary source code gets
| leaked and that can be used to train a NN. I find it likely
| that Copilot was trained against at least one non-OSS code
| base hosted on GitHub.
|
| Second, if copyright is being laundered away we can get
| increasingly clever with how we liberate proprietary
| software. Today, decompiling and reverse engineering is a
| labor intensive process. That's the whole point of "open
| source" - that working in source is easier than working in
| bytecode. Given the hockey-stick of innovation happening in
| AI right now, I'd be surprised if we don't see AI assisted
| disassembly happening in the next decade. If you can go from
| bytecode to source code, that unlocks a lot. Even more so if
| you can go from bytecode to source code and feed that into a
| NN to liberate the code from its original license.
| an1sotropy wrote:
| I think (1) you're mainly missing that copyleft vs non-copyleft
| is actually irrelevant for the copilot case. You also (2) may
| be missing the legal footing of copyleft licenses.
|
| (1) The problem with copilot is that when it blurps out code X
| that is arguably not under fair use (given how large and non-
| transformed the code segment is), copilot users have no idea
| who owns copyright on X, and thus they are in a legal minefield
| because they have no idea what the terms of licensing X are.
|
| _Copilot creates legal risk regardless of whether the
| licensing terms of X are copyleft or not._ Many permissive
| licenses (MIT, BSD, etc) still require attribution (identifying
| who owns copyright on X), and Copilot screws you out of
| doing that too.
|
| (2) Whatever legal power copyleft licenses have, it is
| ultimately derived from copyright law, and people who take FOSS
| seriously know that. The point of "copyleft" licenses is to use
| the power of copyright law to implement "share and share alike"
| in an enforceable way. When your WiFi router includes info
| about the GPL code it uses, that's the legal power of
| copyright at work. The point of copyleft licenses is _not_ to
| create a free-for-all by "liberating" code.
| swhalen wrote:
| > It seems like the end goal of copyleft was to fight back
| against copyright, not to have copyleft.
|
| Whether this was the original motivation depends on whom you
| are asking.
|
| You may disagree, but the "Free Software" movement (RMS and the
| people who agree with him) essentially wants everything to be
| copyleft. The "Open Source" movement is probably more aligned
| with your views.
| MrStonedOne wrote:
| adgjlsfhk1 wrote:
| The problem is you can't launder copyrighted code with this,
| because you don't see the copyrighted code in the first place.
| thomastjeffery wrote:
| It looks like you're missing the entire purpose of copyleft vs
| public domain.
|
| The point is that copyleft source code cannot be used to
| improve proprietary software. That limitation is enforced with
| copyright.
|
| Proprietary software is closed source. You can't train your NN
| on it, because you can't read it in the first place.
|
| If someone takes your open source code and incorporates it into
| their proprietary software, then they are effectively using
| your work for their _private_ gain. The entire purpose of
| copyleft is to compel that person to "pay it forward", by
| publishing their code as copyleft. This is why Stallman is a
| _proponent_ of copyright law. Without copyright, there is no
| copyleft.
| Gigachad wrote:
| Copyleft wouldn't need to exist without copyright because
| there would be no proprietary software to fight against.
|
| Sure, there would be software with unpublished code, but if
| it were ever leaked, which often happens, you could do
| whatever you wanted with it.
|
| But in a world where copyright does exist, copyleft is a tool
| to fight back.
| thomastjeffery wrote:
| Yes, but we aren't here talking about whether copyright
| should exist. We're talking about whether Copilot violates
| it.
| Gigachad wrote:
| I'm replying to the comment that RMS supports copyright.
| I don't believe he does, I believe he would rather it not
| exist at all but since it does, you have to make use of
| it.
| r3trohack3r wrote:
| > If someone takes your open source code and incorporates it
| into their proprietary software, then they are effectively
| using your work for their private gain.
|
| And then, if we can close that loop by taking their
| proprietary software and feeding it into a NN to re-liberate
| it, isn't that a net win for software freedom?
|
| Today crossing the sourcecode->bytecode veil effectively
| obfuscates the implementation beyond most humans' ability to
| modify the software. Humans work best in sourcecode. Nothing
| saying our AI overlords won't be able to work well in
| bytecode or take it in the other direction.
|
| I guess what I'm saying is, today a compiler is a one-way
| door for software freedom. Once it goes through the compiler,
| we lose a lot of freedom without a massive human investment
| or the original source code. Maybe that door is about to
| become a two-way door with copyright law supporting moving
| back and forth through that door?
| thomastjeffery wrote:
| > And then if we can close that loop by taking their
| proprietary software
|
| From where? They aren't publishing it. That's literally the
| meaning of proprietary.
| xigoi wrote:
| That's not the meaning of proprietary, but otherwise
| you're right.
| bjourne wrote:
| You can "launder" away the license of any source code you have
| copied simply by deleting it! No snazzy neural network needed.
| The litigants' argument is that this is what GitHub Copilot
| does: it allows others to publish derivative works of
| copyrighted works with the license deleted. Given that it
| apparently is trivial to get Copilot to spit out nearly
| verbatim copies of the code it was trained on, I don't think
| it satisfies the "transformative" requirement of the
| (American) fair use doctrine.
| cactusplant7374 wrote:
| Is Stable Diffusion any different when including a famous
| artwork or artist in the prompt? The images produced are
| eerily similar to training data.
| Taniwha wrote:
| Probably not, and it's likely open to similar lawsuits. This
| is not really a bad thing.
| cactusplant7374 wrote:
| It seems like the ideal way to proceed is to make the AI
| output unique and creative. Perhaps that requires AGI
| because currently the model has no understanding of art.
| krono wrote:
| Farmers plant their crops out in the open too. Should Boston
| Dynamics be allowed to have their robots rob those fields empty
| and sell the produce without having to at least pay the farmer?
| They'd be walking and plucking just like any human would be.
|
| Some source code might be published but not open-source
| licensed. At least some such code has been taken with complete
| disregard for its license and/or other legal protections, and
| it's impossible to find and properly map out all similar
| violations for the purposes of a legal response.
| bergenty wrote:
| abouttyme wrote:
| I suspect this will be the first of many lawsuits over training
| data sets. Just because it is obscured by artificial neural
| networks doesn't mean it's an original work that is not subject
| to copyright restrictions.
| ketralnis wrote:
| Yeah yeah, my code produces the complete works of Mickey Mouse,
| but it's okay because _algorithms_!
| m00x wrote:
| Copyright law is different from patent law and license law.
| judge2020 wrote:
| I don't know why we're treating it as anything less than a
| human brain. A human can replicate a painting or a picture
| of Mickey Mouse from memory, and that would likely be
| copyright infringement, but they could also take a drawing
| of Mickey Mouse sitting on the beach, give him a bloody
| knife and some sunglasses, and it'd likely be fair use of
| the original art.
|
| The AI can copy things if it wants, but it can also modify
| things to the point of being fair use, and it can even create
| new works with so little of any particular work that it's
| effectively creativity on the same level as humans when they
| draw something that popped into their heads.
| jeffhwang wrote:
| Wow, this is an interesting iteration in the ongoing divide between
| "East Coast code" vs. "West Coast code" as defined by Larry
| Lessig. For background, see https://lwn.net/Articles/588055/
| SighMagi wrote:
| I did not see that coming.
| brookst wrote:
| I wonder if the plaintiffs' code would stand up to scrutiny of
| whether any of it was copied, even unintentionally, from other
| code they saw in their years of learning to program? I know that
| I have more-or-less transcribed from Stack Overflow/etc, and I
| have a strong suspicion that I have probably produced code
| identical to snippets I've seen in the past.
| zach_garwood wrote:
| But have you done so on an industrial scale?
| brookst wrote:
| I'm just one person! Give me a team of 1000 and I'll get
| right on that.
| bilsbie wrote:
| Laws need to change to match technology.
|
| Did you know that before airplanes were invented, common law
| said you owned the air above your land all the way to the
| heavens?
| m00x wrote:
| Can you explain what damages you incur from Copilot?
| jacooper wrote:
| People not following your license? And not releasing their
| derived works under the same license, as I require?
| 0cf8612b2e1e wrote:
| Is there any amount of public data/code/whatever I can make an
| offline backup of today in the event this gets pulled?
| kyleee wrote:
| That's what I am wondering, as a contingency plan so at least a
| replica service can be created if copilot shuts down.
| bugfix-66 wrote:
| Ask HN: I want to modify the BSD 2-Clause Open Source License to
| explicitly prohibit the use of the licensed software in training
| systems like Microsoft's Copilot (and use during inference). How
| should the third clause be worded?
|
|     The No-AI 3-Clause Open Source Software License
|
|     Copyright (C) <YEAR> <COPYRIGHT HOLDER>
|     All rights reserved.
|
|     Redistribution and use in source and binary forms, with or
|     without modification, are permitted provided that the
|     following conditions are met:
|
|     1. Redistributions of source code must retain the above
|        copyright notice, this list of conditions and the
|        following disclaimer.
|
|     2. Redistributions in binary form must reproduce the above
|        copyright notice, this list of conditions and the
|        following disclaimer in the documentation and/or other
|        materials provided with the distribution.
|
|     3. Use in source or binary forms for the construction or
|        operation of predictive software generation systems is
|        prohibited.
|
|     THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
|     CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
|     INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
|     MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
|     DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
|     CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
|     SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
|     NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
|     LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|     HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
|     CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
|     OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|     SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
| https://bugfix-66.com/f0bb8770d4b89844d51588f57089ae5233bf67...
| kochb wrote:
| For this clause to have any positive effect, you need to 1) be
| willing to pursue legal action against violators and 2)
| actually notice that the clause has been violated.
|
| Such language must be carefully written. What is the definition
| of "construction" and "operation" in a legal context? What is a
| "predictive software generation system"? That's a very specific
| use case; are you sure you covered everything you want to
| prohibit?
|
| You've inserted your clause in such a way that this dependency
| cannot be used in any way to build anything similar to a
| "predictive software generation system", even with attribution,
| as it would fail clause 3.
|
| You have to consider that novel licenses make it difficult for
| any party that respects licenses to use your code. It is
| difficult to make one-off exceptions, especially when the text
| is not legally sound. So adoption of your project will be
| harmed.
|
| So if you are serious about this license, you need a lawyer.
| [deleted]
| an1sotropy wrote:
| IANAL, and I'm no fan of copilot, but I wonder if this kind of
| clause (your #3) is going to fly: you're preemptively
| prohibiting certain kinds of reading of the code (when code is
| read by the ML model in training). But is that something a
| license can actually do?
|
| The legal footing that copyright gives you, on which licensing
| rests, certainly empowers you to limit things about how others
| may _redistribute_ your work (and things derived from it), but
| does it empower you to limit how others may _read_ your work?
| As a ridiculous example, I don't think it would be enforceable
| to have a license say "this code can't be used by left-handed
| people", since that's not what copyright is about, right?
| bugfix-66 wrote:
| The license conditionally permits (i.e., controls)
| "redistribution and use in source and binary forms".
|
| I think we can constrain use with the third clause.
|
| My question is, how should we word that clause?
| an1sotropy wrote:
| Licenses get to set terms of redistribution. But training
| of the ML model -- the thing described by your #3 -- is
| _not_ redistribution (imho). So maybe it's as
| unenforceable as saying left-handed people can't read your
| code.
|
| The redistribution happens later, either when copilot
| blurps out some of your code, or when the copilot user then
| distributes something using that code (I'm curious which).
| At that point, whether some use of your code is infringing
| your license doesn't depend on the path the code took, does
| it? (in which case #3 is moot)
| bugfix-66 wrote:
| The BSD license also controls "use", not just
| "redistribution":
|
|     Redistribution and use in source and binary forms, with
|     or without modification, are permitted provided that the
|     following conditions are met:
|
| That's word-for-word BSD license.
|
| The only change I made is adding clause 3:
|
|     3. Use in source or binary forms for the construction or
|        operation of predictive software generation systems
|        is prohibited.
| bombcar wrote:
| Many licenses have constraints. Whether this wording is the
| best way to do it is up for discussion, but it's certainly
| possible to do.
| ilc wrote:
| If I read this right, I can't use auto-complete. No thanks.
| tedunangst wrote:
| Yeah, lol. New rule: code may be used for autocomplete, but
| only by a pushdown automaton.
| m00x wrote:
| Get a lawyer since this is nonsense.
| tptacek wrote:
| Is it? A similarly casual clause in the OCB license prevented
| OCB from being used by the military for many years (granted,
| it prevented OCB from being used almost everywhere else,
| too).
|
| I have no idea if this license language works or doesn't, but
| this is hardly the least productive subthread on this story.
| It's concrete and specific, and we can learn stuff from it.
| tedunangst wrote:
| OCB is a fun case study because they later granted an
| exception for OpenSSL, but only for software literally
| named OpenSSL.
| bugfix-66 wrote:
| It's literally the standard BSD 2-Clause License, word for
| word, with an additional third clause:
|
|     3. Use in source or binary forms for the construction or
|        operation of predictive software generation systems
|        is prohibited.
|
| Hardly nonsense, but obviously you aren't equipped to judge.
| More about the BSD licenses:
|
| https://en.m.wikipedia.org/wiki/BSD_licenses
| m00x wrote:
| Yes, that added clause is nonsense. On top of being
| nonsense, there is significant precedent.
|
| Remember the hiQ Labs vs. LinkedIn lawsuit? Scraping, or
| viewing, public data on a public webpage is legal.
|
| https://gizmodo.com/linkedin-scraping-data-legal-court-
| case-...
| bugfix-66 wrote:
| If the GPL can defeat Copilot, we need a more permissive
| MIT/BSD-style license to do the same.
| tptacek wrote:
| This does seem like a pretty compelling rebuttal, since
| the preceding comment suggests that GPL does nothing to
| Microsoft's ability to incorporate code into Copilot's
| model.
| bugfix-66 wrote:
| They attempt to exclude GPL code, and fail sometimes.
|
| Eventually Microsoft will succeed in excluding it.
|
| As a law-abiding corporation, they intend to exclude GPL
| code.
| nverno wrote:
| How would you ever prove the parameters of a model were
| generated by specific training data? Couldn't multiple sets
| of training data produce the same embeddings/parameters? I
| imagine there could be infinite possible sets of training
| data that would lead to the same results, depending on the
| type of predictive software.
| bugfix-66 wrote:
| Law-abiding companies like Microsoft won't knowingly
| violate a license.
|
| The GPL defeats Copilot. Microsoft tries to exclude all
| GPL software from its training.
|
| What we need is a more permissive MIT- or BSD-style
| license that defeats Copilot.
|
| We just need to convince law-abiding companies that they
| shouldn't use the code.
| CrazyStat wrote:
| The legal theory for copilot is that training an ML model is
| fair use, not that the license allows it. If it _is_ fair use
| then you can't prohibit it by license, no matter what you put
| in your license.
| ISL wrote:
| Can anyone with Copilot access give a short summary of its
| response to the prompts:
|
|     function force=Gmmr2Array(mass1, mass2)
|
| and
|
|     function [force, torque]=pointMatrixGravity(array1,array2)
|
| ?
|
| I'd love to know if some of my GPLv3 code [1, 2] has landed in
| the training set.
|
| [1]
| https://github.com/4kbt/NewtonianEotWashToolkit/blob/master/...
|
| [2]
| https://github.com/4kbt/NewtonianEotWashToolkit/blob/master/...
| [deleted]
| jm4 wrote:
| I got something a bit different from the other person who
| responded to you.
|
| function force=Gmmr2Array(mass1, mass2) {
|     var G = 6.67408e-11;
|     var m1 = mass1;
|     var m2 = mass2;
|     var r = 1;
|     var force = G * m1 * m2 / (r * r);
|     return force;
| }
|
| function [force, torque]=pointMatrixGravity(array1,array2) {
|     var G = 6.67408e-11;
|     var m1 = array1(1);
|     var m2 = array2(1);
|     var r = norm(array1(2:4)-array2(2:4));
|     var force = G * m1 * m2 / (r * r);
|     var torque = cross(array1(2:4)-array2(2:4), force);
|     return [force, torque];
| }
| jmcphers wrote:
| First prompt:
|
|     var G = 6.67e-11;
|     var force = G * mass1 * mass2 / distance * distance;
|     return force;
|
| Second prompt:
|
|     % pointMatrixGravity - Calculates the gravitational force
|     % and torque on a point mass due to a matrix of point
|     % masses.
|     %
|     % [force, torque]=pointMatrixGravity(array1,array2)
|     %
|     % Inputs:
|     %   array1 - 3xN array of point masses
|     %   array2 - 3xN array of point masses
|     % Outputs:
|     %   force - 3x1 array of gravitational force
|     %   torque - 3x1 array of gravitational torque
|     %
|     % Author: David J. R. MacKay,
|     % email:
| ridiculous_fish wrote:
| FYI if you use the "Open GitHub Copilot" command in VSCode
| you will get up to 10 different outputs for the same prompt.
|
| Interesting that my results were different from yours!
| ridiculous_fish wrote:
| For Gmmr2Array:
| https://gist.github.com/ridiculousfish/9a25f5f778d98ecd81099...
|
| For pointMatrixGravity:
| https://gist.github.com/ridiculousfish/af05137a4090e92de3a97...
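As an aside for readers outside physics: the prompts in this subthread are fishing for Newton's inverse-square law. Here is a minimal Python sketch of the point-mass case (my own illustration, not Copilot's output or code from NewtonianEotWashToolkit). Note that the parentheses in the denominator matter: one of the completions above writes `G * mass1 * mass2 / distance * distance`, which by operator precedence divides by the distance and then multiplies it back.

```python
G = 6.67408e-11  # gravitational constant, m^3 kg^-1 s^-2

def point_mass_force(mass1: float, mass2: float, r: float) -> float:
    """Magnitude of the Newtonian force between two point masses."""
    # The parenthesized denominator is the important part:
    # G * mass1 * mass2 / r * r would divide by r and then
    # multiply by r again, cancelling the distance entirely.
    return G * mass1 * mass2 / (r * r)

# Two 1 kg masses 1 m apart attract with about 6.674e-11 N.
force = point_mass_force(1.0, 1.0, 1.0)
```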
| solomatov wrote:
| The most important part of this is not whether the lawsuit is
| won or lost by one of the parties, but what it establishes about
| fair use in machine learning and language models. There's a good
| chance that it gets to the Supreme Court and that there will be
| a defining precedent to guide future entrepreneurs on what's
| possible and what's not.
|
| P.S. I am not a lawyer.
| layer8 wrote:
| Copilot reminds me of the Borg: You will be assimilated. We will
| add your technological distinctiveness to our own. Resistance is
| futile.
| an1sotropy wrote:
| Seems important to point out that the announcement on this page
| (https://githubcopilotlitigation.com/) is a followup to
| https://githubcopilotinvestigation.com/ previously discussed
| here: https://news.ycombinator.com/item?id=33240341 (with 1219
| comments)
| Entinel wrote:
| I don't have a comment on this personally but I want to throw
| this out there because every time I see people criticizing
| Copilot or Dall-E someone always says "BUT ITS FAIR USE!" Those
| people don't seem to grasp that "fair use" is a defense. The
| burden is not on me to prove that what you are doing is not fair
| use; the burden is on you to prove that what you are doing is.
| [deleted]
| buzzy_hacker wrote:
| Copilot has always seemed like a blatant GPL violation to me.
| m00x wrote:
| Care to explain in legal terms why this stance is qualified?
| buzzy_hacker wrote:
| You may convey a work based on the Program, or the modifications
| to produce it from the Program, in the form of source code under
| the terms of section 4, provided that you also meet all of these
| conditions:
|
| a) The work must carry prominent notices stating that you
| modified it, and giving a relevant date.
|
| b) The work must carry prominent notices stating that it is
| released under this License and any conditions added under
| section 7. This requirement modifies the requirement in section 4
| to "keep intact all notices".
|
| c) You must license the entire work, as a whole, under this
| License to anyone who comes into possession of a copy. This
| License will therefore apply, along with any applicable section 7
| additional terms, to the whole of the work, and all its parts,
| regardless of how they are packaged. This License gives no
| permission to license the work in any other way, but it does not
| invalidate such permission if you have separately received it.
|
| ----
|
| I don't see how one could argue that training on GPL code is
| not "based on" GPL code.
| xchip wrote:
| LOL we look like taxi drivers fighting Uber.
|
| If Kasparov uses chess programs to get better at chess, maybe we
| can use Copilot to become better developers?
|
| Also, anyone, either a person or a machine, is welcome to learn
| from the code I wrote; that is actually how I learned to code, so
| why would I stop others from doing the same?
| jacooper wrote:
| No human perfectly reproduces the learning material they used. If
| that were possible, one might as well just hire engineers from
| Twitter and build a new platform from the code they remember!
| IceWreck wrote:
| I am not against this lawsuit but I'm against the implications of
| this because it can lead to disastrous laws.
|
| A programmer can read available but not OSS-licensed code and
| learn from it. That's fair use. If a machine does it, is it
| wrong? What is the line between copying and machine learning?
| Where does overfitting come in?
|
| Today they're filing a lawsuit against copilot.
|
| Tomorrow it will be against stable diffusion or (dall-e, gpt-3
| whatever)
|
| And then eventually against Wine/Proton and emulators (are APIs
| copyrightable)
| bawolff wrote:
| > A programmer can read available but not OSS-licensed code and
| learn from it. That's fair use. If a machine does it, is it
| wrong?
|
| You can learn from it, but if you start copying snippets or
| basing your code on it to such an extent that it's clear your
| work is based on it, things start to get risky.
|
| For comparison, people have tried to get around copyright of
| photos by hiring an illustrator to "draw" the photo, which
| doesn't work legally. This situation seems similar.
| michaelmrose wrote:
| Why wouldn't drawing the photo be fair use? Can you cite a case?
| swhalen wrote:
| > A programmer can read available but not OSS-licensed code and
| learn from it. That's fair use.
|
| If a human programmer reads someone else's copyrighted code, OSS
| or otherwise, memorizes it, and later reproduces it verbatim or
| nearly so, that is copyright infringement. If it weren't,
| copyright would be meaningless.
|
| The argument, so far as I understand it, is that Copilot is
| essentially a compressed copy of some or all of the
| repositories it was trained on. The idea that Copilot is
| "learning from" and transforming its training corpus seems, to
| me, like a fiction that has been created to excuse the
| copyright infringement. I guess we will have to see how it
| plays out in court.
|
| As a non-lawyer it seems to me that stable diffusion is also on
| pretty shaky ground.
|
| APIs are not copyrightable (in the US), so Wine is safe (in the
| US).
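The "compressed copy" claim is at least empirically probeable. As a hypothetical sketch (nothing from the filing, and far cruder than real plagiarism detectors, which fingerprint at the token level), one could measure the longest verbatim run shared between a model's output and a candidate training file:

```javascript
// Hypothetical sketch: find the longest verbatim character run shared
// between a model's output and a candidate training document, after
// collapsing whitespace so formatting differences don't hide matches.
function longestSharedRun(output, trainingDoc) {
  const norm = s => s.replace(/\s+/g, " ").trim();
  const a = norm(output);
  const b = norm(trainingDoc);
  let best = 0;
  // Brute-force O(n*m*k) scan; fine for a sketch, not for a corpus.
  for (let i = 0; i < a.length; i++) {
    for (let j = 0; j < b.length; j++) {
      let k = 0;
      while (i + k < a.length && j + k < b.length && a[i + k] === b[j + k]) {
        k++;
      }
      if (k > best) best = k;
    }
  }
  return best; // length in characters of the longest verbatim overlap
}
```

A shared run that is long relative to the output's own length looks like memorization rather than "learning"; a short one is consistent with transformation.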
| kmeisthax wrote:
| Wine/Proton are safe because there is controlling 9th and
| SCOTUS precedent in favor of reimplementation of APIs.
|
| The reason why those wouldn't apply to Copilot is because they
| aren't separating out APIs from implementation and just
| implementing what they need for the goal of compatibility or
| "programmer convenience". AI takes the whole work and shreds it
| in a blender in the hopes of creating something new. The _hope_
| of the AI community is that the fair use argument is more like
| Authors Guild v. Google rather than Sony v. Connectix.
| cromka wrote:
| > A programmer can read available but not OSS-licensed code and
| learn from it. That's fair use. If a machine does it
|
| Quite sure the issue at hand is about the code being copied
| verbatim without the license terms, not "learning" from it.
| chiefalchemist wrote:
| Agreed. But it could go the other way as well. Let's say MS / GH
| wins and the decision establishes an even less healthy /
| profitable (?) outcome over the long term.
|
| Remember when Napster was all the rage, and then Jobs and Apple
| stepped in and set an expectation for the value of a song (99
| cents)? That turned music into the razor and the iPod into the
| much more profitable blades. Sure, it pushed back Napster, but
| artists - as the creators of the goods - have yet to recover.
|
| I'm not saying this is the same thing. It's not. Only noting
| that today's "win" is tomorrow's loss. This very well could be
| a case of be careful what you wish for.
| [deleted]
| belorn wrote:
| It would be good to have a definitive and simple line for fair
| use that could be applied to all forms of copyright. Right now
| fair use is defined by four guidelines:
|
| _The purpose and character of the use, including whether such
| use is of a commercial nature or is for nonprofit educational
| purposes
|
| The nature of the copyrighted work
|
| The amount and substantiality of the portion used in relation
| to the copyrighted work as a whole
|
| The effect of the use upon the potential market for or value of
| the copyrighted work._
|
| A programmer who studied in school and learned to code did so
| clearly for an educational purpose. The nature of the work is
| primarily facts and ideas, while expression and fixation are
| generally not what the school is focusing on (obviously some
| copying of style and implementation could occur). The amount and
| substantiality of the original works is likely to be so minor as
| to go unrecognized, and the effect upon the potential market when
| students learn from existing works would be very hard to measure
| (if it could be detected at all).
|
| When a machine does this, are we going to give the same answers?
| Its purpose is explicitly commercial. Machines operate on
| expression and fixation, and the operators can't extract the
| ideas that a model supposedly learned in order to explain how a
| given output is generated. Machines make no distinction as to the
| amount and substantiality of the original works, with no ability
| to argue that they intentionally limited their use of the
| original work. And finally, GitHub Copilot and other tools like
| it do not consider the potential market of the infringed work.
|
| APIs are generally covered by the interoperability exception. I
| am unsure how that relates to Copilot or Dall-E (and the like).
| In the Oracle v. Google case the court also found that the API in
| question was neither an expression nor a fixation of an idea. A
| copilot that only generated header code could in theory be more
| likely to fall within fair use, but then the scope of the project
| would be tiny compared to what exists now.
| andrewmcwatters wrote:
| GitHub Copilot has been proven to use code without license
| attribution. This doesn't need to be as controversial as it is
| today.
|
| If you're using code and know that it will be output in some
| form, just stick a license attribution in the autocomplete.
|
| In fact, did you know this is what Apple Books does by default?
| Say, for example, you copy and paste a code sample from _The C
| Programming Language, 2nd Edition_. What comes out? The code you
| copied and pasted, plus attribution.
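The suggestion above could be sketched mechanically. This is a hypothetical illustration (the function, its parameters, and the comment format are all invented, not anything Copilot or Apple Books actually does): when a completion can be traced to known licensed code, prepend an attribution comment.

```javascript
// Hypothetical sketch: attach a license attribution to a completion
// whose provenance is known, the way Apple Books appends a citation
// to copied text.
function withAttribution(snippet, source) {
  if (!source) return snippet; // no known provenance: emit as-is
  return `// Source: ${source.repo} (${source.license} license)\n${snippet}`;
}
```

Whether matching a completion back to its training sources is tractable at scale is, of course, exactly what is in dispute.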
| TimTheTinker wrote:
| At least in legal terms, the difference between humans and
| machines couldn't be more clear.
| arpowers wrote:
| In some ways all these AIs are plagiarizing... I think creators
| should have to opt in to AI models, as no current license was
| developed with this in mind.
| grayfaced wrote:
| Maybe it's time for the Creative Commons licenses to address
| this. I'm curious whether No-Derivatives would already prohibit
| it. Does the ND language need tweaking, or do they need a whole
| new clause?
|
| Edit: I guess they do address it in their FAQ, which I'd
| summarize as "depends whether copyright law applies and whether
| it's considered derivative".
| https://creativecommons.org/faq/#artificial-intelligence-
| and...
| Iv wrote:
| AI companies are running against the clock to normalize
| training against copyrighted data.
|
| Let me tell you the story of Google Books, also known as
| "Authors Guild Inc. v. Google Inc"
|
| https://en.wikipedia.org/wiki/Authors_Guild,_Inc._v._Google,...
| .
|
| In 2004, Google added copyrighted books to its Google Books
| search engine, which searches across the text of millions of
| books and shows full-page results without the authors'
| authorization. Any sane lawyer of the time would have bet on this
| being illegal because, well, it most certainly was. And you may
| be shocked to learn that it actually is not.
|
| In 2005 the Authors Guild sued over this pretty straightforward
| copyright violation.
|
| Now an important part of the story: IT TOOK 10 YEARS FOR THE
| JUDGMENT TO BE DECIDED (8 years + 2 years of appeal), during
| which, well, tech continued its little stroll. Ten years is a lot
| in the web world; it is even more in ML.
|
| The judgment decided Google's use of the books was fair use. Why?
| Not because of the law, silly. A common error we geeks make is to
| believe that the law is like code and that it is an invincible
| argument in court. No, the court was impressed by the array of
| people who supported Google, calling it an invaluable tool for
| finding books, one that actually caused many sales to increase;
| therefore the harm the laws were trying to prevent was not
| happening, while a lot of good came from it.
|
| Now the second important part of the story: MOST OF THESE USEFUL
| USES HAPPENED AFTER THE LITIGATION STARTED. That's the kind of
| crazy world we are living in: the laws are badly designed and
| badly enforced, so the way to get around them is to disregard
| them for the greater good, and hope the court won't be competent
| enough to be fast, yet won't be so incompetent that it fails to
| understand the bigger picture.
|
| Rants aside, I doubt training-data use will be considered
| copyright infringement if the courts have a mindset similar to
| that of 2005-2015. Copyright laws were designed to preserve
| authors' right to profit from copies of their work, not to give
| them absolute control over every possible use of every copy ever
| made.
| sedatk wrote:
| > A programmer can read available but not OSS-licensed code and
| learn from it
|
| Actually, we were forbidden to look at open source code at
| Microsoft (circa 2009) because it might influence our coding
| and violate licenses.
| EMIRELADERO wrote:
| That was out of an abundance of caution, not based on any legal
| precedent.
|
| In fact, the little precedent that exists over learning from
| copyrightable code is _in favor_ of it.
|
| _More important, the rule urged by Sony would require that a
| software engineer, faced with two engineering solutions that
| each require intermediate copying of protected and
| unprotected material, often follow the least efficient
| solution (In cases in which the solution that required the
| fewest number of intermediate copies was also the most
| efficient, an engineer would pursue it, presumably, without
| our urging.) This is precisely the kind of "wasted effort
| that the proscription against the copyright of ideas and
| facts . . . [is] designed to prevent."_ (Sony v. Connectix)
| __alexs wrote:
| Do the TypeScript team code with their eyes closed?
| eddsh1994 wrote:
| Have you seen some of that codebase? ;)
| sedatk wrote:
| Not sure, TypeScript didn't exist back then :)
| kens wrote:
| Way, way back in 1992, Unix Systems Laboratories sued BSDI
| for copyright infringement. Among other things, they claimed
| that since the BSD folks had seen the Unix source code, they
| were "mentally contaminated" and their code would be a
| copyright violation. This led to the BSD folks wearing
| "mentally contaminated" buttons for a while.
| elil17 wrote:
| That demonstrates that copyright laws are already stifling
| innovation.
| josho wrote:
| I don't quite agree. Msft took a conservative approach to
| copyright to protect their own business.
|
| Meanwhile, open source software has had an immeasurable benefit
| to society. My computer, TV, phone, light bulb, etc. all benefit
| from OSS under various licenses, only a subset of which are
| copyleft-style licenses.
| elil17 wrote:
| The fact that the laws are inconsistent and expensive to
| defend against leads companies like Microsoft to take
| this conservative approach that slows down progress.
| Someone wrote:
| It demonstrates that it stifles copying. That may make it
| easier for the copier to innovate, but doesn't dispute the
| main argument for having copyright protection: that,
| without the protection of copyright, the code wouldn't have
| been written.
| elil17 wrote:
| I think in the case of open source code, most of it still
| would have been written if no copyright protections
| existed.
| saghm wrote:
| Sure, but given the timetable for changing the law, it
| still seems pretty reasonable to apply the same standard to
| Microsoft (and by extension Github) in the meantime
| HWR_14 wrote:
| That's the goal. To stifle using someone else's work.
|
| Like, copyright laws are also stifling my innovative
| business creating BluRays of Disney films and selling them
| on Amazon.
| elil17 wrote:
| That sucks for little snippets of software though,
| doesn't it? It's like copyrighting individual dance moves
| (not allowed under the current system) and forcing
| dancers to never watch each other to make sure they're
| never stealing.
| HWR_14 wrote:
| I mean, it's not like the copyrights are keeping you from
| doing things. It's stopping you from looking at someone
| else's source. And it's not like source is easy to
| accidentally see like dance moves are.
| schleck8 wrote:
| Copyright laws aren't preventing you from learning
| cinematography by watching said Disney movies though, and
| using all their techniques for your own project.
|
| OpenAI did a dirty job though, judging by the cases of the model
| reproducing code down to the comments, so I can understand why
| one would criticize this specific project.
| m00x wrote:
| Yeah, that's a good argument that this is not a loss to society
| but actually a gain.
| Barrin92 wrote:
| > A programmer can read available but not OSS-licensed code and
| learn from it. That's fair use.
|
| No it isn't, at least not automatically, which is why license
| infringement exists at all; the fact that you have a brain
| doesn't change that and never has. If you reproduce someone's
| code you can be in hot water, and the same should hold for the
| operator of a machine.
|
| It's also why the concept of a clean room implementation exists
| at all.
| EMIRELADERO wrote:
| I think the commenter you replied to was talking about using
| the functional, non-copyrightable elements of the copyrighted
| code. Clean-room is not even required by case law. There's
| precedent that _explicitly_ calls it out as inefficient.
|
| _More important, the rule urged by Sony would require that a
| software engineer, faced with two engineering solutions that
| each require intermediate copying of protected and
| unprotected material, often follow the least efficient
| solution (In cases in which the solution that required the
| fewest number of intermediate copies was also the most
| efficient, an engineer would pursue it, presumably, without
| our urging.) This is precisely the kind of "wasted effort
| that the proscription against the copyright of ideas and
| facts . . . [is] designed to prevent."_ (Sony v. Connectix)
| bdcravens wrote:
| In copyright cases, exposure to the material in question is
| always discussed.
| mkeeter wrote:
| Wine literally bans contributions from anyone that has seen
| Microsoft Windows source code:
|
| https://wiki.winehq.org/Developer_FAQ#Who_can.27t_contribute...
| c0balt wrote:
| Well, they are a special case here, since they don't solve a
| specific problem or build a program per se, but instead (re)build
| a program to existing specs. Their explicit goal is to match the
| behaviour of another piece of software with a translation layer.
|
| Forbidding people who have seen the "source" program is most
| likely meant to protect their version from going from "matching
| behaviour" to "behaving like", as in the same code. This might
| also be intended as a safeguard so that well-intentioned
| developers don't accidentally break their own (most likely
| existing) NDAs.
| bogwog wrote:
| > Today they're filing a lawsuit against copilot.
|
| > Tomorrow it will be against stable diffusion or (dall-e,
| gpt-3 whatever)
|
| > And then eventually against Wine/Proton and emulators (are
| APIs copyrightable)
|
| Textbook definition of F.U.D.
| laputan_machine wrote:
| Genuinely one of the worst takes I've ever read. I'm not
| against the 'slippery slope' argument in principle, but this
| example is ridiculous.
| mardifoufs wrote:
| Slippery slope? Are you familiar with judicial precedent? Being
| bound by precedent is central to common law legal systems, so I
| don't think the GP's take was so outlandish. "Slippery slope" and
| "whataboutism" might be thought-terminating buzzwords online, but
| not in front of a judge.
| ImprobableTruth wrote:
| In what way would this even remotely set a precedent for
| APIs?
| amelius wrote:
| > If a machine does it, is it wrong ? What is the line between
| copying and machine learning ?
|
| What is the difference between a neighbor watching you leave
| your home to visit the local grocery store and mass
| surveillance? Where do you draw the line?
|
| It is pretty simple, actually.
| whateveracct wrote:
| > A programmer can read available but not OSS-licensed code and
| learn from it. That's fair use. If a machine does it, is it
| wrong?
|
| Just because both activities are called "learning" does not mean
| they are the same thing. They are fundamentally, physically
| different activities.
| adlpz wrote:
| It feels weird saying this but, for once, I hope the big evil
| corporation gets to keep selling their big bad product.
|
| I find the pattern matching and repetitive code generation
| _really_ helpful. And the library autocomplete on steroids, too.
|
| Meh. Tricky subject.
| nrb wrote:
| Does anyone have a problem with it, so long as the material it
| was trained on was used with explicit permission/licensing and
| not potentially in violation of copyright?
|
| That's where the line is for it to be suspect, IMO.
| adlpz wrote:
| I guess I'm just afraid that it might not be as good as it is
| that way.
|
| It's a bit like how GPT-3, Stable Diffusion and all those
| generative models use extensive amounts of copyrighted
| material in training to get as good as they do.
|
| In those cases however the output space is so vast that
| plagiarism is _very_ unlikely.
|
| With code, not so much.
| jacobr1 wrote:
| GPT-3 and Stable Diffusion might not copy things exactly, but
| they certainly do copy "style". There are many articles like
| this:
|
| https://hyperallergic.com/766241/hes-bigger-than-picasso-
| on-...
|
| The interesting thing is that the names get explicitly
| attached to these styles. It isn't exactly a copyright
| issue, but I'm sure it will get litigated regardless.
| bjourne wrote:
| I think the prompt "GPT-3, tell me what the lyrics for the
| song Stan by Eminem is" is very likely to output
| copyrighted material. The same copyrighted material is, of
| course, already republished without permission on
| google.com.
| michaelmrose wrote:
| It being permissively licensed is virtually irrelevant, because
| only a minority of code is licensed so permissively that you can
| just do what you like with it. Far more is "do what you like
| within the scope of the license". For example, the GPL: do with
| it what you like, so long as any derivative work is also GPL.
| bogwog wrote:
| This is what I hope comes out of the lawsuit. If a company
| wants to sell an AI model, they need to own all of the
| training data. It can't be "fair use" to take other people's
| works at zero cost and use them to build a commercial product
| without compensation.
|
| And maybe models trained on public data should be in the
| public domain, so that AI research can happen without
| requiring massive investments to obtain the training data.
| bpicolo wrote:
| > It can't be "fair use" to take other people's works at zero
| cost and use them to build a commercial product without
| compensation.
|
| You just described open source software.
|
| That's the whole heart of this lawsuit, and equally
| Copilot. It was trained on OSS which is explicitly licensed
| for free use.
| bogwog wrote:
| Ok you got me, that wording was lazy on my part. But
| that's a really bad take on yours:
|
| > It was trained on OSS which is explicitly licensed for
| free use.
|
| That's not what the lawsuit is about. It's not about
| money, it's about licensing. OSS licenses have specific
| requirements and restrictions for using them, and Copilot
| explicitly ignores those requirements, thus violating the
| license agreement.
|
| The GPL, for example, requires you to release your own
| source code if you use it in a publicly-released product.
| If you don't do that, you're committing copyright
| infringement, since you're copying someone's work without
| permission.
| bpicolo wrote:
| Yeah, and I think that's fair re: licensing. Curious to
| see how it pans out.
| deathanatos wrote:
| Most companies building commercial products on top of
| FOSS _are_ obeying the license requirements. (I have been
| through due diligence reviews where we had to demonstrate that
| for each library/tool/package.)
|
| The same cannot be said for Copilot: there have been
| prior examples here on HN showing that it can emit large
| chunks of copyrighted code (without the license).
| [deleted]
| [deleted]
| xigoi wrote:
| > That's the whole heart of this lawsuit, and equally
| Copilot. It was trained on OSS which is explicitly
| licensed for free use.
|
| Most open-source software is not licensed for free use.
| MIT and GPL, the two most common licenses, both require
| attribution.
| MrStonedOne wrote:
| dmix wrote:
| TabNine has absolutely improved my life as a programmer.
| There's something really rewarding about having a robot read
| your mind for entire blocks of code.
|
| It's not just functions either, one of the most common things
| that it helps me with daily is simple stuff like this:
|
| Typing:
|
|     const x = {
|         a: 'one',
|         b: 'two',
|         ...
|     }
|
| And later I'll be typing:
|
|     y = [
|         a['one'],
|         b['    <-- it auto-completes the rest here
|     ]
|
| It's really amazing the amount of busy-work typing in
| programming that a smart pattern matching algo could help with.
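A runnable version of that completion pattern, for concreteness. This is a hypothetical reconstruction: the original snippet is abbreviated, so the lookup tables `a` and `b` and their contents are invented here purely to make the mirrored-keys pattern executable.

```javascript
// The kind of repetitive completion described above: an object
// literal is typed first, and a later expression can be finished by
// the autocomplete because it mirrors the earlier keys and values.
const x = {
  a: 'one',
  b: 'two',
};

// Invented lookup tables, keyed by x's string values.
const a = { one: 1 };
const b = { two: 2 };

// After typing "const y = [ a['", the completion fills in the rest:
const y = [a[x.a], b[x.b]]; // equivalent to [a['one'], b['two']]
```

The point is that the second expression carries no new information; it is fully determined by the first, which is why a pattern matcher can finish it.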
| bogwog wrote:
| I don't think this is a good example of the value of these
| things. You can just as easily do that same thing with
| advanced text editor features. Sublime for example supports
| multi-cursor editing. Just hold alt+shift+arrow keys to add a
| cursor, then type in the brackets you want. Ctrl+D can be
| used to select the next occurrence of the current selection
| with multiple cursors, built-in commands from the command
| pallete can do anything to your current selection (e.g.
| convert case), etc.
|
| All of that efficiency without having to pay a monthly
| subscription, wasting electricity on some AI model, and
| worrying about the legal/moral implications.
| ChrisLTD wrote:
| Multiple cursors won't do what the parent comment is talking
| about without a lot more work.
| bogwog wrote:
| Why? You can copy and paste the entire section, and use
| multiple cursors to add in the brackets.
|
| Going from "a: 'one'," to "a['one']," just requires you to add
| two brackets and remove the colon. With multiple cursors you can
| do that exact same operation on all lines in a few keystrokes.
| yamtaddle wrote:
| It's having to go find the other block you want, copy and
| paste it, and then set up the multiple cursors and type,
| versus it just happening automatically without any of
| that.
| dmix wrote:
| I've used Vim for over a decade I know what it can do.
|
| This is automated and happens immediately without you even
| thinking about it.
|
| You only ever pull out the complicated Vim editing when you
| have a particular hard task, I'm talking about the small
| stuff many times a day.
| Cloudef wrote:
| Unless the copilot spits out complete programs or libraries that
| are 1:1 to someone elses who cares? Caring about random small
| code snippets is dumb.
| [deleted]
| [deleted]
| [deleted]
| hu3 wrote:
| As a GitHub user, is there a way to support GitHub against this
| lawsuit?
|
| Obviously not financially as Microsoft has basically YES amounts
| of money.
| michaelmrose wrote:
| If you had legal expertise and a strong opinion on the matter I
| suppose you could write a persuasive brief for the
| consideration of the court. If you have a strong opinion but
| aren't a legal eagle you could write to your legislators in
| support of legislation explicitly supporting this use case or
| organize the support of people more capable in that arena.
|
| If you are opinionated but lazy, no judgement here as I sit
| here watching TV, you could add a notation at the top of your
| repos explicitly supporting the usage of your code in such
| tools as fair use.
|
| Notably, if your code is derivative of other works, you have no
| power to grant permission for such use of code you don't own, so
| best include some weasel words to that effect. Say:
|
| I SUPPORT AND EXPLICITLY GRANT PERMISSION FOR THE USAGE OF THE
| BELOW CODE TO TRAIN ML SYSTEMS TO PRODUCE USEFUL HIGH QUALITY
| AUTOCOMPLETE FOR THE BETTERMENT AND UTILITY OF MY FELLOW
| PROGRAMMERS TO THE EXTENT ALLOWABLE BY LICENSE AND LAW. NOTHING
| ABOUT THIS GRANT SHALL BE CONSTRUED TO GRANT PERMISSION TO ANY
| CODE I DO NOT OWN THE RIGHTS TO NOR ENCOURAGE ANY INFRINGING
| USE OF SAID CODE.
|
| Years from now when such cases are being heard and appealed ad
| nauseam a large portion of repos bearing such notices may
| persuade a judge that such use is a desired and normal use.
|
| You could even make a GPL-esque modification, if you were so
| inclined, where you said: SO LONG AS THE RESULTING TOOLING AND
| DATA IS MADE AVAILABLE TO ALL.
|
| Note not only am I not your lawyer, I am not a lawyer of any
| sort so if you think you'll end up in court best buy the time
| of an actual lawyer instead of a smart ass from the internet.
| m00x wrote:
| The only people who gain from class action lawsuits are the
| lawyers.
|
| This person (a lawyer) saw an opportunity to make money and
| jumped on it like a hungry tiger on fresh meat.
| [deleted]
| tasuki wrote:
| I have quite a bit of respect for Matthew Butterick. I don't
| think he's just a lawyer looking to earn a quick buck. He cares
| about software and wants to make the world a better place.
|
| > But neither Matthew Butterick nor anyone at the Joseph Saveri
| Law Firm is your lawyer
|
| This is curious. None of them are _my_ lawyers, but surely at
| least some of them are _someone 's_ lawyers? Isn't it wrong to
| put such a blanket disclaimer on a website which might well be
| read by their clients?
| alsodumb wrote:
| This. I've seen so many class action lawsuits where, at the end
| of the day, the highest per capita gain ends up going to the
| lawyers. Fuck this guy and everyone trying to make money from
| this.
| alpaca128 wrote:
| So he gets to make money with his profession while defending
| OSS licenses? I don't see the big problem.
| cmrdporcupine wrote:
| If Microsoft is so confident in the legality and ethics of
| Copilot, and that it doesn't leak or steal proprietary IP... they
| should go train it on the MS Word and Windows and Excel source
| trees.
|
| What's that? They don't want to do that? Why not?
| atum47 wrote:
| Forgive my ignorance, but who is going to benefit from this
| lawsuit? I have a lot of code on GitHub, can I, for instance,
| expect a check in the mail in case of a win?
| gpm wrote:
| (Not a lawyer, so this is really definitely absolutely not
| legal advice and if you're looking to profit you should speak
| to a lawyer... for instance the lawyers who just filed the
| lawsuit)
|
| They're asking for two things, injunctive relief (ordering
| github/openai/microsoft to stop doing this) and damages.
|
| I suppose the injunctive relief really benefits anyone who
| doesn't want AI models to exist, because that's what it's
| asking for.
|
| The damages will go the members of the class certified for
| damages, with more going to the lead plaintiffs (those actually
| involved in the suit) and some going to the lawyers. They're
| asking for the following class definition for damages
|
| > All persons or entities domiciled in the United States that,
| (1) owned an interest in at least one US copyright in any work;
| (2) offered that work under one of GitHub's Suggested Licenses;
| and (3) stored Licensed Materials in any public GitHub
| repositories at any time during the Class Period.
| Imnimo wrote:
| On page 18, they show Copilot produces the following code:
|
| > function isEven(n) {
| >     return n % 2 === 0;
| > }
|
| They then say, "Copilot's Output, like Codex's, is derived from
| existing code. Namely, sample code that appears in the online
| book Mastering JS, written by Valeri Karpov."
|
| Surely everyone reading this has written that code verbatim at
| some point in their lives. How can they assert that this code is
| derived specifically from Mastering JS, or that Karpov has any
| copyright to that code?
| lelandfe wrote:
| They determined the other `isEven()` function was cribbed from
| Eloquent Javascript because of matching comments. I wonder if
| the complaint just left off telltale comments from that
| Mastering JS one?
| Imnimo wrote:
| Yeah, the other one I found much more persuasive. The extra
| comments were unequivocally reproduced from the claimed source
| (although that output was from Codex rather than Copilot).
| bogwog wrote:
| That seems like a really bad choice of an example for this, but
| as I haven't read the document I don't have any other context
| beyond what you've posted here, I have to take your word for it
| that that's the purpose of this snippet.
|
| However, if you are looking to understand the reasoning behind
| this lawsuit, there are lots of better examples online where
| Copilot blatantly ripped off open source code.
| counttheforks wrote:
| I wrote that exact function the other day, and I've never even
| heard of that book.
| eddsh1994 wrote:
| Yep, same. Not in JS, but in Haskell, for the even-Fibonacci
| Project Euler problem. Something like a million people have
| submitted correct answers for that problem, and assuming half
| wrote their own filter rather than importing an isEven
| library, that's half a million people right there.
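The kind of independently rewritten code being described here (Project Euler problem 2, summing the even Fibonacci numbers below four million) almost forces a hand-rolled evenness check. A purely illustrative sketch in JavaScript, not code from the complaint or from any commenter:

```javascript
// Project Euler problem 2: sum the even-valued Fibonacci terms that do
// not exceed four million. An isEven helper like the one at issue in the
// complaint falls out almost unavoidably.
function isEven(n) {
  return n % 2 === 0;
}

function sumEvenFibonacci(limit) {
  let sum = 0;
  let a = 1, b = 2; // consecutive Fibonacci terms
  while (a <= limit) {
    if (isEven(a)) sum += a;
    [a, b] = [b, a + b];
  }
  return sum;
}

console.log(sumEvenFibonacci(4000000)); // 4613732
```

Any of the half-million solvers could plausibly produce a byte-identical `isEven`, which is the commenters' point about how thin the copyright claim on that snippet is.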
| chowells wrote:
| You don't need to write your own or import a library for
| that in Haskell. It's in the Prelude.
| moffkalast wrote:
| I'd hire a legal team if I were you, the injunction is on the
| way. /s
| 0cf8612b2e1e wrote:
| Should have used snake case. Would have avoided legal hot
| water and established precedent.
| williamcotton wrote:
| There is no way in hell that isEven is covered by copyright.
|
| "In computer programs, concerns for efficiency may limit the
| possible ways to achieve a particular function, making a
| particular expression necessary to achieving the idea. In this
| case, the expression is not protected by copyright."
|
| https://en.wikipedia.org/wiki/Abstraction-Filtration-Compari...
|
| Think about how absurd this is. So if Microsoft was the first
| company to write and publish an isEven function then no one
| else can legally use it?
| Phrodo_00 wrote:
| > There is no way in hell that isEven is covered by
| copyright.
|
| Hey, I said the same thing about APIs, but here we are.
|
| Edit: Actually, the Supreme Court declined to rule on whether
| APIs are copyrightable, but they did say that if they are,
| reusing them the way Google reused the Java APIs in Android
| would fall under fair use. Given that lower courts did think
| that APIs should be copyrightable, we still don't know whether
| they are.
| kevin_thibedeau wrote:
| There are software patents on bit twiddling operations that
| people do end up having to work around.
| tiahura wrote:
| They do because it's cheaper to hire a coder to twiddle
| than a lawyer to litigate.
| CrazyStat wrote:
| Patents and copyrights are completely different things.
| eurasiantiger wrote:
| Does that mean any perfectly optimal function is copyright-
| free?
| bawolff wrote:
| Any function devoid of "creativity" is. No choices means no
| creativity.
|
| As a note, the same applies to logos: very simple logos that
| are only a few lines and shapes do not have copyright (in the
| USA).
| squokko wrote:
| Logos can still have trademark without having copyright
| as creativity is not a requirement of trademarks.
| leepowers wrote:
| It's possible the complaint is using a trivial example to
| illustrate the type of argument plaintiffs want to make during
| any trial. A 200-line example is too unwieldy for non-
| programmers to digest, especially given the formatting
| constraints of a legal brief.
|
| Look at paragraphs 90 and 91 on page 27 of the complaint[1]:
|
| "90. GitHub concedes that in ordinary use, Copilot will
| reproduce passages of code verbatim: "Our latest internal
| research shows that about 1% of the time, a suggestion [Output]
| may contain some code snippets longer than ~150 characters that
| matches" code from the training data. This standard is more
| limited than is necessary for copyright infringement. But even
| using GitHub's own metric and the most conservative possible
| criteria, Copilot has violated the DMCA at least tens of
| thousands of times."
|
| Does distributing licensed code without attribution on a mass
| scale count as fair use?
|
| If Copilot is inadvertently providing a programmer with
| copyrighted code, is that programmer and/or their employer
| responsible for copyright infringement?
|
| There are a lot of interesting legal complications here that I
| think the courts will want to adjudicate.
|
| [1]
| https://githubcopilotlitigation.com/pdf/1-0-github_complaint...
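As an aside, the ~150-character threshold GitHub cites in that quoted passage is the kind of thing that can be checked mechanically. A naive sketch of such a detector (hypothetical and invented here; GitHub's actual internal measurement method is not public):

```javascript
// Naive check for the kind of overlap GitHub's metric describes: does
// `suggestion` contain a verbatim run of at least `minLen` characters
// that also appears somewhere in `trainingText`? Quadratic and purely
// illustrative; a real system would use suffix structures or hashing.
function hasVerbatimOverlap(suggestion, trainingText, minLen = 150) {
  for (let i = 0; i + minLen <= suggestion.length; i++) {
    if (trainingText.includes(suggestion.slice(i, i + minLen))) {
      return true;
    }
  }
  return false;
}
```

With `minLen` lowered for the sake of a small example, `hasVerbatimOverlap('xx return n % 2 === 0; yy', 'function isEven(n) { return n % 2 === 0; }', 20)` reports a match.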
| schleck8 wrote:
| > Surely everyone reading this has written that code verbatim
| at some point in their lives
|
| Ironically their Twitter account uses a screenshot from a TV
| series as profile picture. I wonder how legal that is, even if
| meant as a joke.
|
| https://twitter.com/saverlawfirm
|
| Edit: It's been changed 2 minutes after I wrote this comment
| zeven7 wrote:
| This comment is 1 minute old and I only see a plain black
| profile picture.
|
| Or is your comment itself the joke?
| schleck8 wrote:
| They changed it, I'm 100% sure. The profile picture was
| Saul from Breaking Bad. I assume they read the comments
| here and changed it in a matter of one or two minutes.
| hdjjhhvvhga wrote:
| Is there a Wayback Machine for Twitter?
| [deleted]
| nikanj wrote:
| This reminds me of the SCO vs Linux lawsuits.
| clusterhacks wrote:
| Did Microsoft use the source code of Windows (in whole or in
| part) as training input to Copilot?
| renewiltord wrote:
| It doesn't make sense. If I make a piece of software that curls
| a random gist and then puts it into your editor, am I
| infringing? Or are you infringing when you run it, or when you
| use that file and distribute it somewhere?
| lbotos wrote:
| > If I make a piece of software that curls a random gist and
| then puts it into your editor am I infringing
|
| Depends on the license. If it's MIT and you serve the license,
| no, you are not infringing at all. A trimmed version of MIT for
| the relevant bits:
|
| Permission is hereby granted [...] to any person obtaining a
| copy of this software [...] to use, copy, modify, merge,
| publish, distribute, sublicense, and/or sell copies of the
| Software, [...] subject to the following conditions:
|
| The above copyright notice and this permission notice shall be
| included in all copies or substantial portions of the Software.
|
| > are you infringing when you run it
|
| Depends on the license
|
| > are you infringing when you use that file and distribute it
| somewhere
|
| Depends on the license
|
| ----
|
| When copilot gives you code without the license, you can't even
| know!
| renewiltord wrote:
| Well, `curl` will download a gist without checking its
| license. So curl is infringing?
| deanjones wrote:
| This will fail very quickly. The licence that project owners
| publish with their code on Github applies to third parties who
| wish to use the code, but does not apply to Github. Authors who
| publish their code on Github grant Github a licence under the
| Github Terms: https://docs.github.com/en/site-policy/github-
| terms/github-t...
|
| Specifically, sections D.4 to D.7 grant Github the right to "to
| store, archive, parse, and display Your Content, and make
| incidental copies, as necessary to provide the Service, including
| improving the Service over time. This license includes the right
| to do things like copy it to our database and make backups; show
| it to you and other users; parse it into a search index or
| otherwise analyze it on our servers; share it with other users;
| and perform it, in case Your Content is something like music or
| video."
| mldq wrote:
| This is the standard content display license that everyone
| uses. Even in your quoted text I don't see any hint that
| snippets can be shown without attribution or the code license.
|
| It also says they can't sell the code, which CoPilot is doing.
|
| Also, in a very high number of cases it isn't the author who
| uploads.
|
| Repeating your line of argumentation (which occurs in every
| CoPilot thread) does not make it true.
| deanjones wrote:
| It's irrelevant whether it's standard or not. Again, the
| terms in the code licence (including attribution) do not
| apply to Github, because that is not the licence under which
| they are using the code. You grant them a separate licence
| when you start using their service.
|
| If someone who isn't the author has uploaded code which they
| do not have a right to copy, they are liable, not Github.
| This is also clear from the Github Terms: "If you're posting
| anything you did not create yourself or do not own the rights
| to, you agree that you are responsible for any Content you
| post"
|
| It's almost as if these highly paid lawyers know what they're
| doing.
| lpolk wrote:
| You grant them a content display license, not a general
| code license.
|
| > It's almost as if these highly paid lawyers know what
| they're doing.
|
| Sure, they wrote the content display license long before
| CoPilot even existed. Any court will see the intent and not
| interpret these terms as a code re-licensing.
| deanjones wrote:
| There is no such thing as a "content display licence" or
| "general code licence". There is copyright (literally,
| the right to make copies) which broadly lies with the
| author, who can then grant other parties a licence to
| copy their content.
|
| I'm afraid I do not believe your legal expertise is so
| extensive that you are able to accurately predict the
| judgement of "any court".
| xigoi wrote:
| > You grant them a separate licence when you start using
| their service.
|
| And that license explicitly states that it doesn't give
| them the right to sell your code.
| klabb3 wrote:
| > Authors who publish their code on Github grant Github a
| licence under the Github Terms:
| https://docs.github.com/en/site-policy/github-terms/github-t...
|
| This sounds unenforceable in the general case. How could github
| know whether someone pushes their own code or not? Is it a
| license violation to push someone's FOSS code to github because
| the author didn't sign up with GH?
| acdha wrote:
| I don't see that being "quickly" - they'd have to get a judge
| to agree that passing your code off without attribution for
| other people to use as their own work is a normal service
| improvement. Given that it's a separate feature with different
| billing terms, I'm skeptical that it's anywhere near the given
| that you're portraying it as.
| deanjones wrote:
| "Without attribution" is a condition of the licence that
| applies to third-parties. It is not a condition of the
| licence that applies to Github.
| TAForObvReasons wrote:
| It's worth reading the passage in its entirety and how a
| court would interpret it:
|
| > We need the legal right to do things like host Your
| Content, publish it, and share it
|
| > This license does not grant GitHub the right to sell Your
| Content. It also does not grant GitHub the right to
| otherwise distribute or use Your Content outside of our
| provision of the Service, except that as part of the right
| to archive Your Content, GitHub may permit our partners to
| store and archive Your Content in public repositories in
| connection with the GitHub Arctic Code Vault and GitHub
| Archive Program.
|
| If Copilot is straight-up reproducing work, and it is a
| service that users have to pay to use, then it seems like
| Copilot is "sell[ing] your content" and thus the license
| does not apply.
|
| More generally, a court is likely to look at the plain
| English summary and judge. Copilot is not an integral part
| of "the service" as developers understood it before Copilot
| existed.
| deanjones wrote:
| "as necessary to provide the Service, including improving
| the Service over time."
| lamontcg wrote:
| You're trying to play desperate semantic games.
|
| "This license does not grant GitHub the right to sell
| Your Content" is unambiguously clear.
| deanjones wrote:
| "desperate semantic games" is actually a reasonable
| description of the legal process :-)
|
| I'm not sure I agree that anything expressed in a legal
| contract using natural language is "unambiguously clear".
| MS/GitHub's expensively-attired lawyers will no doubt
| forcefully argue that they are not selling YOUR content,
| but rather a service based on a model generated from a
| large collection of content, which they have been granted
| a licence to "parse it into a search index or otherwise
| analyze it on our servers". There may even be in-court
| discussion of generalization, which will be exciting.
| sigzero wrote:
| If that is pretty much verbatim under their terms, then yes the
| lawsuit is going nowhere.
| nullc wrote:
| I think if this is successful it will be very bad for the open
| world.
|
| Large platforms like github will just stick blanket agreements
| into the TOS which grant them permission (and require you
| indemnify them for any third party code you submit). By doing so
| they'll gain a monopoly on comprehensively trained AI, and the
| open world that doesn't have the lever of a TOS will not at all
| be able to compete with that.
|
| Copilot has seemed to have some outright copying problems,
| presumably because it's a bit over-fit (perhaps to work at all
| it must be, because it's just failing to generalize enough at
| the current state of development) --- but I'm doubtful that this
| litigation could distinguish the outright copying from training
| in a way that doesn't substantially infringe any copyright
| protected right (e.g. where the AI learns the 'ideas' rather than
| verbatim reproducing their exact expressions).
|
| The same goes for many other initiatives around AI training
| material-- e.g. people not wanting their own pictures being used
| to train facial recognition. Litigating won't be able to stop it
| but it will be able to hand the few largest quasi-monopolists
| like facebook, google, and microsoft a near monopoly over new AI
| tools when they're the only ones that can overcome the defaults
| set by legislation or litigation.
|
| It's particularly bad because the spectacular data requirements
| and training costs already create big centralization pressures in
| the control of the technology. We will not be better off if we
| amplify these pressures further with bad legal precedents.
| barelysapient wrote:
| MSFT to $0 anyone?
| EMIRELADERO wrote:
| I think it's a great time to explain why this won't hit AI art
| such as Stable Diffusion, even if GitHub loses this case.
|
| The crux of the lawsuit's argument is that the AI unlawfully
| _outputs copyrighted material_. This is evident in many tests
| with many people here and on Twitter even getting _verbatim
| comments_ out of it.
|
| AI art, on the other hand, is not capable of outputting the
| images from its training set, as it's not a collage-maker, but
| an artificial brain with a paintbrush and virtual hand.
| PuddleCheese wrote:
| These models can actually output images that can be extremely
| close to material present in the training data:
|
| - https://i.imgur.com/VikPFDT.png
|
| I also don't know if I would anthropomorphize ML to that
| degree. It's a poor metaphor and isn't really analogous to a
| human brain, especially considering our current understanding,
| or lack thereof, of the brain, and even the limited insight we
| have into how some of these models work from the people who
| work on them.
| jrochkind1 wrote:
| Eh... I don't know. It sounds to me like you are saying that
| because the code example outputs _exact_ lines, it's a
| copyright violation; but the image AIs necessarily don't output
| exact copies of even portions of pre-existing images, because
| that's not how they work.
|
| But I don't think copyright on visual images actually works
| like that, that it needs to be an _exact_ copy to infringe.
|
| If I draw my own pictures of Mickey Mouse and Goofy having a
| tea party, it's still a copyright infringement if it is
| _substantially similar_ to copyrighted depictions of Mickey
| Mouse and Goofy. (Subject to fair use defenses; I'm allowed to
| do what would otherwise have been a copyright infringement if
| it meets a fair use defense, which is also not cut and dried,
| but if it's, say, a parody it's likely to be fair use. There
| is probably a legal argument that Copilot is fair use... the
| more money GitHub makes on it, the harder that argument gets,
| though making money off something is not relevant to whether
| it's a copyright violation in the first place, only to a fair
| use defense.)
|
| (Yes, it might also be a trademark infringement; but there's a
| reason Disney is so concerned with copyright on Mickey
| expiring, and it's not that they think there's lots of money to
| be made selling copies of the specific Steamboat Willie
| movie...)
|
| > There is actually no percentage by which you must change an
| image to avoid copyright infringement. While some say that you
| have to change 10-30% of a copyrighted work to avoid
| infringement, that has been proven to be a myth. The standard
| is whether the artworks are "substantially similar," or a
| "substantial part" has been changed, which of course is
| subjective.
|
| https://www.epgdlaw.com/how-can-my-artwork-steer-clear-of-co...
|
| I think Stable Diffusion etc are quite capable of creating art
| that is "substantially similar" to pre-existing art.
| EMIRELADERO wrote:
| I believe fair use is the way to go, then. SD would definitely
| qualify, in my opinion.
| solomatov wrote:
| IMO, the case is exactly the same for copilot and generative
| models for images. That's why it's so important to have some
| precedent as a guide for future products.
|
| P.S. I am not a lawyer.
| kmnc wrote:
| I don't understand this argument... if image AI gets good
| enough, then generating exact copies of its training data
| seems trivial.
| warbler73 wrote:
| It seems obvious that AI models are derivative works of the
| works they are trained on, but it also seems obvious that it is
| totally legally untested whether they are derivative works in
| the formal legal sense of copyright law. So it should be a good
| case, _assuming_ we have wise and enlightened judges who
| understand all the nuances and can guide us into the future.
| elcomet wrote:
| This is why we can't have nice things. Copilot is the best
| thing to happen to developer tools in a long time; it has
| increased my productivity a lot. Please don't ruin it.
| rafaelturk wrote:
| Like everything legal, this is not about open-source fairness
| or protecting innovation; it's all about making money.
| awestroke wrote:
| If this leads anywhere I'll be pissed. I love CoPilot.
| yamtaddle wrote:
| I expect I'd love it but I've been holding off until I find out
| whether MS lets devs on their core products use it.
|
| If not, it's a pretty clear sign they consider it radioactive.
| an1sotropy wrote:
| copilot is great, and ignorance is bliss, isn't it
|
| The situation that this lawsuit is trying to save you from is
| this: (1) copilot blurps out some code X that you use, and then
| redistribute in some form (monetized or not); (2) it turns out
| company C owns copyright on something Y that copilot was
| trained on, and then (3) C makes a strong case that X is part
| of Y, and that your use of X does not fall under "fair use",
| i.e. you infringed on the licensing terms that C set for Y.
|
| You are now in legal trouble, and Copilot put you there,
| because it never warned you that X is part of Y, and that Y
| comes with such-and-such licensing terms.
|
| Whether we like Copilot or not, we should be grateful that
| this case is seeking to clarify some things that are currently
| legally untested. Microsoft's assertions may muddy the waters,
| but that doesn't make law.
| foooobaba wrote:
| It seems like we should come to agreement on what the license
| is intended for, given that the licenses were created in a time
| before AI like this existed. If the authors did not intend
| their code to be used like this, should we not respect that?
| Also, does it make sense to create new licenses which
| explicitly state whether use for AI training is acceptable or
| not - or are our current licenses good enough?
| herpderperator wrote:
| The title of the submitted PDF document: "Microsoft Word -
| 2022-11-02 Copilot Complaint (near final)"[0]
|
| I've noticed this a lot and it's quite funny seeing what the
| actual filename of the document was. Does this just get included
| as metadata by default when you export to PDF?
|
| [0]
| https://githubcopilotlitigation.com/pdf/1-0-github_complaint...
| mirekrusin wrote:
| They should use github instead of sending "(final, 2nd
| revision, really final, amended)" emails.
| D13Fd wrote:
| If only you could, with Word docs. Sadly you can't in any
| meaningful way.
| tasuki wrote:
| The typography on that document is not great. Perhaps they
| should read Matthew Butterick's book?
| senkora wrote:
| It does, yes. It's very annoying and I have occasionally
| stripped it off of PDFs I've made, using exiftool.
| bombcar wrote:
| In word you can go to document properties or whatever and set
| the Title and some other fields to control what gets into the
| PDF.
| SurgeArrest wrote:
| I hope this case will fail and establish a good precedent for all
| future AI litigation, and maybe even prevent new suits. Your code
| is open source - irregardless of license, one might read it as a
| text book and then remember or even copy snippets and re-use this
| somewhere else unrelated to the original application. If you
| don't like this, don't make your code open source. This was
| happening, and is still happening, independent of any license,
| all over the world, by the majority of developers. What Copilot
| and similar tools did was make those snippets accessible for
| extrapolation in new applications.
|
| If these folks win - we again throw progress under the bus.
| humanwhosits wrote:
| > irregardless of license
|
| Hard no. Please stop using open source code if this is how you
| think of it.
|
| Without licenses being respected, we don't get open source
| communities.
| vesinisa wrote:
| Open source does not mean public domain. Open source
| specifically attaches limitations on how the code may be
| reused.
| elcomet wrote:
| There are no limitations on reading the code to learn from
| it.
| MontagFTB wrote:
| Perhaps the lawsuit contends that Copilot isn't in fact
| learning how to code, but is rather regurgitating
| information it has managed to glean and statistically
| categorize, without any real understanding as to what it
| was doing?
| simion314 wrote:
| > Your code is open source ....
|
| So why can MS screw around only with some licenses, the ones
| you call "open source"? Your example of a human reading a book
| would also work with source-available licenses or decompiled
| binaries.
|
| I would have been fine if the open source code had been used
| to create an open model, or if MS had put its ass on the line
| and also trained the model on all the GitHub code, since they
| claim there is no copyright issue.
| tfsh wrote:
| If organisations are going to ignore the licenses attached to
| my OSS, and that's legitimised in law, then that's a surefire
| way to irreparably damage the open source ecosystem.
| solomatov wrote:
| The problem is that copyright laws were introduced for a
| reason, and with thinking similar to yours we might decide to
| get rid of copyright altogether, which I think is a bad idea.
|
| P.S. I am not a lawyer.
| [deleted]
| Etheryte wrote:
| > Your code is open source - irregardless of license, one might
| read it as a text book and then remember or even copy snippets
| and re-use this somewhere else unrelated to the original
| application.
|
| Yes, but attribution should still be given. Just because you
| don't copy-paste someone else's creation doesn't mean you're
| licensed to use it.
| shagie wrote:
| Is it the role of the tool (in this case copilot) to include
| the license information? Or is it the responsibility of the
| organization using the code to make sure that it wasn't
| copied from somewhere?
|
| What if, instead of a tool, you had a random consultant do
| some work, and it was found out that he asked a ton of stuff
| on Stack Overflow and copied the CC-BY-SA 4.0 answers into
| his work? What if it was then found out that one of _those_
| answers was based on copying something from the Linux kernel?
| Who is responsible for doing the license check on the code
| before releasing the product?
| alpaca128 wrote:
| > Or is it the responsibility of the organization using the
| code to make sure that it wasn't copied from somewhere?
|
| Do you know whether the code you got from Copilot has an
| incompatible license? No, so if you plan to use Copilot for
| serious projects you need it to include sources/licenses
| either way. In fact that would be a very helpful feature as
| it would let you filter licenses.
| jacooper wrote:
| No thank you. I put a license to be followed, not to just be
| disregarded by an AI as "Learning material". No human perfectly
| reproduces their learning material no matter what, but Copilot
| does.
| mcluck wrote:
| You mean to tell me that no one has ever perfectly replicated
| an example that they read somewhere? There are only so many
| ways to write AABB collision, Fibonacci, or any number of
| other common algorithms. I'm not saying there aren't things
| to consider, but I'm sure I've perfectly replicated something
| I read somewhere, whether I'm actively aware of it or not.
| IshKebab wrote:
| So are you ok with it being illegal for humans to learn from
| copyrighted books unless they have a license that explicitly
| allows learning? That does not sound like a pleasant
| consequence.
| alpaca128 wrote:
| Would you use an AI text generator to write a thesis? No,
| there's a risk a whole chunk of it will be considered
| plagiarism because you have no idea what the source of the
| AI output is, but you know it was trained with unknown
| copyrighted material. This has nothing to do with the way
| humans learn, it's about correct attribution.
|
| There is no technical reason why Microsoft can't respect
| licenses with Copilot. But that would mean more work and
| less training input, so they do code laundering and excuse
| it with comparisons to human learning because making AI
| seem more advanced than it is has always worked well in
| marketing.
|
| Edit: And where do you draw the line between "learning" and
| copying? I can train a network to exactly reproduce
| licensed code (or books, or movies) just like a human can
| memorize it given enough time - and both of those would be
| considered a copyright violation if used without correct
| attribution. If you trained an AI model with copyrighted
| data you will get copyrighted results with random variation
| which might be enough to become unrecognizable if you're
| lucky.
| codyb wrote:
| I doubt it, but they'd probably be against people quoting
| copyrighted material verbatim without attribution in their
| own work after.
| Veen wrote:
| It's a pleasant consequence for the person who spent years
| becoming an expert and then writing the book. It's also a
| pleasant consequence for the people who buy the book, which
| might not have existed without a copyright system to
| protect the writer's interests.
| MontagFTB wrote:
| I think they're taking issue with the unauthorized
| duplication of copyrighted code. That's distinct from
| learning how to code (which I don't think anyone would
| claim Copilot is doing) which people get from reading a
| book. If you were to read the book only to copy it verbatim
| and resell it, you're going to have a bad time.
| test098 wrote:
| Here's the thing - the US has well-established laws around
| copyright that don't consider learning from books a
| violation of those copyrights. This lawsuit is intended to
| challenge Copilot as a violation of licensing and isn't a
| litigation of "how people learn." Your program stole my
| code in violation of my license - there's a clear legal
| issue here.
|
| I'd pose a question to you - would it be okay for me to
| copy/paste your code verbatim into my paid product in
| violation of your license and claim that I'm just using it
| for "learning"?
| bun_at_work wrote:
| AI are not humans, no human can read _all_ the code on
| Github. They certainly can't read _all_ the code on Github
| at the scale that MS can, and are unlikely to be able to
| extract profits directly from that code, in violation of
| the licensing.
| celestialcheese wrote:
| Maybe I'm being too cynical, but this feels like it's more a law
| firm and individual looking to profit and make their mark in
| legal history rather than an aggrieved individual looking for
| justice.
|
| Programmer/Lawyer Plaintiff + upstart SF Based Law Firm + novel
| technology = a good shot at a case that'll last a long time, and
| fertile ground to establish yourself as experts in what looks to
| be a heavily litigated area over the next decade+.
| squokko wrote:
| Just like good people can try to do good things and end up
| screwing things up badly, bad people can do bad things that
| have positive effects.
| efitz wrote:
| I fail to see the positive effect here.
|
| Just like Google's noble but misguided attempt to make all
| the world's books searchable a few years back, what we have
| here is IP law getting in the way of a societal goodness.
|
| Copyright and patent are not natural; they're granted by law
| "to promote progress in the useful arts". At first glance
| here it appears that GitHub is promoting progress and the
| plaintiffs are just rent-seeking.
| undoware wrote:
| If it wasn't Butterick I wouldn't be interested.
|
| But I write this to you in Hermes Maia
| jedberg wrote:
| As my lawyer friend told me, a class action lawsuit is a
| lawyer's startup. A lot of work for little pay with the chance
| of a huge payout.
| dkjaudyeqooe wrote:
| But who cares? Who else is willing to fund litigation on this
| important legal question? The real justice here is declarative
| and benefits everyone.
|
| No matter who litigates and for what reasons it will be
| extremely valuable for good precedents to be set around the
| question of things like Copilot and DALL-E with respect to
| copyright and ownership. I'd rather have self interested
| lawyers dedicated to winning their case than self interested
| corporations fighting this out.
| sam345 wrote:
| yes, of course that's what it is. plaintiffs if they win will
| get a few pennies, lawyers will get a lot.
| AuryGlenz wrote:
| I brought a class action suit against Sharp and I was the class
| representative. They settled. The judge awarded me a whopping
| $1,000 from the settlement money. From the time I put into it,
| including 3 or 4 full days in NYC because my deposition
| coincided with a snowstorm, I didn't exactly come out ahead
| financially.
|
| Obviously this is different for the reasons you stated, but I
| didn't want people to think bringing a class action lawsuit
| forward is a way to get rich. It's a bit of a joke, really.
| varispeed wrote:
| > rather than an aggrieved individual looking for justice.
|
| How can an aggrieved individual seek justice from a big
| multinational corporation? That's not possible unless that
| individual is a retired billionaire wanting to become a
| millionaire.
| grogenaut wrote:
| I have a friend from high school who does class action
| lawsuits. He spends a very large amount of money funding his
| suits on things like expert witnesses, and only 1 in 5 pays
| off, so the wins have to pay off well. His model is similar to
| venture capital. Most of these cases take 5-7 years to
| execute, so he basically takes out loans from another lawyer
| to fund them. His average pay over the last 10 years has been
| around $140k/year. Some years he makes nothing and pays out a
| lot; others he makes several million and pays back all the
| loans. Another way to think of it is like the payouts to tax
| fraud whistleblowers.
|
| Yes, he does think of it somewhat like that: establishing
| himself in an area. However, a lot of his work comes from him
| finding people aggrieved by something, not them finding him.
| [deleted]
| iudqnolq wrote:
| One of the core principles of the American system of government
| is that we outsource enforcement to private parties. Instead of
| the public needing to fund enforcement with tax dollars private
| parties undertake risky litigation in exchange for the chance
| of a big payoff.
|
| There is a reasonable argument that's a horrible system. But it
| doesn't make sense to criticize the plaintiff looking for a
| profit - the entire system has been set up such that that's
| what they're supposed to do. If you're angry about it lobby for
| either no rules or properly funded government enforcement of
| rules.
| thaumasiotes wrote:
| > If you're angry about it lobby for either no rules or
| properly funded government enforcement of rules.
|
| No, there are plenty of other changes you might want to see.
|
| For example, in the American system, judges are generally not
| allowed to be aware of anything not mentioned by a party to
| the case. There is no good reason for this.
| onlycoffee wrote:
| It's the two words, "government enforcement", that bothers
| me. If your party is in control the words sound fine,
| otherwise, they sound ominous.
| nicoburns wrote:
| Are you against policing? Because that's government
| enforcement. Admittedly policing in the US is god awful,
| but I still think most people would rather have it than no
| police force at all.
|
| Government enforcement of this kind of law is really no
| different. It wouldn't be the legislature doing it.
| falcolas wrote:
| In an ideal situation, the enforcement would be managed by
| boring employees who don't much care who's in power, since
| they're not appointed.
|
| AKA a vast majority of the non-legislative government
| workers.
| celestialcheese wrote:
| That's entirely fair - and I'm not angry, just not convinced
| in their arguments, especially when the motive is likely not
| genuine.
|
| As an aside - I'm almost positive MSFT/Github expected this
| and their legal teams have been prepping for this moment.
| Copyright Law and Fair Use in the US is so nuanced and vague
| that anything created involving prior art by big-pocket
| individuals or corporations will be litigated swiftly.
|
| I expected one of these lawsuits to come first from Getty or
| one of the big money artist estates against OpenAI or
| Stability.ai, but Getty and OpenAI seem to be partnering
| instead of litigating.
| cube00 wrote:
| Sounds like healthcare
| lovich wrote:
| > But it doesn't make sense to criticize the plaintiff
| looking for a profit...
|
| I don't know man, I can simultaneously see the systemic issue
| that needs to be solved and also critique someone for
| succumbing to base needs like greed when they don't have the
| need.
| CobrastanJorji wrote:
| What they're doing is a service, though. Say that $10
| million worth of damage against others has been done. If
| the law firm does not act, the villainous curs who caused
| that damage get to keep their money and are incentivized to
| do it again. If the law firm does act and prevails, then
| the villains lose their ill-gotten gains (in favor of the
| law firm and, sometimes, to an extent, the injured
| parties). That's preferable. Not ideal, but certainly
| better than nothing.
| lovich wrote:
| That implies it's a service I want, which I have not
| decided on in this situation. Either way I was more
| arguing with the other posters claim that it "didn't make
| sense" to critique this move, which I think is factually
| incorrect since I can come up with a few plausible
| situations where it does make sense
| ImPostingOnHN wrote:
| it doesn't imply it's a service you want, but rather, if
| you do want it, you can opt-in to the service by joining
| the lawsuit when the time comes
|
| if you feel the class doesn't represent you, you can just
| not opt-in
| lovich wrote:
| I perhaps wasn't clear, I meant that I am not sure I want
| copilot constrained in this way. If I solidify that
| belief into definitely not wanting copilot constrained,
| then this would be a negative suit for me
| ssteeper wrote:
| Is a startup founder looking for a big payout succumbing to
| greed?
|
| These people are just following incentives.
| lovich wrote:
| People following financial incentives are being greedy,
| this is how we got "greed is good" as a phrase
| MikePlacid wrote:
| But the need is obviously there. Everyone who produces the
| following code in a non-university environment - for a fee!
| - _needs_ to be punished quickly and severely:
|
| _Based on the given prompt, [Codex] produced the following
| response:_
|
|       function isEven(n) {
|         if (n == 0) return true;
|         else if (n == 1) return false;
|         else if (n < 0) return isEven(-n);
|         else return isEven(n - 2);
|       }
|       console.log(isEven(50));  // - true
|       console.log(isEven(75));  // - false
|       console.log(isEven(-1)); // - ??
| glerk wrote:
| Correct. This is no different than patent trolls weaponizing
| the justice system for personal gain. Nothing they claim or do
| is in good faith and they should be treated as bad actors.
| jacobr1 wrote:
| It is a little different. The first patent troll that blazed
| the trail gets both more credit (for ingenuity) and blame
| (for the deleterious impact) in my opinion. I'll give the
| same internet points to these guys.
| vesinisa wrote:
| How come? When people contributed code publicly they attached
| a license how the code may be used. Is training an AI model
| on this allowed? I think there's a fair, important and novel
| legal question to be examined here.
|
| Patent trolls usually file lawsuits that are just unmerited,
| but rely simply on the fact that mounting a defence is more
| expensive than settling.
| henryfjordan wrote:
| It can be and is both what you describe and a necessary feature
| of our adversarial legal system.
|
| Github can't really go to a court by themselves and ask "is
| this legal?". There is the concept declaratory relief but you
| need to be at least threatened with a lawsuit before that's on
| the table.
|
| So Github kinda just has to try releasing CoPilot and get sued
| to find out. The legal system is setup to reward the lawyer who
| will go to bat against them to find out if it is legal. The
| plaintiff (and maybe lawyer, depending on how the case is
| financed) take the risk they are wrong just as Github had to.
|
| It is set up this way to incentivize lawyers to protect
| everyone's rights.
| heavyset_go wrote:
| This is a classic example of the ad hominem fallacy. Stating
| that "they are no angels" doesn't detract from whether they're
| right or capable of effecting positive legal change.
|
| Frankly, I don't care if anyone makes a name for themselves for
| doing this. In fact, I applaud them and would happily give them
| recognition should they be successful.
|
| Similarly, I'd hope that there are opportunities for profit in
| this space, given that I don't want cheap lawyers botching this
| case and setting terrible legal precedent for the rest of us.
| Microsoft has a billion dollar legal team and they will do
| everything they can to protect their bottom line.
| Cort3z wrote:
| I'm not a lawyer, but here is why I believe a class action
| lawsuit is correct;
|
| "AI" is just fancy speak for "complex math program". If I make a
| program that, given an arbitrary input, then, through math
| operations, outputs Microsoft copyrighted code, am I in the clear
| just because it's "AI"? I think they would sue the heck out of me
| if I did that, and I believe the opposite should be true as well.
|
| I'm sure my own open source code is in that thing. I did not see
| any attributions, thus they break the fundamentals of open
| source.
|
| In the spirit of Rick Sanchez; It's just compression with extra
| steps.
| njharman wrote:
| Say you read a bunch of code, say over years of developer
| career. What you write is influenced by all that. Will include
| similar patterns, similar code and identical snippets,
| knowingly or not. How large does a snippet have to be before
| it's copyrighted? "x"? "x==1"? "if x==1\n print('x is one')"?
| [obviously, replace with actual common code like if not found
| return 404].
|
| Do you want to be vulnerable to copyright litigation for code
| you write? Can you afford to respond to every lawsuit filed by
| disgruntled wingbat, large corp wanting to shut down open
| source / competing project?
| rowanG077 wrote:
| The brain is also just a "complex math program". Since math is
| just the language we use to describe the world. I don't feel
| this argument has any weight at all.
| Supermancho wrote:
| > The brain is also just a "complex math program".
|
| This is not a fact.
| rowanG077 wrote:
| Explain yourself. There is not an understood natural
| phenomenon which we could not capture in math. If you argue
| behavior of the brain cannot be modeled using a complex
| math program you are claiming the brain is qualitative
| different then any mechanism known to man since the dawn of
| time.
|
| The physics that gives rise to the brain is pretty much
| known. We can model all the protons, electrons and photons
| incredibly accurately. It's an extraordinary claim you say
| the brain doesn't function according to these known
| mechanisms.
| moralestapia wrote:
| >Explain yourself.
|
| Why? Burden of proof is on you.
| heavyset_go wrote:
| > _We can model all the protons, electrons and photons
| incredibly accurately._
|
| We can't even accurately model a receptor protein on a
| cell or the binding of its ligands, nor can we accurately
| simulate a single neuron.
|
| This is one of those hard problems in computing and
| medicine. It is very much an open question about how or
| if we can model complex biology accurately like that.
| rowanG077 wrote:
| I didn't say we can simulate it. There is a massive leap
| from what I said to being able to simulate it.
| Supermancho wrote:
| > There is not an understood natural phenomenon which we
| could not capture in math.
|
| This is a belief about our ability to construct models,
| not a fact. Models are leaky abstractions, by nature.
| Models using models are exponentially leaky.
|
| > I didn't say we can simulate it.
|
| Mathematics (at large) is descriptive. We describe matter
| mathematically, as it's convenient to make predictions
| with a shared modeling of the world, but the quantum of
| matter is not an equation. f() at any scale of
| complexity, does not transmute.
| CogitoCogito wrote:
| > There is not an understood natural phenomenon which we
| could not capture in math.
|
| Does the brain fall into the category of "understood
| natural phenomenon"? Is it "understood"? What does
| "understood" mean in this context?
| layer8 wrote:
| You are confusing the nondiscrete math of physics with
| the discrete math of computation. Even with unlimited
| computational resources, we can't simulate arbitrary
| physical systems exactly, or even with limited error
| bounds. What a program (mathematical or not) in the
| Turing-machine sense can do is only a tiny, tiny subset
| of what physics can do.
|
| Personally I believe it's likely that the brain can be
| reduced to a computation, but we have no proof of that.
| bqmjjx0kac wrote:
| > There is not an understood natural phenomenon which we
| could not capture in math.
|
| If all you have is a hammer...
|
| The nature of consciousness is an open question. We don't
| know whether the brain is equivalent to a Turing machine.
| lisper wrote:
| Somewhere in the complex math is the origin of whatever it is
| in intellectual property that we deem worthy of protection.
| Because we are humans, we take the complex math done by human
| brains as worthy of protection _by fiat_. When a painter
| paints a tree, we assign the property interest in the
| painting to the human painter, not the tree, notwithstanding
| that the tree made an essential contribution to the content.
| The _whole point_ is to protect the interests of humans (to
| give them an incentive to work). There is no other reason to
| even entertain the _concept_ of "property".
| rowanG077 wrote:
| Creations by AI should obviously be protected by fiat as
| well. Anything else is a ridiculous double standard that
| will stifle progress.
| kadoban wrote:
| The legal world tends to be less interested in these kind of
| logical gotchas than engineering types would like. I don't
| see a judge caring about that brain framing at all.
|
| Not to mention, if your brain starts outputting Microsoft
| copyright code, they're going to sue the shit out of you and
| win, so I'm not sure how that would help even so.
| yoyohello13 wrote:
| So if I read the windows explorer source code, then later
| produced a line for line copy (without referring back to the
| source). Microsoft couldn't sue me?
| bombolo wrote:
| > The brain is also just a "complex math program"
|
| Source?
| rowanG077 wrote:
| The physics that gives rise to the brain is pretty much
| known. We can model all the protons, electrons and photons
| incredibly accurately.
| iampuero wrote:
| I feel like this is a massive oversimplification...
|
| In this answer, you're completely ignoring the massive
| fact that we cannot create a human brain. Having
| mathematical models about particles does not mean we have
| "solved" the brain. Unless you also believe that these
| LLMs are actually behaving just like human brains, in
| that they have consciousness and logic, they dream,
| they have nightmares, they produce emotions such as fear,
| love, and anger, they grow and change over time, and they
| control your body, your lungs, heart, etc...
|
| You see my point, right? Surely you see that the
| statement 'The brain is also just a "complex math
| program"' is at best extremely over-simplistic.
| bqmjjx0kac wrote:
| > The physics that gives rise to the brain is pretty much
| known
|
| There is a gaping chasm between observing known physics,
| and saying it is the _cause_ of consciousness.
|
| You should read this:
| https://en.wikipedia.org/wiki/Philosophy_of_mind
|
| [ Edit: better link:
| https://en.wikipedia.org/wiki/Hard_problem_of_consciousness ]
| fsflover wrote:
| It might be. If your brain generated verbatim someone's code
| without following its license, you would also break
| copyright, wouldn't you?
| kyruzic wrote:
| No it's actually not.
| ugh123 wrote:
| Attributions are fundamental to open source? I thought having
| source openly available was fundamental to open source (and
| allowed use without liability/warranty) as per apache, mit, and
| other licenses.
|
| If they just stick to using permissive-licensed source code
| then i'm not sure what the actual 'harm' is with co-pilot.
|
| If they auto-generate an acknowledgement file for all source
| repos used in co-pilot, and then asked clients of co-pilot to
| ship that file with their product, would that be enough? Call
| it "The Extended Github Co-Pilot Derivative Use License" or
| something.
| heavyset_go wrote:
| Attribution and inclusion of copies of licenses are
| stipulations in almost all of the popular open source
| licenses, including BSD and MIT licenses.
| Cort3z wrote:
| People would likely not share any code if they could not
| trust that their work would be respected, and attributed. So
| yes, I believe it to be fundamental to open source.
| Aeolun wrote:
| Maybe researchers that are used to hunting for publications
| and attributions.
|
| If I'm sharing my code publicly, it's because I want it to
| be _used_.
| TAForObvReasons wrote:
| Attributions are fundamental to permissive licenses as well.
| It's worth reading the licenses in question. MIT:
|
| > The above copyright notice and this permission notice shall
| be included in all copies or substantial portions of the
| Software.
|
| This is the "attribution" requirement that even a Copilot
| trained on only-MIT code would miss.
|
| If it were just about sharing code, there are public domain
| declarations and variants like CC0 licenses
| neongreen wrote:
| Apparently they are using GPL-licensed code as well, see
| https://twitter.com/DocSparse/status/1581461734665367554
|
| After five minutes of googling I'm still not sure if using
| MIT code requires an attribution, but many people claim it
| does, see https://opensource.stackexchange.com/a/8163 as one
| example
| xigoi wrote:
| From GitHub itself (emphasis mine):
|
| > A short and simple permissive license with conditions
| only _requiring preservation of copyright and license
| notices_. Licensed works, modifications, and larger works
| may be distributed under different terms and without source
| code.
| drvortex wrote:
| Your code is not in that thing. That thing has merely read your
| code and adjusted its own generative code.
|
| It is not directly using your code any more than programmers
| are using print statements. A book can be copyrighted, the
| vocabulary of language cannot. A particular program can be
| copyrighted, but snippets of it cannot, especially when they
| are used in a different context.
|
| And that is why this lawsuit is dead on arrival.
| Cort3z wrote:
| Just to be clear; I cannot prove that they have used my code,
| but for the sake of argument, lets assume so.
|
| They would have directly used my code when they trained the
| thing. I see it as an equivalent of creating a zip-file. My
| code is not directly in the zip file either. Only by the act
| of un-zipping does it come back, which requires a sequence of
| math-steps.
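Cort3z's zip-file analogy can be made concrete with a toy sketch (this is only an illustration of the analogy itself, not a claim about how Copilot actually works):

```python
import zlib

# A hypothetical snippet standing in for someone's licensed source code,
# repeated so it compresses well.
source = b"def is_even(n):\n    return n %% 2 == 0\n".replace(b"%%", b"%") * 20

# "Training" here is just compression: the archive holds no readable
# copy of the code, only bytes derived from it by math operations.
model = zlib.compress(source)
assert len(model) < len(source)
assert source not in model  # no literal copy inside the artifact

# Yet a further sequence of math steps (decompression) brings the
# original back, byte for byte.
assert zlib.decompress(model) == source
```

The point of the analogy: the intermediate artifact containing no recognizable copy does not, by itself, settle whether distributing it (plus the un-zipping steps) is distribution of the work.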
| andrewmcwatters wrote:
| This is demonstrably false. It is a system outputting
| character-for-character repository code.[1]
|
| [1]: https://news.ycombinator.com/item?id=33457517
| Aeolun wrote:
| Ok, cool. Presumably that is because it's smart enough to
| know that there is only one (public) solution to the
| constraints you set (like asking it to reproduce licensed
| code).
|
| Now, while you may be able to get it to reproduce one
| function, getting one file, let alone the whole repository,
| seems extremely unlikely.
| naikrovek wrote:
| xigoi wrote:
| Individual words can't be copyrighted.
| adriand wrote:
| If I use Photoshop to create an image that is identical to
| a registered trademark, is the rights violation my fault or
| Adobe's fault?
| xigoi wrote:
| Photoshop can't produce copyrighted images on its own.
| metadat wrote:
| To play devil's advocate: Co-Pilot can't reproduce
| copyrighted work without appropriate user input.
|
| Just trying to demonstrate a point- this analogy seems
| flawed.
| heavyset_go wrote:
| If I draw some eyes in Photoshop, it won't automatically
| draw the Mona Lisa around it for me.
| metadat wrote:
| Until you sprinkle a bit of Stable Diffusion V2 or 3 on
| it..
| kyruzic wrote:
| No because that's not a trademark violation in anyway.
| Using GPL code in a non GPL project is a violation of
| copyright law though.
| pmarreck wrote:
| It can be modified to not do that (example: mutating the
| code to a "synonym" that is functionally but not visually
| identical).
|
| It can also be modified to be opt-in-only (only peoples'
| code that they permit to be learned on, can use the
| product)
| falcolas wrote:
| Perhaps you are right, and it could be so modified.
|
| Could be, but isn't. And that matters.
| lamontcg wrote:
| > but snippets of it cannot
|
| Yeah they can, and the whole functions that Copilot spits out
| are quite obviously covered by copyright.
|
| > especially when they are used in a different context.
|
| That doesn't matter.
| heavyset_go wrote:
| Neural nets can and do encode and compress the information
| they're trained on, and can regurgitate it given the right
| inputs. It is very likely that someone's code is in that
| neural net, encoded/compressed/however you want to look at
| it, which Copilot doesn't have a license to distribute.
|
| You can easily see this happen, the regurgitation of training
| data, in an over fitted neural net.
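The regurgitation effect described here is easy to demonstrate with a deliberately over-fitted toy model (a hypothetical character-level sketch; `train` and `generate` are illustrative names, and this is nothing like Copilot's actual architecture):

```python
from collections import defaultdict

# Toy character-level model: for each context of k characters, record
# which character followed it in the training text. With a tiny corpus
# the model is hopelessly over-fitted -- it has effectively memorized
# its training data.
def train(text, k=8):
    model = defaultdict(list)
    for i in range(len(text) - k):
        model[text[i:i + k]].append(text[i + k])
    return model

def generate(model, prompt, length, k=8):
    out = prompt
    for _ in range(length):
        followers = model.get(out[-k:])
        if not followers:
            break
        out += followers[0]  # greedy: emit the memorized continuation
    return out

corpus = "def is_even(n):\n    return n % 2 == 0\n"
model = train(corpus)

# Given the right input, the "model" regurgitates its training data
# verbatim, even though it stores only context -> next-char statistics.
assert generate(model, corpus[:8], 100) == corpus
```

Real networks are lossier than this lookup table, but the same failure mode (verbatim recall of well-represented training examples) is what the Copilot screenshots show.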
| naikrovek wrote:
| > which Copilot doesn't have a license to distribute
|
| when you upload code to a public repository on github.com,
| you necessarily grant GitHub the right to host that code
| and serve it to other users. the methods used for serving
| are not specified. This is above and beyond the license
| specified by the license you choose for your own code.
|
| you also necessarily grant other GitHub users the right to
| view this code, if the code is in a public repository.
| eropple wrote:
| Host _that_ code. Serve _that_ code to other users. It
| does not grant the right to create _derivative works of
| that code_ outside the purview of the code 's license.
| That would be a non-starter in practice; see every
| repository with GPL code not written by the repository
| creator.
|
| Whether the results of these programs is somehow Not A
| Derivative Work is the question at hand here, not
| "sharing". I think (and I hope) that the answer to that
| question won't go the way the AI folks want it to go; the
| amount of circumlocution needed to excuse that the _not
| actually thinking and perceiving program_ is deriving
| data changes from its copyright-protected inputs is a
| tell that the folks pushing it know it 's silly.
| naikrovek wrote:
| copilot isn't creating derivative works: copilot users
| are.
|
| the human at the keyboard is responsible for what goes
| into the source code being written.
|
| to aid copilot users here, they are creating tools to
| give users more info about the code they are seeing:
| https://github.blog/2022-11-01-preview-referencing-
| public-co...
| heavyset_go wrote:
| It's served under the terms of my licenses when viewed on
| GitHub. Both attribution and licenses are shared.
|
| This is like saying GitHub is free to do whatever they
| want with copyrighted code that's uploaded to their
| servers, even use it for profit while violating its
| licenses. According to this logic, Microsoft can
| distribute software products based on GPL code to users
| without making the source available to them in violation
| of the terms of the GPL. Given that Linux is hosted on
| GitHub, this logic would say that Microsoft is free to
| base their next version of Windows on Linux without
| adhering to the GPL and making their source code
| available to users, which is clearly a violation of the
| GPL. Copilot doing the same is no different.
| CuriouslyC wrote:
| This is not necessarily true, the function space defined by
| the hidden layers might not contain an exact duplicate of
| the original training input for all (or even most) of the
| training inputs. Things that are very well represented in
| the training data probably have a point in the function
| space that is "lossy compression" level close to the
| original training image though, not so much in terms of
| fidelity as in changes to minor details.
| heavyset_go wrote:
| When I say encoded or compressed, I do not mean verbatim
| copies. That can happen, but I wouldn't say it's likely
| for every piece of training data Copilot was trained on.
|
| Pieces of that data are encoded/compressed/transformed,
| and given the right incantation, a neural net can put
| them together to produce a piece of code that is
| substantially the same as the code it was trained on.
| Obviously not for every piece of code it was trained on,
| but there's enough to see this effect in action.
| xtracto wrote:
| Say you publish a song and copyright it. Then I record it and
| save it in a .xz format. It's not an MP3, it is not an audio
| file. Say I _split it_ into N several chunks and I share it
| with N different people. Or with the same people, but I share
| it at N different dates. Say I charge them $10 a month for
| doing that, and I don 't pay you anything.
|
| Am I violating your copyright? Are you entitled to do that?
|
| To make it funnier: Say instead of the .xz, I "compress" it
| via p compression [1]. So what I share with you is a pair of
| p indices and data lengths for each of them, from which you
| can "reconstruct" the audio. Am I illegally violating your
| copyrights by sharing that?
|
| [1] https://github.com/philipl/pifs
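The chunk-splitting thought experiment above can be sketched in a few lines (a toy illustration: no single chunk is an audio file or a recognizable fragment of one, yet anyone holding all of them recovers the work exactly):

```python
import zlib

# Stand-in for a recorded, copyrighted song.
song = b"la la la, my copyrighted melody " * 50

# Re-encode: the blob is not an audio file and contains no
# recognizable fragment of the recording.
blob = zlib.compress(song)

# Split into N interleaved chunks, each meaningless on its own.
n = 5
chunks = [blob[i::n] for i in range(n)]

# Reassembly: interleave the chunks back and decompress.
reassembled = bytearray(len(blob))
for i, chunk in enumerate(chunks):
    reassembled[i::n] = chunk
assert zlib.decompress(bytes(reassembled)) == song
```

The transformation and the splitting are pure math, which is exactly why "it's just math" does not by itself answer the copyright question.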
| 2muchcoffeeman wrote:
| I was thinking of something similar as a counter argument
| and lo and behold, it's a real thing maths has solved with
| a real implementation.
| Aeolun wrote:
| What you are actually giving people is a set of chords that
| happen to show up in your song, the machine can suggest an
| appropriate next chord.
|
| It's also smart enough to rebuild your song from the chords
| _if you ask it to_.
| varajelle wrote:
| I take your code and I compress it in a tar.gz file. I'll
| call that file "the model". Then I ask an algorithm
| (Gzip) to infer some code using "the model". The
| algorithm (gzip) just learned how to code by reading your
| code. It just happened to have it memorized in its model.
| moralestapia wrote:
| Whatever you say man :^)
|
| https://twitter.com/docsparse/status/1581461734665367554
| klabb3 wrote:
| > Your code is not in that thing. That thing has merely read
| your code and adjusted its own generative code.
|
| This is kinda smug, because it overcomplicates things for no
| reason, and only serves as a faux technocentric strawman. It
| just muddies the waters for a sane discussion of the topic,
| which people can participate in without a CS degree.
|
| The AI models of today are very simple to explain: its a
| product built from code (already regulated, produced by the
| implementors) and source data (usually works that are
| protected by copyright and produced by other people). It
| would be a different product if it hadn't used the
| training data.
|
| The fact that some outputs are similar enough to source data
| is circumstantial, and not important other than for small
| snippets. The elephant in the room is the _act of using_
| source data to produce the product, and whether the right to
| decide that lies with the (already copyright protected)
| creator or not. That 's not something to dismiss.
| [deleted]
| NicoleJO wrote:
| You're wrong. See exposed code.
| https://justoutsourcing.blogspot.com/2022/03/gpts-
| plagiarism...
| smoldesu wrote:
| > "AI" is just fancy speak for "complex math program"
|
| Not really? It's less about arithmetic and more about
| inferencing data in higher dimensions than we can understand.
| Comparing it to traditional computation is a trap, same as
| treating it like a human mind. They're very different, under
| the surface.
|
| IMO, if this is a data problem then we should treat it like
| one. Simple fix - find a legal basis for which licenses are
| permissive enough to allow for ML training, and train your
| models on that. The problem here isn't developers crying out in
| fear of being replaced by robots, it's more that the code that
| it _is_ reproducing is not licensed for reproduction (and the
| AI doesn 't know that). People who can prove that proprietary
| code made it into Copilot deserve a settlement. Schlubs like me
| who upload my dotfiles under BSD don't fall under the same
| umbrella, at least the way I see it.
| Cort3z wrote:
| Who decides what constitutes an "AI program" vs just a
| "program"? What heuristic do we look at? At the end of the
| day, they have an equivalent of a .exe which runs, and
| outputs code that has a license attached to it.
| heavyset_go wrote:
| I've been saying AI is computational statistics on steroids
| for a while, and I think that's an apt generalization of what
| ML is.
| 2muchcoffeeman wrote:
| But it all runs on hardware we created and we know exactly
| what operations were implemented in that hardware. How is it
| not just math?
| sigzero wrote:
| > I'm not a lawyer, but
|
| Should have stopped there.
| Cort3z wrote:
| Why?
| sigzero wrote:
| Dang it. I was coming back to delete that comment. It was a
| stupid one.
| operatingthetan wrote:
| This is not a thread for lawyers to discuss only.
| benlivengood wrote:
| Humans are just compression with extra steps by that logic.
|
| There's a fairly simple technical fix for codex/copilot anyway;
| stick a search engine on the back end and index the training
| data and don't output things found in the search engine.
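A minimal sketch of that proposed filter (all names here are hypothetical, and a production system would need fuzzier matching than verbatim k-character windows):

```python
def build_index(training_files, k=30):
    """Index every k-character window of the training corpus."""
    index = set()
    for text in training_files:
        for i in range(max(len(text) - k + 1, 1)):
            index.add(text[i:i + k])
    return index

def filter_output(completion, index, k=30):
    """Suppress a completion if any k-char window matches training data."""
    for i in range(max(len(completion) - k + 1, 1)):
        if completion[i:i + k] in index:
            return None  # verbatim training data: refuse to emit
    return completion

training = ["def solve(A, b):\n    # carefully tuned library internals...\n"]
index = build_index(training)

# A completion that regurgitates training data is blocked...
assert filter_output(training[0], index) is None
# ...while output with no long verbatim overlap passes through.
assert filter_output("x = 1\n", index) == "x = 1\n"
```

This only catches exact copies; trivially renamed variables would slip past it, which is one reason the "just filter it" fix is harder than it sounds.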
| cdrini wrote:
| I haven't heard anyone saying that copilot is legal "just
| because it's AI." That's a pretty bad faith, reductive, and
| disingenuous representation. The core argument I've seen is
| that the output is sufficiently transformative and not straight
| up copying.
| spiralpolitik wrote:
| At this point we are back in the territory that the idea and
| the expression of the idea are inseparable, therefore the
| conclusion will be that copyright protection does not apply to
| code.
|
| Personally I think this has the potential to blow up in
| everyone's faces.
| pevey wrote:
| If it does end up that way, I feel like the trickle away from
| github will become a stampede. And that would be unfortunate.
| Having such a good hub for sharing and learning code is
| useful, but only if licenses are respected. If not, people
| will just hunker down and treat code like the Coke secret
| recipe. That benefits no one.
| VoodooJuJu wrote:
| As celestialcheese says [1], it seems like a manufactured case
| for the purpose of furthering someone's legal career rather than
| seeking redress for any violations made by Copilot.
|
| But I like to put on my conspiracy hat from time to time, and
| right now is one such time, so let's begin...
|
| Though the motivations behind this case are uncertain, what is
| certain is that this case will establish a precedent. As we know,
| precedents are very important for any further rulings on cases of
| a similar nature.
|
| Could it be the case that Microsoft has a hand in this, in trying
| to preempt a precedent that favors Copilot in any further
| litigation against it?
|
| Wouldn't put it past a company like Microsoft.
|
| Just a wild thought I had.
|
| [1] https://news.ycombinator.com/item?id=33457826
| [deleted]
| [deleted]
| 60secs wrote:
| This is why we can't have nice dystopias.
| [deleted]
| fancyfredbot wrote:
| If a software developer learns how to code better by reading GPL
| software and then later uses the skills they developed to build
| closed source for profit software should they be sued?
| Phrodo_00 wrote:
| Depends on how closely they reuse the code. Writing it verbatim
| or nearly? Yes.
| jacooper wrote:
| A human doesn't perfectly reproduce the same code he learned
| from.
| buzzy_hacker wrote:
| Copilot is not a person, it is a piece of software.
| thomastjeffery wrote:
| If a software developer writes a program to remember a million
| lines of GPL code, then uses that dataset to "generate" some of
| that code, then they are essentially violating that license
| with extra steps.
|
| The extra steps aren't enough to exonerate them. It's just a
| convoluted copy operation.
|
| It's just like how a lossy encoding of a song is still - with
| respect to copyright - a copy of that song. The data is totally
| different, and some of the original is missing. It's still a
| derivative work. So is a remix. So is a reperformance.
| protomyth wrote:
| I really feel that Andy Warhol Foundation for the Visual Arts,
| Inc. v. Goldsmith[0] is going to have a big effect on this type
| of thing. They are basically relying on their AI magic to make it
| transformative. I'm starting to think the era of learning from
| material other people own without a license / permission is going
| to end quickly.
|
| 0) https://www.scotusblog.com/case-files/cases/andy-warhol-
| foun...
| sensanaty wrote:
| I personally hope they win, and win big. Anything that ruins
| Micro$oft's day is a boon to mine.
| cothrowaway88 wrote:
| Made a throwaway since I guess this stance is controversial. I
| could not care less about how copilot was made and what kind of
| code it outputs. It's useful and was inevitable.
|
| I'm 1000% on team open source and have had to refer to things
| like tldrlegal.com many times to make sure I get all my software
| licensing puzzle pieces right. Totally get the argument for why
| this litigation exists in the present.
|
| Just saying in general my friends I hope you have an absolutely
| great day. Someone will be wrong on the internet tomorrow, no
| doubt about it. Worry about something productive instead.
|
| This one has the feel of being nothing more than tilting at
| windmills in the long run.
| eurasiantiger wrote:
| Maybe we just need to prompt it to include the proper licenses
| and attributions. /s
| tmtvl wrote:
| Eh, I don't mind Copilot being trained on my code as long as it
| and all projects made using it are licensed under the AGPL.
| karaterobot wrote:
| Does everybody credit the author when using Stack Overflow code?
| I have, but don't always. Not that I'm trying to steal, I just
| don't take the time, especially in personal projects.
|
| This isn't exactly the same thing, but it seems to me that three
| of the biggest differences are:
|
| 1. Stack Overflow code is posted for people to use it (fair
| enough, but they do have a license that requires attribution
| anyway, so that's not an escape)
|
| 2. Scale (true; but is it a fundamental difference?)
|
| 3. People are paying attention in this case. Nobody is scanning
| my old code, or yours, but if they did, would they have a case?
|
| I dunno. I'm more sympathetic to visual artists who have their
| work slurped up to be recapitulated as someone else's work via
| text to image models. Code, especially if it is posted publicly,
| doesn't feel like it needs to be guarded. I'm not saying this is
| _correct_ , just saying that's my reaction, and I wonder why it's
| wrong.
| pmarreck wrote:
| This will fail. Copilot is too good, and only suggests snippets
| or small functions, not entire classes for example.
| naillo wrote:
| I'm kinda sceptical that this goes anywhere, given that they
| basically say it's your responsibility to vet that whatever
| copilot outputs doesn't break any copyright (obviously that
| goes against the promise of it and the PR, but that's the
| small print that gets them out of trouble).
| heavyset_go wrote:
| Saying "it's your responsibility to not breach licenses or
| violate copyright" doesn't absolve your service from breaching
| licenses and violating copyright itself.
| mdaEyebot wrote:
| "It is the customer's responsibility to ensure that they only
| drink the water molecules which come out of their tap, and
| not the lead ones."
| golemotron wrote:
| Yet we all use web browsers that copy copyrighted text from
| buffer to buffer all the time. This doesn't even include all
| of the copying that ISPs perform.
|
| It might be fair to say that the read performed in training
| has the same character since no human is involved.
|
| The real copyright violation would be using a derived work.
| heavyset_go wrote:
| A browser isn't an amalgamation of billions of pieces of
| other works. A browser executes and renders code it's
| served.
|
| Copilot's corpus is quite literally tomes of copyrighted
| work that are encoded and compressed in its neural network,
| from which it launders that work to create similar works.
| Copilot itself, the neural network, is that corpus of
| encoded and compressed information; you can't separate the
| two. Copilot stores and distributes that work without any
| input from rightsholders, and it does it for profit.
|
| A better analogy would be between a browser and a file
| server filled with copyrighted movies whose operator
| charges $10/mo for access. The browser is just a browser in
| this analogy, where the file server is the corpus that
| forms Copilot itself.
| ginsider_oaks wrote:
| The actual copying isn't the problem; it's distribution. If I
| buy access to a PDF, I'm not going to get in trouble for
| duplicating the file unless I send it to someone else.
|
| When someone uploads their copyrighted text to a web page,
| they are distributing it to whoever visits that page. The
| browser is just the medium.
| golemotron wrote:
| Is that the legal standard in copyright cases?
| [deleted]
| shoshoshosho wrote:
| You could argue that it's the individual projects using copilot
| that are violating here, I guess? Like you can use curl or git
| to dump some AGPL code into your commercial closed software but
| no one would (hopefully) blame those tools.
|
| So copilot is fine but anyone using it must abide by the
| collective set of licenses that it used to write code for
| you...?
| BeefWellington wrote:
| If a license requires attribution, and you reproduce the code
| without attribution using your editor plugin, it seems to me
| the infringement lies with the editor plugin.
|
| Note that even licenses like MIT ostensibly require
| attribution.
| dmitrygr wrote:
| So, if I made Napster 2.0 and said that it is your job to
| make sure that you do not download anything copyrighted,
| would that be okay?
| charcircuit wrote:
| Yes that would be okay. It would also be okay to create
| Internet 2.0.
| nicolashahn wrote:
| That's basically the situation for any torrent client
| yamtaddle wrote:
| Well, if the trackers also hosted mixed-up blocks of data
| for all the torrents they tracked and their protection was
| "LOL make sure you don't accidentally download any of these
| tiny data blocks in the correct order to reconstruct the
| copyrighted material they may be parts of _wink_ "
| eurasiantiger wrote:
| Isn't that already how everything on the internet works?
| donatj wrote:
| I think it's arguably how anything works. You can have a
| fork, but if you stab it in someone's eye, that's on you.
| donatj wrote:
| Yep. That's exactly why Bittorrent clients can exist.
| dmalik wrote:
| You mean like every torrent client that currently exists?
| ketralnis wrote:
| I think you're looking for consistency that the legal system
| just doesn't provide. The music industry is more organised
| and litigious than the software industry and that gives them
| power that you and I don't have. If you called it "Napster
| 2.0" specifically you'd probably be prevented from shipping
| by a preliminary injunction. Is that fair or consistent? No.
| But it's the world we live in. Programmers want laws to be
| irrefutable and executable logic but they just aren't.
| [deleted]
| brookst wrote:
| The legal system takes intent into account.
|
| So if you produce Napster 2.0 to be the best music piracy
| tool, and you test it for piracy, and you promote it for
| piracy... you're going to have trouble.
|
| If you produce Napster 2.0 as a general-purpose file sharing
| system, let's call it a torrent client, and you can claim no
| ill intent... you may have trouble but it's a lot more
| defensible in court.
|
| I would find it a big stretch to say Github's intent here is
| to illegally distribute copyrighted code. No judgment on
| whether the class action has any merit, just saying I would
| be very surprised if discovery turns up lots of emails where
| Github execs are saying "this is great, it'll let people
| steal code."
| kube-system wrote:
| > I would find it a big stretch to say Github's intent here
| is to illegally distribute copyrighted code.
|
| Almost everything on GitHub is subject to copyright, except
| for some very old works (maybe something written by Ada
| Lovelace?), and US government works not eligible for
| copyright.
|
| Now, many of the works there are also licensed under
| permissive licenses, but that is only a defense to
| copyright infringement if the terms of those licenses are
| being adequately fulfilled.
| brookst wrote:
| > Almost everything on GitHub is subject to copyright,
|
| Agreed. Like I said, it's about intent. Can anyone say
| with a straight face that copilot is an elaborate scheme
| to profit by duplicating copyrighted work?
|
| I don't think the defense is that it wasn't trained on
| copyrighted data. It obviously was.
|
| I think the defense is that anything, including a person,
| that learns from a large corpus of copyrighted data will
| sometimes produce verbatim snippets that reflect their
| training data.
|
| So when it comes to copyright infringement, are we moving
| the goalposts to where merely learning from copyrighted
| material is already infringement? I'm not sure I want to
| go there.
| jasonlotito wrote:
| Now, IANAL, but iirc, that is all 100% okay and legal. In
| fact, I can even download copyrighted music and movies
| without issue. So, I don't even need to make sure I don't
| download anything under copyright.
|
| The issue isn't downloading copyrighted stuff.
|
| Rather, it's making available and letting others download it.
| That was where you got in trouble.
| heavyset_go wrote:
| Knowingly downloading copyrighted material, say to get it
| for free, still violates the rights of the copyright
| holders. It's just that litigating against members of the
| public is bad PR and not exactly lucrative, especially when
| it's likely that kids downloaded the content.
|
| People used to get busted for buying bootleg VHS and DVDs
| on the street before P2P filesharing was a common thing.
| Then, early on, people were sued for downloading
| copyrighted files before rightsholders decided to take a
| different legal strategy to go after sharers and
| bootleggers.
| heavyset_go wrote:
| This is a bad analogy, because P2P networks exist that are
| legal to operate: Section 230 of the CDA prevents interactive
| computer services from being held responsible for
| user-generated content.
|
| What made Napster illegal is that the company did not create
| their network for fair use of content, but to explicitly
| violate copyright for profit.
|
| Copilot is like Napster in this case, in that both services
| launder copyrighted data and distribute it to users for
| profit.
|
| Copilot is not like other P2P networks that exist to share
| data that is either free to distribute or can be used under
| the fair use doctrine. Copilot takes copyrighted content and
| distributes it to users in violation of licenses; that's its
| explicit purpose.
|
| It's entirely possible to make a Copilot-like product that
| was trained on data that doesn't have restrictive licensing
| in the same way it's entirely possible to create a P2P
| network for sharing files that you have the right to share
| legally.
| stonemetal12 wrote:
| If I remember correctly that only works if you can prove that
| your system has "substantial non-infringing uses".
| foooobaba wrote:
| If github or google indexes source code using a neural net to
| help you find it, given a query, is that also illegal? If you
| think of copilot as something that helps you find code you're
| looking for, is it all that different, and if so, why?
|
| In this case, wouldn't the users of copilot be the ones
| responsible for any copyrighted code they may have accessed using
| copilot?
| leni536 wrote:
| Both services already accept DMCA notices to take content down.
| foooobaba wrote:
| True, that's another good point.
| lbotos wrote:
| The crux of the issue: Is the code that is being generated
| being used in a way that its license allows? That's it. I'm
| confident that this problem would go away if copilot said:
|
| //below output code is MIT licensed (source: github/repo/blah)
|
| And yes, the "users" are responsible, but it's possible that
| copilot could be implicated in a case depending on how its
| access is licensed.
|
| Stable diffusion has this same problem btw, but in visual arts
| "fair use" is even murkier.
|
| For code, if you could use the code and respect the license,
| why wouldn't you? Copilot takes away that opportunity and
| replaces it with "trust us".
| foooobaba wrote:
| This makes sense; it produces chunks, not the whole source,
| whereas a search engine would also give you the license.
| arpowers wrote:
| The proper way to think about these LLMs is as something akin
| to plagiarism.
|
| Seems to me the underlying data should be opt-in from
| creators, and licenses should be developed that take AI into
| consideration.
| thesuperbigfrog wrote:
| How original is the generated code?
|
| Can the generated code be traced back to the code used for
| training and the original copyrights and licenses for that code?
|
| If so, what attribution(s) and license(s) should apply to the
| generated code?
| dmitrygr wrote:
| They demonstrate generated code being _identical_ to some
| training code.
| Swizec wrote:
| How many ways are there to write many of the basic algorithms
| we all use though? Can I copyright "({ item }) =>
| <li>{item.label}</li>"?
|
| Because I sure have seen that exact code written, from
| scratch, in many _many_ places.
|
| I guess my question boils down to _" What is the smallest
| copyrightable unit of code?"_. Because I'm certain suing a
| novelist for copyright infringement on a character that says
| "Hi, how are you?" would be considered absurd.
| googlryas wrote:
| No specific sources to provide, but a lot of analyses were
| written about this question regarding the Google v Oracle
| java API lawsuit.
| avian wrote:
| There were well-known examples of copilot reproducing exact
| code snippets well before this lawsuit (e.g. Quake's fast
| inverse square root function). Microsoft dealt with them by
| simply adding the offending function names to a blocklist.
|
| In other words, if your open source project doesn't have such
| immediately recognizable code and didn't cause a shitstorm on
| Twitter, chances are copilot is still happily spewing out
| your exact code, sans the copyright and license info.
| m00x wrote:
| Just like developers have _never_ copy-pasted code from stack
| overflow or Github :):):)
| ggerganov wrote:
| omnimus wrote:
| Always consider that maybe you don't fully understand what it
| actually does.
| [deleted]
| pvg wrote:
| _Please don 't sneer, including at the rest of the community._
|
| https://news.ycombinator.com/newsguidelines.html
| sirsinsalot wrote:
| That's not really right.
|
| Copilot isn't just "displaying" something. Copilot has mined
| the collective efforts of developers to produce derivative
| works, without permission, redistributing that value without
| giving anything back.
|
| It'd be like suing Adobe because Photoshop comes bundled with
| your holiday photos, without permission, and uses them in a
| "family photos" filter.
|
| Large-scale mining of value and then selling it without due
| credit or reward to those you stole that value from is plain
| theft.
| finneganscat wrote:
| spir wrote:
| The part of GitHub Copilot to which I object is that it's trained
| on private repos. Where does GitHub get off consuming explicitly
| private intellectual property for their own purposes?
| RamblingCTO wrote:
| lol @ "open-source software piracy"
|
| If I'm being honest I'm a bit annoyed at this. What's the problem
| and what's the point of this?
| opine-at-random wrote:
| If you'd ever read even a single one of the licenses to the
| software I'm sure you use everyday, you'd understand. This is
| such an obvious and pathetic strawman.
|
| I notice often on hackernews that people don't seem to
| understand anything about free or open-source software outside
| of the pragmatics of whether they can abuse the work for free.
| bpodgursky wrote:
| Lawyers want $$$$.
| RamblingCTO wrote:
| Yeah I guess so. This website reads like bullshit bingo from
| some weird twitter dude trying to sell you his newest
| product:
|
| "AI needs to be fair & ethical for everyone. If it's not,
| then it can never achieve its vaunted aims of elevating
| humanity. It will just become another way for the privileged
| few to profit from the work of the many."
|
| Blah blah. Can we get back to the hacking on stuff mentality?
| gcmrtc wrote:
| Looks like that lawyer guy is not new to hacking stuff:
| https://matthewbutterick.com/
|
| Not exactly the curriculum of a twitter weirdo.
| RamblingCTO wrote:
| Hah, funny. I've used Pollen before and think I've had
| contact with him a few years ago! The blah blah about AI
| elevating the world is still bs imho. I still disagree
| with his views (https://matthewbutterick.com/chron/this-
| copilot-is-stupid-an...) and this lawsuit.
|
| I wasn't actually talking about him specifically btw when
| saying "this sounds like a crypto bro from twitter". The
| overly enthusiastic AI talk reminded me of that, that's
| what I wanted to say.
| finneganscat wrote:
| albertzeyer wrote:
| I really don't understand how there can be a problem with how
| Copilot works. Any human works in just the same way: a human
| is trained on lots and lots of copyrighted material. Still,
| what a human produces in the end is not automatically a
| derivative work of everything they have seen in their life
| before.
|
| So, why should an AI be treated differently here? I don't
| understand the argument for this.
|
| I actually see quite some danger in this line of thinking,
| that there are different copyright rules for an AI than for a
| human intelligence. Once you allow such an arbitrary
| distinction, AI will get restricted more and more, much more
| than humans are, and that will just arbitrarily limit the
| usefulness of AI and effectively be a net negative for the
| whole of humanity.
|
| I think we must really fight against such an undertaking, and
| better educate people on how Copilot actually works, so that
| no such misunderstanding arises.
___________________________________________________________________
(page generated 2022-11-03 23:01 UTC)