[HN Gopher] AI Data Laundering
       ___________________________________________________________________
        
       AI Data Laundering
        
       Author : marceloabsousa
       Score  : 44 points
       Date   : 2022-10-17 21:26 UTC (1 hours ago)
        
 (HTM) web link (waxy.org)
 (TXT) w3m dump (waxy.org)
        
       | dkural wrote:
       | This reminds me of Uber's Jedi mind trick of waving a smartphone
       | to argue that labor & other laws all of a sudden don't apply to
       | them, to the detriment of the public that'll now shoulder the
       | costs.
        
       | moyix wrote:
       | The Authors Guild v Google decision about Google Books seems
       | relevant:
       | 
       | > In late 2013, after the class action status was challenged, the
       | District Court granted summary judgement in favor of Google,
       | dismissing the lawsuit and affirming the Google Books project met
       | all legal requirements for fair use. The Second Circuit Court of
       | Appeal upheld the District Court's summary judgement in October
       | 2015, ruling Google's "project provides a public service without
       | violating intellectual property law." The U.S. Supreme Court
       | subsequently denied a petition to hear the case.
       | 
       | [...]
       | 
       | > The court's summary of its opinion is:
       | 
       | [...]
       | 
       | > Google's unauthorized digitizing of copyright-protected works,
       | creation of a search functionality, and display of snippets from
       | those works are non-infringing fair uses. The purpose of the
       | copying is highly transformative, the public display of text is
       | limited, and the revelations do not provide a significant market
       | substitute for the protected aspects of the originals. _Google's
       | commercial nature and profit motivation do not justify denial of
       | fair use._
       | 
       | https://en.wikipedia.org/wiki/Authors_Guild,_Inc._v._Google,....
       | 
       | This doesn't touch on the ethics of course - at minimum I think
       | allowing people to exclude themselves or their work from a
       | dataset is necessary.
        
         | echelon wrote:
         | Do we allow artists to withhold their works from the minds of
         | eager, learning children? [1]
         | 
         | Tell me how ML is different than the mind of a toddler ravenous
         | for new information.
         | 
         | For every billion dollar start-up using data at scale, there
         | are tens of thousands more researchers and hobbyists doing the
         | exact same, producing wonderful results and advances.
         | 
         | If we stop this growth dead in its tracks, other countries more
         | willing to look past the IP laws will jump ahead. And if
         | Stability locks away their secret sauce, some new party will
         | come and give away the keys to the kingdom yet again.
         | 
         | You can't block the signal. Except, of course, by legislating
         | against it in some Luddite hope we can prevent the future from
         | happening.
         | 
         | Instead of worrying careers will end, we should look at this as
         | being the end of specialization. No longer do we need to spend
         | 20,000 hours to learn one thing to the exclusion of all others
         | we would like to try. Now we'll be able to clearly articulate
         | ourselves with art, music, poetry. We'll become powerful beings
         | of thought and expression.
         | 
         | Humans aren't the end or the peak of evolution. We should be
         | excited to watch this unfold.
         | 
         | [1] Maybe Disney would like you to pay more for a premium
         | learning plan for your child, but thankfully that's not (yet)
         | possible.
        
           | greysphere wrote:
           | Most machine learning is assigning weights in a chain of
           | matrix multiplications and normalization functions.
           | 
           | There is no known model of toddlers' brains, let alone one
           | based on matrix multiplication and normalization. Developing
           | such a model would be a noteworthy achievement.
           | 
           | Therefore these are different.
        
           | nightski wrote:
           | Well, for one, a toddler isn't making money off the
           | information they are absorbing. If these were models open to
           | the public, that would be one thing. But no, these are
           | proprietary models whose sole purpose is to make money for
           | large corporations.
        
           | 9wzYQbTYsAIc wrote:
           | > Tell me how ML is different than the mind of a toddler
           | ravenous for new information.
           | 
           | If a person published a work that clearly plagiarized or
           | violated a patent, that person would be open to legal action.
           | 
           | I'm all for systemic change, but uses like this may end up
           | having a chilling effect on human-created work.
        
         | 9wzYQbTYsAIc wrote:
         | > _the revelations do not provide a significant market
         | substitute for the protected aspects of the originals_
         | 
         | It does seem like generative AI systems provide a significant
         | market substitute, so this ruling probably wouldn't apply, in
         | court.
         | 
         | edit: see https://news.ycombinator.com/item?id=33194623 for
         | some initial thoughts on how this problem (and others) could be
         | rectified.
         | 
         | For example, with a database of protected works and self-
         | censorship algorithms for generative AI systems,
         | conscientiously objecting creatives could have a mechanism for
         | excluding their works.
        
           | tpmoney wrote:
           | A substitute for what though? Copyright law is only concerned
           | with substituting the work under copyright. That is to say,
           | the consideration is whether the infringing aspects of the
           | secondary work would alter the demand and market for the work
           | being infringed.
           | 
           | In all the talk about AI data laundering there really hasn't
           | been any indication that the AI generated item substitutes
           | for the item it's alleged to infringe on. Substituting for a
           | whole profession and its practitioners doesn't enter into the
           | concerns of copyright law. There might be some argument that
           | it should (to "promote the progress of science and useful
           | arts" as it were), but copyright law to my knowledge hasn't
           | been used to prevent new tech from putting professionals as a
           | whole out of business.
        
             | 9wzYQbTYsAIc wrote:
             | Stock photography seems to be the obvious instance - why
             | bother paying for the labor to make a stock photo, when you
             | can have a generative AI system create the photo for you?
             | 
             | And furthermore, has anyone demonstrated that it is or is
             | not possible to fully, or substantially, recreate any given
             | existing work using the right input prompts?
             | 
             | I'm interested to know more of the legal details, but my
             | understanding of copyright law is such that it preserves
             | the value of intellectual labor.
        
         | authpor wrote:
         | > _I think allowing people to exclude themselves or their work
         | from a dataset is necessary._
         | 
         | or they could open it all up for everybody and stop protecting
         | the rights of dead people (authors dead less than 70 years
         | ago)
         | 
         | then again, that will make the publishers starve... but why
         | pretend publishing corporations need food?
        
           | ad404b8a372f2b9 wrote:
           | This is larger than publishers: this is every artist, film-
           | maker, photographer, every writer, every engineer. Anybody
           | who has ever written or created something and shared it
           | publicly is liable to have their work assimilated and an
           | infinite amount of derivatives produced with no control over
           | how they're used and by whom.
           | 
           | Comment generated with gpt-neox prompt: Comment about AI and
           | data collection and generation and its pitfalls, expressing
           | concern, emphasis on professions, emphasis on automation,
           | written by Stephen King, creative writing, award winning,
           | trending on reddit, trending on hacker news, written by Greg
           | Rutkowski, written by Zola, written by Voltaire, written by
           | authpor, written by moyix.
           | 
           | (Just kidding, it wasn't AI generated but you see my point.)
        
             | mkaic wrote:
             | > anybody who has ever written or created something and
             | shared it publicly is liable to have their work assimilated
             | and an infinite amount of derivatives produced with no
             | control over how they're used and by whom.
             | 
             | This has been the case ever since people started putting
             | their art on the Internet publicly. The only difference is
             | that now it's algorithms creating the derivatives, not
             | people.
        
               | ad404b8a372f2b9 wrote:
               | This is not remotely the same; scale and barrier to entry
               | matter. With stable diffusion I can pick any artist right
               | now and create over 1000 derivative works by tomorrow
               | morning in his style to the same degree of expertise with
               | no training involved and no work required.
        
             | authpor wrote:
             | this is larger than the arts. anybody who has ever
             | participated creatively in our culture understands that it's
             | absolute bullshit to pretend we need money in order to want
             | to contribute artistically.
             | 
             | we need money because food is for sale, because most of us
             | do not own where we live hence we are forced (a priori) to
             | come up with a whole lot of money every month or else
             | you're out in the streets.
        
               | ad404b8a372f2b9 wrote:
               | Sure but unless you bring down capitalism people will
               | still need to work to eat and most will want to use their
               | hard-earned creative skills to make a living.
               | 
               | Not only that but being able to dedicate 8 to 10 hours a
               | day to your craft for 40 years brings it to a level that
               | you can't reach with casual practice.
        
       | killjoywashere wrote:
       | > It's currently unclear if training deep learning models on
       | copyrighted material is a form of infringement
       | 
       | What? It's clearly a derivative work.
        
       ___________________________________________________________________
       (page generated 2022-10-17 23:00 UTC)