[HN Gopher] AI-powered open-source code laundering
       ___________________________________________________________________
        
       AI-powered open-source code laundering
        
       Author : genkiuncle
       Score  : 114 points
       Date   : 2025-10-04 23:26 UTC (23 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | ebcode wrote:
       | not hard to believe. I've been using claude code and am hesitant
       | to publish publicly because I'm concerned about copyright
       | violations. It would be nice if there were a registry (besides
       | github) where I could compare "new" code against public
       | repositories.
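        | 
        | A rough sketch of the kind of comparison I have in mind: take
        | k-gram fingerprints of normalized source and measure overlap.
        | (The file paths, k, and hash choice below are just placeholders,
        | not any existing registry's API.)
        | 
        |   # compare two source files by k-gram fingerprint overlap
        |   import hashlib, re, sys
        | 
        |   def fingerprints(path, k=7):
        |       text = open(path, encoding="utf-8", errors="ignore").read()
        |       # crude normalization: drop line comments, then tokenize
        |       text = re.sub(r"//.*|#.*", "", text)
        |       tokens = re.findall(r"\w+|\S", text)
        |       grams = (" ".join(tokens[i:i + k])
        |                for i in range(max(0, len(tokens) - k + 1)))
        |       return {hashlib.sha1(g.encode()).hexdigest()[:16]
        |               for g in grams}
        | 
        |   a = fingerprints(sys.argv[1])   # e.g. my "new" file
        |   b = fingerprints(sys.argv[2])   # e.g. a file from a public repo
        |   overlap = len(a & b) / max(1, min(len(a), len(b)))
        |   print(f"fingerprint overlap: {overlap:.1%}")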
        
         | adastra22 wrote:
         | Why? That's not how copyright works.
        
       | CuriouslyC wrote:
       | Sorry to say but this is going to be the new normal, and it's
       | going to be quite difficult to stop. Your moat as a creator is
       | your personal brand and the community you build around your
       | tools.
        
         | o11c wrote:
         | I just hope that means we're all allowed to feed leaked source
          | code to our own AIs then. This is _mandatory_ if we're to have
         | any sort of coherent legal precedent.
        
           | ares623 wrote:
           | Game crackers can just claim they generated a completely
           | different game using AI that just so happens to look very
           | close to another game?
        
             | CuriouslyC wrote:
             | They could copy the core game mechanics and have AI launder
             | the source and generate new art assets. Proving
             | infringement is going to be basically impossible for all
             | but the most trivial of cases.
        
               | ares623 wrote:
               | The same could be done for movies too I guess. Probably
               | easier.
               | 
                | One could set up a site to crowdsource laundering 8-10
                | second sections of an entire movie and then stitch it
                | back together.
        
         | throwaway290 wrote:
         | this is a blatant try to normalize. "Bad people do unethical
         | things, I guess we'll have to live with it and shut up" is the
         | vibe
         | 
          | the author is doing good. it's not a new normal until everybody
          | goes quiet
        
           | pessimizer wrote:
           | > this is a blatant try to normalize.
           | 
           | This doesn't mean anything. You have no ability to
           | "normalize" anything. It's not an action that somebody can
           | take.
           | 
           | > it's not a new normal until everybody goes quiet
           | 
           | Real let me speak to your manager energy. Nobody is waiting
           | for you to go quiet to get on with things.
        
             | akoboldfrying wrote:
             | > You have no ability to "normalize" anything.
             | 
             | Normalisation isn't something that one person by themselves
             | can achieve. It only happens when public opinion is swayed.
             | How is it swayed? By people deliberately trying to sway it,
             | like GP here.
             | 
             | If you are instead arguing that normalisation is not really
             | a thing at all: What do you call the change in attitudes to
             | people who are left-handed, disabled, or homosexual?
        
             | throwaway290 wrote:
             | > You have no ability to "normalize" anything.
             | 
             | You can if you convince everyone to stop making a fuss
             | because it's the new normal. The comment literally said
             | "it's the new normal".
        
           | CuriouslyC wrote:
           | This is a very bad faith comment from a throwaway account.
           | 
           | Recognition of realities is different from wishing for things
           | to occur. If you think you can stop unethical people from AI
           | washing your software, feel free to try, you will fail.
        
             | throwaway290 wrote:
             | Bad faith = trying to normalize bad faith behavior.
             | 
             | > If you think you can stop unethical people from AI
             | washing your software, feel free to try, you will fail.
             | 
             | Posts like these = trying to stop unethical people from
             | copyright (copyleft) washing. Telling people writing these
             | posts that it's the new normal is basically saying they are
              | doing a pointless thing, while they are actually doing
              | something very good.
        
               | CuriouslyC wrote:
               | Stop being a coward and have a discussion with me with
               | your real identity.
               | 
               | You can whine about something till you're dead, but
               | incentives drive actions, full stop. Instead of whining
               | about normalization, lobby lawmakers to make actual
               | change and build tools to help creators detect the issue.
        
       | userbinator wrote:
       | Hopefully the spread of AI will make more people realise that
       | everything is a derivative work. If it wasn't an AI, it was a
       | human standing on the shoulders of giants.
        
         | CamperBob2 wrote:
         | I'll give you the only upvote you'll probably get for that
         | sentiment around here. Enjoy your trip to -4 (Dead)!
        
         | hu3 wrote:
         | This. AI is a magnificent way to make the entire world's
         | codebase available as a giant, cross-platform, standard
         | library.
         | 
         | I welcome AI to copy my crap if that's going to help anyone in
         | the future.
        
           | alganet wrote:
           | You forgot to mention that if things continue as they are, a
           | very small group of people will have complete control over
           | this giant library.
        
             | hu3 wrote:
             | It's a concern. But there are open source models.
        
               | zdwolfe wrote:
               | I find it odd that any LLM could be considered open
                | source. Sure, the weights are available to download and
                | use, but you can't reasonably reconstruct the model
                | yourself: it's impractical for an individual to gather a
                | useful dataset or spend $5,000,000+ on GPU time for
                | training.
        
               | jsight wrote:
               | Distillation can extract the knowledge from an existing
               | model into a newly trained one. That doesn't solve the
               | cost problem, but costs are steadily coming down.
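                | 
                | For a rough picture of how that works, here is a minimal
                | teacher-student sketch (toy models and random data, purely
                | illustrative): the student is trained to match the
                | teacher's softened output distribution.
                | 
                |   import torch, torch.nn as nn, torch.nn.functional as F
                | 
                |   teacher = nn.Sequential(nn.Linear(32, 256), nn.ReLU(),
                |                           nn.Linear(256, 10))
                |   student = nn.Sequential(nn.Linear(32, 64), nn.ReLU(),
                |                           nn.Linear(64, 10))
                |   opt = torch.optim.Adam(student.parameters(), lr=1e-3)
                |   T = 2.0  # temperature softens the teacher's outputs
                | 
                |   for _ in range(100):
                |       x = torch.randn(64, 32)     # stand-in for real data
                |       with torch.no_grad():
                |           t_logits = teacher(x)   # teacher stays frozen
                |       s_logits = student(x)
                |       loss = F.kl_div(F.log_softmax(s_logits / T, dim=-1),
                |                       F.softmax(t_logits / T, dim=-1),
                |                       reduction="batchmean") * T * T
                |       opt.zero_grad(); loss.backward(); opt.step()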
        
               | goku12 wrote:
                | That's still a crude repurposing of an inscrutable
               | artifact. Open source requires you to share the source
               | data from which that artifact (the model parameters) was
               | created.
        
               | alganet wrote:
               | No, there aren't.
               | 
               | There is open source training and inference software. And
               | there are open weights.
               | 
               | Those things are not enough to reproduce the training.
               | 
               | Even if you had the hardware, you would not be able to
               | recreate llama (for example) because you don't know what
               | data went into the training.
               | 
               | That's a very weird library. You can get their summaries,
               | but you don't have access to the original works used when
               | creating it. Sounds terrible, open source or not.
        
               | vineyardmike wrote:
               | Open source model, created at great expense... by a still
               | small cohort of people.
               | 
               | There are like a dozen organizations globally creating
               | anything close to state of the art models. The fact that
               | you can use some for free on your own hardware doesn't
               | change that those weights were trained by a small cohort
               | of people, with training data selected by those people,
               | and fine-tuning and "alignment" created by those people.
               | 
               | Sure you can fine-tune the smaller ones yourself, but
                | that still leaves you at the will of the original creator.
        
               | ares623 wrote:
               | Are you able to build these models from source?
        
           | beeflet wrote:
           | Except closed source software which it isn't trained on.
        
         | smj-edison wrote:
         | Yeah, this is where I find the copyright argument a little
         | weak. Because how do artisans learn their craft? By observing
         | others' work.
         | 
         | Instead, I feel like the objections are (rightly) these two
         | issues:
         | 
         | 1. GenAI operates at a much larger scale than an individual
         | artist. I don't think artists would have an issue with someone
          | commissioning a portrait, say, in the style of Van Gogh (copyright
         | argument). They would have an issue if that artist painted
         | 100,000 pictures a day in the style of Van Gogh.
         | 
         | 2. Lack of giving back: some of the greatest artists have
         | internalized great art from previous generations, and then
         | something miraculous happens. An entirely new style emerges.
         | They have now given back to the community that incubated them.
         | I don't really see this same giving back with GenAI.
         | 
         | Edit: one other thought. Adobe used their own legally created
         | art to train their model, and people still complain about it,
         | so I don't buy the copyright argument if they're upset about
         | Adobe's GenAI.
         | 
         | Edit 2: I'm not condoning blatant copyright infringement like
         | is detailed in this post.
        
           | alganet wrote:
           | Copyright is a nightmare. It's just that it sounds like a
           | gentler nightmare than hyperscaled algorithms controlled by a
           | few.
        
           | charcircuit wrote:
           | >Lack of giving back
           | 
            | I disagree. There is a ton of AI-generated text, code,
            | images, and video available completely free for people to
           | learn from.
        
             | chrisldgk wrote:
             | Which is just laundered from real material that real humans
              | put work in to create, only to be regurgitated by a crass
              | homunculus of 1s and 0s for free without any mention of
             | the real work that has been put into creating that
             | information.
             | 
             | I'm not a big fan of the copyright system we have myself,
             | but there's a reason it exists. AI companies illegally
             | training their AI on copyrighted content to reap the spoils
             | of the hard work of other people that never get recognition
              | for their work is the opposite of "giving back".
        
           | visarga wrote:
           | 1. If I wanted the "style of Van Gogh" I would simply
           | download Van Gogh, why waste time and money on approximative
            | AI. But if I want something else, then I can use AI. But Gen
            | AI is really the worst infringement tool; for example, would
            | anyone try to read bootleg Harry Potter from an LLM to avoid
           | payment? Don't think so.
           | 
           | 2. LLMs will give back what you put in + what they learned,
           | it's your job to put in the original parts. But every so
           | often this interaction will spark some new ideas. The
           | LLM+human team can get where neither of them would get alone,
           | building on each other's ideas.
        
           | bluefirebrand wrote:
           | > Because how do artisans learn their craft? By observing
           | others' work
           | 
           | I don't think that computer systems of any kind should have
           | the same right to fair use that humans have
           | 
           | I think humans should get fair use carve outs for fanart and
           | derivative work, but AI should not
        
         | add-sub-mul-div wrote:
         | Nothing subverts my defense of human creativity more than the
         | cliched human defenses of AI.
        
           | monero-xmr wrote:
           | For those of us who exceed the AI, it raises our value
           | enormously. You see it in the pay of the AI engineers. But in
           | the high interest rate world, those of us who continue to be
            | employed are commanding higher wages, as far as I can tell.
           | It is a culling of the lesser-than.
           | 
           | One unfortunate side-effect is the junior engineers who
           | cannot immediately exceed the AI are not being hired as
            | often. But this era echoes the dotcom boom, where very low-
            | skilled people commanded very high wages. Universities, which
           | have always been white collar job training but pretended they
           | weren't, are being impacted greatly.
           | 
           | https://registrar.mit.edu/stats-reports/majors-count
           | 
            | 24% of undergraduate MIT students this year have Computer
            | Science in their major's title (I asked chatgpt to calculate
            | this from the difficult-to-parse website). That quarter of
            | all MIT undergraduates is not being trained to be future PhD
            | researchers - MIT, like all other schools, is training the
            | vast majority of its students for private sector workforce
            | jobs.
           | 
           | The culling is happening all over. We will likely go down to
           | < 1000 colleges in America from 4000 now over the next 15
           | years.
           | 
           | This is a good thing. The cost of university degrees is far
           | too high. We are in the midst of a vast transition. College
           | should return to being the purview of the truly intelligent
           | and the children of the rich, as it was for all time before
           | WW2. This very weird experiment in human history is ending,
           | and it cannot happen soon enough
        
             | sciencejerk wrote:
             | > _College should return to being the purview of the truly
             | intelligent and the children of the rich, as it was for all
             | time before WW2._
             | 
             | You're likely correct that we're witnessing a
             | reconsolidation of wealth and the extinction of the middle
             | class in society, but you seem happy about this? Be careful
             | what you wish for...
        
               | monero-xmr wrote:
               | Alternatively, _all_ middle class jobs do not require a
               | college degree. Perhaps a college degree is primarily a
               | signalling mechanism for adherence to a bygone era of
               | societal norms. But the price is far too high to justify
               | it, and the market will create alternative proof of
               | societal norms, at a far cheaper price. Which is
               | happening as we debate.
               | 
               | My concern now is a large number of under-employed
                | college graduates who are in debt for worthless degrees,
                | feeling pinched because the debt far surpasses their
                | market value. This has been the case for a long time, but
                | has now reached the upper echelons of academia where even
                | Ivy League grads cannot get employment. You need to re-
               | calibrate your ire to the correct target
        
               | novemp wrote:
               | Yeah, sure, not every job should require a degree, but
               | that doesn't justify keeping The Poors from pursuing
               | education.
               | 
               | Some of us value education for its own sake, not as a
               | prerequisite for employment.
        
               | monero-xmr wrote:
               | You are assuming the only avenue to "education" is
               | through the university experience
        
               | novemp wrote:
               | Some people learn best in structured class settings.
        
               | sciencejerk wrote:
               | If AI and other societal shifts eliminate many white-
               | collar jobs in developed countries, degree-seekers will
               | eventually notice and the demand and perceived value of a
               | college education may greatly diminish. I got my degree
               | at a time when it was actually useful as a signalling
               | mechanism. Now students might not benefit much from a
               | college degree and internships might be hard to find.
               | This is too bad and grossly unfair.
               | 
               | I hope that new societal avenues are created to help
               | young people start their careers, even if those careers
               | are in fields like plumbing, nursing and hospitality. I
               | also hope efforts are made to help white collar workers
               | transition into other (lesser-paying) careers when AI
               | really starts to permanently reduce the size of the
               | white-collar workforce.
        
               | sciencejerk wrote:
               | > _You need to re-calibrate your ire to the correct
               | target_
               | 
               | Who do you think is the correct target? Big institutions?
               | The college system?
        
               | monero-xmr wrote:
               | Definitely the colleges, which charge more year after
               | year, burdening the young with debt for worthless pieces
               | of paper.
        
               | ares623 wrote:
               | They probably think they're one of the "truly intelligent
               | and children/parent of the rich" lol
        
               | monero-xmr wrote:
               | I would not want the unintelligent and non-rich to go
               | into debt to spend 4 years at a university, getting a
               | degree in a subject which is absurd
               | 
               | https://www.sps.nyu.edu/explore/degrees-and-programs/bs-
               | in-h...
               | 
               | Please, tell me how going $300k in debt for an
               | undergraduate degree in Tourism Studies benefits society,
               | or the student
        
               | ares623 wrote:
               | Sounds like a US problem tbh
        
               | monero-xmr wrote:
               | Nearly every European university offers a degree in
               | Tourism. The difference is regarding debt. But
               | socializing the cost of a _degree in tourism_ does not
                | mean the cost isn't borne by society. I believe deep in my
               | bones that one can learn the ropes of managing a hotel
               | outside the illustrious grounds of a university
        
               | ares623 wrote:
               | Tbh it just sounds like you don't value others and
               | others' work that isn't a "hard" field.
               | 
               | Sure people can learn hotel management outside of
               | university. But outside of nepotism who will trust random
               | strangers with no qualifications to get their foot in the
               | door?
               | 
                | And you make it sound like socializing the cost of
                | improving outcomes for the next generations is a
                | negative. What is the point of society if not that? Even
                | from a purely selfish perspective, the next generation
                | will take care of me when I am too old to do it myself.
                | I'd want them to be in a good state by then.
               | 
               | Do you think you got to wherever you are now without some
               | part of socialized cost of society getting you there?
        
               | monero-xmr wrote:
               | "Tourism studies" isn't a field. It's not an academic
               | discipline. Requiring someone to spend years "studying"
                | this is completely absurd. The reality is that university
                | is a finishing school, and young adults desire 4 years of
                | screwing around while officially getting a degree, and
                | society subsidizes it.
               | 
               | You have missed my point entirely. These degrees have no
               | value. I would argue they have negative value when
               | factoring in their cost in resources and wasted time
        
             | card_zero wrote:
             | 35%, ignoring "secondary majors" which may or may not
             | coincide with primary majors that also have CS in the
             | title.
             | 
             | (Also ignoring the thousand first years at the end of the
             | list.)
             | 
             | The various 0.5 half-student quantities throw some doubt on
             | the measurement too.
        
             | teiferer wrote:
             | > College should return to being the purview of the truly
             | intelligent and the children of the rich, as it was for all
             | time before WW2.
             | 
             | Yeah, the world was a better place when it was mostly white
             | males having that chance.
             | 
             | /s
        
         | as1mov wrote:
          | The offending repository is copying files verbatim while
          | stripping the license header from said files. That's not
          | "standing on the shoulders of giants".
        
           | typpilol wrote:
           | That doesn't even seem like ai but just direct copy pasting
           | lol
        
             | em-bee wrote:
             | looks to me like they are using AI to refactor the code,
             | not to generate it. even if we allow code to be used to
             | train AI to generate new code, copying code and refactoring
             | it is something entirely different.
        
         | croes wrote:
         | AI makes it easy for others to claim they did the work so
         | others are less likely to do the real work. Means the giants
         | won't grow.
        
         | dvfjsdhgfv wrote:
         | This is a very intriguing statement because it looks like it
         | contains a truism but something is off. Yes, everything is a
         | derivative work of some kind, what matters is the amount of
         | added value - if it gets close to 0 as in this case, we've got
         | bare plagiarism.
         | 
         | [As a side note, the problem with LLMs (sorry, the term "AI"
         | became so muddy I prefer not to use it) is that they tend to be
          | extremely uncreative and just average toward the mean. So I
          | wouldn't expect added value in creativity itself, just help for
          | humans with more menial tasks, like what antirez is doing.]
        
       | ugh123 wrote:
       | > Please DO NOT TURST ANY WORD THEY SAY. They're very good at
       | lingual manipulation.
       | 
        | I don't know if this was an intentional misspelling or not but
        | it's damn funny
        
         | josfredo wrote:
          | It is likely intentional, as the author is battling AI by every
          | means possible. However, it comes across as funny and hopeless at
         | the same time.
        
       | dvrp wrote:
       | This is the new reality. Information in the form of raw entropy
       | encoded in weights--it doesn't matter if it's text, image, video,
        | or 3D. Assets (or what were formerly known as assets) now belong
        | to the big labs, if they're on the internet.
       | 
       | Internet plus AI implies the tragedy of the commons manifested in
       | the digital world.
        
       | arthurofbabylon wrote:
       | If we step back and examine LLMs more broadly (beyond our
       | personal use cases, beyond "economic impact", beyond the
       | underlying computer science) what we are largely looking at is an
       | emerging means of collaboration. I am not an expert computer
       | scientist, and yet I can "collaborate" (I almost feel bad using
       | this term) with expert computer scientists when my LLM helps me
       | design my particular algorithm. I am not an expert on Indonesian
       | surf breaks, yet I tap into an existing knowledge base when I
       | query my LLM while planning the trip. I am very naive about a lot
       | of things and thankfully there are numerous ways to integrate
       | with experts and improve my capacity to engage in whatever I am
       | naive about, LLMs offering the latest ground-breaking method.
       | 
       | This is the most appropriate lens through which to assess AI and
       | its impact on open source, intellectual property, and other
       | proprietary assets. Alongside this new form of collaboration
       | comes a restructuring of power. It's not clear to me how our
       | various societies will design this restructuring (so far we are
       | collectively doing nearly nothing) but the restructuring of these
       | power structures is not a technical process; it is cultural and
       | political. Engineers will only offer so much help here.
       | 
       | For the most part, it is up to us to collectively orchestrate the
       | new power structure, and I am still seeing very little literature
       | on the topic. If anyone has a reading list, please share!
        
         | visarga wrote:
         | > what we are largely looking at is an emerging means of
         | collaboration.
         | 
          | They surpass open source - they "out-open-source" open source -
          | by learning skills everywhere and opening them up for anyone who
         | needs them later.
        
           | goku12 wrote:
            | It's owned by a few rich corporations and individuals. It
            | isn't available to everyone - only to those they choose and
            | who are ready to pay them. And it isn't open source at all,
            | because open source does not mean reuse free of any
            | obligations (even under permissive licenses). And let's not
            | forget that they 'open' only FOSS works and individual
            | works. They never expose
           | proprietary IP belonging to rich corporations. It isn't an
           | emerging method of collaboration - it's another method for
           | wealth consolidation.
        
         | 48terry wrote:
         | Me copying and pasting your post verbatim to put on my blog
         | under my name: "I greatly enjoyed collaborating with
         | arthurofbabylon on this piece."
        
       | foxylad wrote:
       | This will kill open source. Anything of value will be derived and
       | re-derived and re-re-derived by bad players until no-one knows
       | which package or library to trust.
       | 
       | The fatal flaw of the open internet is that bad players can
        | exploit it with impunity. It happened with email, it happened with
       | websites, it happened with search, and now it's happening with
       | code. Greedy people spoil good things.
        
         | awesome_dude wrote:
          | If this were true, why hasn't it happened in the last... 30 or
          | 40 years that FOSS code has been published on the internet?
        
           | ares623 wrote:
            | Last I checked, LLMs didn't exist until only a few years ago
        
           | makeitdouble wrote:
           | Copyright was the base protection layer. Not in the "I own
           | it" sense, but in the "you can't take it and run with it"
           | sense.
           | 
            | With its current weakening, the door is open to abuses that
            | we don't have the proper tools to deal with yet. Perhaps new
            | tools will emerge, but we'll have to see.
        
           | croes wrote:
            | Same reason why fake images and videos are now more common.
           | Photoshop existed 30 years ago.
           | 
            | Before LLMs you needed time and skill to do it; with AI you
            | need less of both.
        
           | trod1234 wrote:
           | Until now, people have had the leverage/cost asymmetry in
           | their favor where they could easily differentiate and make
           | rational choices.
           | 
           | AI has tipped that nuanced balance in a way that is both
            | destructive and unsustainable, just like any other fraud or
            | Ponzi scheme.
           | 
            | The cost/loss constraint now favors the unskilled, blind,
            | destructive individual running an LLM who spits on all those
            | who act in good faith. Quite twisted.
        
         | billy99k wrote:
         | This was always the case with open source. It's not that hard
         | to obfuscate code in compiled binaries.
        
       | cientifico wrote:
       | The license was MIT until two months ago.
       | 
       | That gives anyone the right to get the source code of that commit
       | and do whatever.
       | 
        | The article does not specify whether the company is still using
        | the code AFTER the license change.
       | 
       | The rest of the points are still valid.
        
       | laurex wrote:
        | I'm interested in a new kind of license, which I'm calling
        | "relational source" - not about money or whether a product is
        | commercial, but about whether there's an actual person who wants
        | to use the code, with some kind of AGPL-esque mechanism to ensure
        | no mindless ingestion. Perhaps this would never work, but it
        | breaks the spirit of everything I love about OSS to have AI
        | erasing the contributions of the people who put their time into
        | doing the work.
        
       | haebom wrote:
       | Wouldn't "Pretending It's Mine" be a better name for the project?
        
       | pjfin123 wrote:
        | Is the allegation here that an LLM generated code that was very
        | similar to the author's copyright-protected code, or that they
       | copied the code and then tried to use AI to hide that fact?
        
       ___________________________________________________________________
       (page generated 2025-10-05 23:01 UTC)