[HN Gopher] AI-powered open-source code laundering
___________________________________________________________________
AI-powered open-source code laundering
Author : genkiuncle
Score : 114 points
Date : 2025-10-04 23:26 UTC (23 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| ebcode wrote:
| Not hard to believe. I've been using Claude Code and am hesitant
| to publish publicly because I'm concerned about copyright
| violations. It would be nice if there were a registry (besides
| github) where I could compare "new" code against public
| repositories.
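(A minimal sketch of the kind of comparison such a registry might run, using only Python's standard library. The corpus path and snippets are hypothetical stand-ins; whole-string matching is the simplest possible baseline, not a real detector.)

```python
# Minimal sketch: normalize whitespace, then fuzzy-match a candidate
# file against known public code. Corpus contents are hypothetical.
import difflib


def normalize(code: str) -> str:
    """Collapse all whitespace so reformatting alone can't hide copying."""
    return " ".join(code.split())


def similarity(candidate: str, known: str) -> float:
    """Return a 0.0-1.0 similarity ratio between two code snippets."""
    return difflib.SequenceMatcher(
        None, normalize(candidate), normalize(known)
    ).ratio()


# Hypothetical registry entry and a reformatted copy of it.
corpus = {"example_repo/util.py": "def add(a, b):\n    return a + b\n"}
candidate = "def add(a,  b):\n\n    return a + b\n"

for path, known in corpus.items():
    score = similarity(candidate, known)
    if score > 0.9:
        print(f"{path}: {score:.2f} - likely derived")
```

Production plagiarism detectors typically work on winnowed token fingerprints instead, which survive renaming and statement reordering far better than raw string ratios.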
| adastra22 wrote:
| Why? That's not how copyright works.
| CuriouslyC wrote:
| Sorry to say but this is going to be the new normal, and it's
| going to be quite difficult to stop. Your moat as a creator is
| your personal brand and the community you build around your
| tools.
| o11c wrote:
| I just hope that means we're all allowed to feed leaked source
| code to our own AIs then. This is _mandatory_ if we're to have
| any sort of coherent legal precedent.
| ares623 wrote:
| Game crackers can just claim they generated a completely
| different game using AI that just so happens to look very
| close to another game?
| CuriouslyC wrote:
| They could copy the core game mechanics and have AI launder
| the source and generate new art assets. Proving
| infringement is going to be basically impossible for all
| but the most trivial of cases.
| ares623 wrote:
| The same could be done for movies too I guess. Probably
| easier.
|
| One can set up a site to crowdsource laundering 8-10
| second sections of an entire movie and then stitch it
| back together.
| throwaway290 wrote:
| this is a blatant try to normalize. "Bad people do unethical
| things, I guess we'll have to live with it and shut up" is the
| vibe
|
| the author is doing good. it's not a new normal until everybody
| goes quiet
| pessimizer wrote:
| > this is a blatant try to normalize.
|
| This doesn't mean anything. You have no ability to
| "normalize" anything. It's not an action that somebody can
| take.
|
| > it's not a new normal until everybody goes quiet
|
| Real "let me speak to your manager" energy. Nobody is waiting
| for you to go quiet to get on with things.
| akoboldfrying wrote:
| > You have no ability to "normalize" anything.
|
| Normalisation isn't something that one person by themselves
| can achieve. It only happens when public opinion is swayed.
| How is it swayed? By people deliberately trying to sway it,
| like GP here.
|
| If you are instead arguing that normalisation is not really
| a thing at all: What do you call the change in attitudes to
| people who are left-handed, disabled, or homosexual?
| throwaway290 wrote:
| > You have no ability to "normalize" anything.
|
| You can if you convince everyone to stop making a fuss
| because it's the new normal. The comment literally said
| "it's the new normal".
| CuriouslyC wrote:
| This is a very bad faith comment from a throwaway account.
|
| Recognition of realities is different from wishing for things
| to occur. If you think you can stop unethical people from AI
| washing your software, feel free to try, you will fail.
| throwaway290 wrote:
| Bad faith = trying to normalize bad faith behavior.
|
| > If you think you can stop unethical people from AI
| washing your software, feel free to try, you will fail.
|
| Posts like these = trying to stop unethical people from
| copyright (copyleft) washing. Telling the people writing
| these posts that it's the new normal is basically saying they
| are doing a pointless thing, when they are doing something
| very good.
| CuriouslyC wrote:
| Stop being a coward and have a discussion with me with
| your real identity.
|
| You can whine about something till you're dead, but
| incentives drive actions, full stop. Instead of whining
| about normalization, lobby lawmakers to make actual
| change and build tools to help creators detect the issue.
| userbinator wrote:
| Hopefully the spread of AI will make more people realise that
| everything is a derivative work. If it wasn't an AI, it was a
| human standing on the shoulders of giants.
| CamperBob2 wrote:
| I'll give you the only upvote you'll probably get for that
| sentiment around here. Enjoy your trip to -4 (Dead)!
| hu3 wrote:
| This. AI is a magnificent way to make the entire world's
| codebase available as a giant, cross-platform, standard
| library.
|
| I welcome AI to copy my crap if that's going to help anyone in
| the future.
| alganet wrote:
| You forgot to mention that if things continue as they are, a
| very small group of people will have complete control over
| this giant library.
| hu3 wrote:
| It's a concern. But there are open source models.
| zdwolfe wrote:
| I find it odd that any LLM could be considered open
| source. Sure the weights are available to download and
| use, but you can't reasonably reconstruct the output
| model as it's impractical for an individual to gather a
| useful dataset or spend $5,000,000+ of GPU time training.
| jsight wrote:
| Distillation can extract the knowledge from an existing
| model into a newly trained one. That doesn't solve the
| cost problem, but costs are steadily coming down.
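(For context on the technique mentioned above: in standard Hinton-style knowledge distillation, a student model is trained to match the teacher's temperature-softened output distribution. A toy sketch of that objective in plain Python; the logit values are made up for illustration.)

```python
# Toy sketch of the Hinton-style distillation objective: the student
# is pushed toward the teacher's temperature-softened distribution.
import math


def softmax(logits, t=1.0):
    """Convert logits to probabilities at temperature t."""
    exps = [math.exp(z / t) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distill_loss(teacher_logits, student_logits, t=2.0):
    """KL divergence from the softened teacher to the softened student."""
    p = softmax(teacher_logits, t)
    q = softmax(student_logits, t)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Made-up logits for one token position: the loss is zero when the
# student already matches the teacher, and positive otherwise.
print(distill_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 0.0
print(distill_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]) > 0)  # True
```

In practice this loss is minimized over the student's parameters, usually alongside an ordinary hard-label loss, which is why distillation transfers behavior without needing the teacher's training data.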
| goku12 wrote:
| That's still a crude repurposing of an inscrutable
| artifact. Open source requires you to share the source
| data from which that artifact (the model parameters) was
| created.
| alganet wrote:
| No, there aren't.
|
| There is open source training and inference software. And
| there are open weights.
|
| Those things are not enough to reproduce the training.
|
| Even if you had the hardware, you would not be able to
| recreate llama (for example) because you don't know what
| data went into the training.
|
| That's a very weird library. You can get their summaries,
| but you don't have access to the original works used when
| creating it. Sounds terrible, open source or not.
| vineyardmike wrote:
| Open source model, created at great expense... by a still
| small cohort of people.
|
| There are like a dozen organizations globally creating
| anything close to state of the art models. The fact that
| you can use some for free on your own hardware doesn't
| change that those weights were trained by a small cohort
| of people, with training data selected by those people,
| and fine-tuning and "alignment" created by those people.
|
| Sure you can fine-tune the smaller ones yourself, but
| that still leaves you at the will of the original creator.
| ares623 wrote:
| Are you able to build these models from source?
| beeflet wrote:
| Except closed source software which it isn't trained on.
| smj-edison wrote:
| Yeah, this is where I find the copyright argument a little
| weak. Because how do artisans learn their craft? By observing
| others' work.
|
| Instead, I feel like the objections are (rightly) these two
| issues:
|
| 1. GenAI operates at a much larger scale than an individual
| artist. I don't think artists would have an issue with someone
| commissioning a portrait, say, in the style of Van Gogh
| (copyright argument). They would have an issue if that artist
| painted 100,000 pictures a day in the style of Van Gogh.
|
| 2. Lack of giving back: some of the greatest artists have
| internalized great art from previous generations, and then
| something miraculous happens. An entirely new style emerges.
| They have now given back to the community that incubated them.
| I don't really see this same giving back with GenAI.
|
| Edit: one other thought. Adobe used their own legally created
| art to train their model, and people still complain about it,
| so I don't buy the copyright argument if they're upset about
| Adobe's GenAI.
|
| Edit 2: I'm not condoning blatant copyright infringement like
| is detailed in this post.
| alganet wrote:
| Copyright is a nightmare. It's just that it sounds like a
| gentler nightmare than hyperscaled algorithms controlled by a
| few.
| charcircuit wrote:
| >Lack of giving back
|
| I disagree. There is a ton of AI-generated text, code,
| images, and video available completely free for people to
| learn from.
| chrisldgk wrote:
| Which is just laundered from real material that real humans
| put work in to create, only to be regurgitated by a crass
| homunculus of 1s and 0s for free, without any mention of
| the real work that went into creating that
| information.
|
| I'm not a big fan of the copyright system we have myself,
| but there's a reason it exists. AI companies illegally
| training their AI on copyrighted content to reap the spoils
| of the hard work of other people who never get recognition
| for their work is the opposite of "giving back".
| visarga wrote:
| 1. If I wanted the "style of Van Gogh" I would simply
| download a Van Gogh; why waste time and money on approximate
| AI? But if I want something else, then I can use AI. And Gen
| AI is really the worst infringement tool - for example, would
| anyone try to read a bootleg Harry Potter from an LLM to
| avoid payment? I don't think so.
|
| 2. LLMs will give back what you put in + what they learned,
| it's your job to put in the original parts. But every so
| often this interaction will spark some new ideas. The
| LLM+human team can get where neither of them would get alone,
| building on each other's ideas.
| bluefirebrand wrote:
| > Because how do artisans learn their craft? By observing
| others' work
|
| I don't think that computer systems of any kind should have
| the same right to fair use that humans have
|
| I think humans should get fair use carve outs for fanart and
| derivative work, but AI should not
| add-sub-mul-div wrote:
| Nothing subverts my defense of human creativity more than the
| cliched human defenses of AI.
| monero-xmr wrote:
| For those of us who exceed the AI, it raises our value
| enormously. You see it in the pay of the AI engineers. But in
| the high interest rate world, those of us who continue to be
| employed are commanding higher wages, as far as I can tell.
| It is a culling of the lesser-than.
|
| One unfortunate side-effect is the junior engineers who
| cannot immediately exceed the AI are not being hired as
| often. But this era echoes the dotcom boom, where very low-
| skilled people commanded very-high wages. Universities, which
| have always been white collar job training but pretended they
| weren't, are being impacted greatly.
|
| https://registrar.mit.edu/stats-reports/majors-count
|
| 24% of undergraduate MIT students this year have Computer
| Science in the title (I asked chatgpt to calculate this from
| the difficult-to-parse website). 1/4 of all MIT
| undergraduates are not being trained to be future PhD
| researchers - they, like all other schools, are training the
| vast majority of their students for private sector workforce
| jobs.
|
| The culling is happening all over. We will likely go down to
| < 1000 colleges in America from 4000 now over the next 15
| years.
|
| This is a good thing. The cost of university degrees is far
| too high. We are in the midst of a vast transition. College
| should return to being the purview of the truly intelligent
| and the children of the rich, as it was for all time before
| WW2. This very weird experiment in human history is ending,
| and it cannot happen soon enough
| sciencejerk wrote:
| > _College should return to being the purview of the truly
| intelligent and the children of the rich, as it was for all
| time before WW2._
|
| You're likely correct that we're witnessing a
| reconsolidation of wealth and the extinction of the middle
| class in society, but you seem happy about this? Be careful
| what you wish for...
| monero-xmr wrote:
| Alternatively, _all_ middle class jobs do not require a
| college degree. Perhaps a college degree is primarily a
| signalling mechanism for adherence to a bygone era of
| societal norms. But the price is far too high to justify
| it, and the market will create alternative proof of
| societal norms, at a far cheaper price. Which is
| happening as we debate.
|
| My concern now is a large number of under-employed
| college graduates who are indebted to worthless degrees,
| feeling pinched because the debt far surpasses their
| market value. This has been the case for a long time, but
| has now reached the upper-echelons of academia where even
| Ivy league grads cannot get employment. You need to re-
| calibrate your ire to the correct target
| novemp wrote:
| Yeah, sure, not every job should require a degree, but
| that doesn't justify keeping The Poors from pursuing
| education.
|
| Some of us value education for its own sake, not as a
| prerequisite for employment.
| monero-xmr wrote:
| You are assuming the only avenue to "education" is
| through the university experience
| novemp wrote:
| Some people learn best in structured class settings.
| sciencejerk wrote:
| If AI and other societal shifts eliminate many white-
| collar jobs in developed countries, degree-seekers will
| eventually notice and the demand and perceived value of a
| college education may greatly diminish. I got my degree
| at a time when it was actually useful as a signalling
| mechanism. Now students might not benefit much from a
| college degree and internships might be hard to find.
| This is too bad and grossly unfair.
|
| I hope that new societal avenues are created to help
| young people start their careers, even if those careers
| are in fields like plumbing, nursing and hospitality. I
| also hope efforts are made to help white collar workers
| transition into other (lesser-paying) careers when AI
| really starts to permanently reduce the size of the
| white-collar workforce.
| sciencejerk wrote:
| > _You need to re-calibrate your ire to the correct
| target_
|
| Who do you think is the correct target? Big institutions?
| The college system?
| monero-xmr wrote:
| Definitely the colleges, which charge more year after
| year, burdening the young with debt for worthless pieces
| of paper.
| ares623 wrote:
| They probably think they're one of the "truly intelligent
| and children/parent of the rich" lol
| monero-xmr wrote:
| I would not want the unintelligent and non-rich to go
| into debt to spend 4 years at a university, getting a
| degree in a subject which is absurd
|
| https://www.sps.nyu.edu/explore/degrees-and-programs/bs-
| in-h...
|
| Please, tell me how going $300k in debt for an
| undergraduate degree in Tourism Studies benefits society,
| or the student
| ares623 wrote:
| Sounds like a US problem tbh
| monero-xmr wrote:
| Nearly every European university offers a degree in
| Tourism. The difference is regarding debt. But
| socializing the cost of a _degree in tourism_ does not
| mean the cost isn't borne by society. I believe deep in my
| bones that one can learn the ropes of managing a hotel
| outside the illustrious grounds of a university
| ares623 wrote:
| Tbh it just sounds like you don't value others and
| others' work that isn't a "hard" field.
|
| Sure people can learn hotel management outside of
| university. But outside of nepotism who will trust random
| strangers with no qualifications to get their foot in the
| door?
|
| And you make it sound like socializing the cost of
| improving outcomes for the next generations is a
| negative. What is the point of society if not that? Even
| from a purely selfish perspective, the next generation
| will take care of me when I am too old to do it myself.
| I'd want them to be in a good state by then.
|
| Do you think you got to wherever you are now without some
| part of socialized cost of society getting you there?
| monero-xmr wrote:
| "Tourism studies" isn't a field. It's not an academic
| discipline. Requiring someone to spend years "studying"
| this is completely absurd. The reality is university is
| finishing school, and young adults desire 4 years of
| screwing around while officially getting a degree, and
| society subsidizes it.
|
| You have missed my point entirely. These degrees have no
| value. I would argue they have negative value when
| factoring in their cost in resources and wasted time
| card_zero wrote:
| 35%, ignoring "secondary majors" which may or may not
| coincide with primary majors that also have CS in the
| title.
|
| (Also ignoring the thousand first-years at the end of the
| list.)
|
| The various 0.5 half-student quantities throw some doubt on
| the measurement too.
| teiferer wrote:
| > College should return to being the purview of the truly
| intelligent and the children of the rich, as it was for all
| time before WW2.
|
| Yeah, the world was a better place when it was mostly white
| males having that chance.
|
| /s
| as1mov wrote:
| The offending repository is copying files verbatim while
| stripping the license headers from said files. It's not
| "standing on the shoulders of giants".
| typpilol wrote:
| That doesn't even seem like ai but just direct copy pasting
| lol
| em-bee wrote:
| looks to me like they are using AI to refactor the code,
| not to generate it. even if we allow code to be used to
| train AI to generate new code, copying code and refactoring
| it is something entirely different.
| croes wrote:
| AI makes it easy for some to claim they did the work, so
| others are less likely to do the real work. That means the
| giants won't grow.
| dvfjsdhgfv wrote:
| This is a very intriguing statement because it looks like it
| contains a truism but something is off. Yes, everything is a
| derivative work of some kind, what matters is the amount of
| added value - if it gets close to 0 as in this case, we've got
| bare plagiarism.
|
| [As a side note, the problem with LLMs (sorry, the term "AI"
| became so muddy I prefer not to use it) is that they tend to be
| extremely uncreative and just regress to the mean. So I wouldn't
| expect added value in creativity itself, just help for humans
| with more menial tasks, like what antirez is doing.]
| ugh123 wrote:
| > Please DO NOT TURST ANY WORD THEY SAY. They're very good at
| lingual manipulation.
|
| I don't know if this was an intentional misspelling, but it's
| damn funny
| josfredo wrote:
| It is likely intentional, as the author is battling AI by every
| means possible. However, it comes across as funny and hopeless
| at the same time.
| dvrp wrote:
| This is the new reality. Information in the form of raw entropy
| encoded in weights - it doesn't matter if it's text, image,
| video, or 3D. Assets (or what were formerly known as assets)
| now belong to the big labs, if they're on the internet.
|
| Internet plus AI implies the tragedy of the commons manifested in
| the digital world.
| arthurofbabylon wrote:
| If we step back and examine LLMs more broadly (beyond our
| personal use cases, beyond "economic impact", beyond the
| underlying computer science) what we are largely looking at is an
| emerging means of collaboration. I am not an expert computer
| scientist, and yet I can "collaborate" (I almost feel bad using
| this term) with expert computer scientists when my LLM helps me
| design my particular algorithm. I am not an expert on Indonesian
| surf breaks, yet I tap into an existing knowledge base when I
| query my LLM while planning the trip. I am very naive about a lot
| of things and thankfully there are numerous ways to integrate
| with experts and improve my capacity to engage in whatever I am
| naive about, with LLMs offering the latest ground-breaking method.
|
| This is the most appropriate lens through which to assess AI and
| its impact on open source, intellectual property, and other
| proprietary assets. Alongside this new form of collaboration
| comes a restructuring of power. It's not clear to me how our
| various societies will design this restructuring (so far we are
| collectively doing nearly nothing) but the restructuring of these
| power structures is not a technical process; it is cultural and
| political. Engineers will only offer so much help here.
|
| For the most part, it is up to us to collectively orchestrate the
| new power structure, and I am still seeing very little literature
| on the topic. If anyone has a reading list, please share!
| visarga wrote:
| > what we are largely looking at is an emerging means of
| collaboration.
|
| They surpass open source - they "out-open-source" open source -
| by learning skills everywhere and opening them up for anyone
| who needs them later.
| goku12 wrote:
| It's owned by a few rich corporations and individuals. It
| isn't available to everyone - only to those they choose, who
| are ready to pay. And it isn't open source at all, because
| open source does not mean reuse without obligations (even under
| permissive licenses). And let's not forget that they 'open'
| only FOSS works and individual works. They never expose
| proprietary IP belonging to rich corporations. It isn't an
| emerging method of collaboration - it's another method for
| wealth consolidation.
| 48terry wrote:
| Me copying and pasting your post verbatim to put on my blog
| under my name: "I greatly enjoyed collaborating with
| arthurofbabylon on this piece."
| foxylad wrote:
| This will kill open source. Anything of value will be derived and
| re-derived and re-re-derived by bad players until no one knows
| which package or library to trust.
|
| The fatal flaw of the open internet is that bad players can
| exploit it with impunity. It happened with email, it happened with
| websites, it happened with search, and now it's happening with
| code. Greedy people spoil good things.
| awesome_dude wrote:
| If this was true, why hasn't it happened for the last... 30 or
| 40 years that FOSS code has been published on the internet
| ares623 wrote:
| Last I checked, LLMs didn't exist until a few years ago
| makeitdouble wrote:
| Copyright was the base protection layer. Not in the "I own
| it" sense, but in the "you can't take it and run with it"
| sense.
|
| The current weakening of it opens the door to abuses that we
| don't yet have the proper tools to deal with. Perhaps new ones
| will emerge, but we'll have to see.
| croes wrote:
| Same reason why fake images and videos are now more common.
| Photoshop existed 30 years ago.
|
| Before LLM you needed time and abilities to do it, with AI
| you need less of both.
| trod1234 wrote:
| Until now, people have had the leverage/cost asymmetry in
| their favor where they could easily differentiate and make
| rational choices.
|
| AI has tipped that nuanced balance in a way that is both
| destructive, and unsustainable. Just like any other fraud or
| ponzi.
|
| The cost/loss constraint function now favors the unskilled,
| blind, destructive individual running an LLM who spits on all
| those who act in good faith. Quite twisted.
| billy99k wrote:
| This was always the case with open source. It's not that hard
| to obfuscate code in compiled binaries.
| cientifico wrote:
| The license was MIT until two months ago.
|
| That gives anyone the right to get the source code of that commit
| and do whatever they like with it.
|
| The article does not specify whether the company is still using the
| code AFTER the license change.
|
| The rest of the points are still valid.
| laurex wrote:
| I'm interested in a new kind of license, which I'm calling
| "relational source" - not about money or whether a product is
| commercial, but about whether there's an actual person who wants
| to use the code, with some kind of AGPL-esque mechanism to
| ensure no mindless ingestion. Perhaps this would never work, but
| it also breaks the spirit of everything I love about OSS to have
| AI erasing the contributions of the people who put their time
| into doing the work.
| haebom wrote:
| Wouldn't "Pretending It's Mine" be a better name for the project?
| pjfin123 wrote:
| Is the allegation here that an LLM generated code that was very
| similar to the author's copyright-protected code, or that they
| copied the code and then tried to use AI to hide that fact?
___________________________________________________________________
(page generated 2025-10-05 23:01 UTC)