Post AsqeDAIO924xFwwohU by harry_wood@en.osm.town
(DIR) Post #AsqeD8oDfiLue5pD2O by harry_wood@en.osm.town
2025-03-20T11:41:52Z
0 likes, 0 repeats
Vicious criticism of LLMs from this sysadmin who has to deal with their scrapers: https://drewdevault.com/2025/03/17/2025-03-17-Stop-externalizing-your-costs-on-me.html The LLM scraper problem seems surprising to me. The makers of big new generative AI systems are mostly big-tech firms. Don't they value their reputation, or even the reputation of the AI concept overall, too much to commission these cowboys to do their scraping? But maybe they've already decided that, due to the copyright risk, it's best done at arm's length via shady intermediaries.
(DIR) Post #AsqeD9aQmSuv3cNiBU by russss@chaos.social
2025-03-20T11:45:11Z
0 likes, 0 repeats
@harry_wood I don't think it's the big AI companies that are scraping excessively, it's random people who probably got an LLM to write their scraper bots... To make it more confusing, in some cases they're cloning the user-agents of the major AI bots.
(DIR) Post #AsqeDAIO924xFwwohU by harry_wood@en.osm.town
2025-03-20T11:57:46Z
0 likes, 0 repeats
@russss Are you saying the data isn't even necessarily being used for training LLMs? The problem is just correlated with the rise of LLMs because LLMs are making it a lot easier to write scrapers (and I guess ChatGPT will also happily advise on how to bypass mitigations).
(DIR) Post #AsqeDBImP3zCNS8dwu by russss@chaos.social
2025-03-20T12:02:47Z
1 like, 0 repeats
@harry_wood oh it's probably LLMs or something LLM-adjacent. But I doubt it's the big AI players who are responsible for the excessive scraping.
(DIR) Post #AsqeDGn5zmCtOvMQOe by harry_wood@en.osm.town
2025-03-26T14:09:38Z
2 likes, 1 repeat
More on the LLM scraper problem: https://go-to-hellman.blogspot.com/2025/03/ai-bots-are-destroying-open-access.html by @gluejar: "Thousands of developer hours are being spent on defense against the dark bots and those hours are lost to us forever. We'll never see the wonderful projects and features they would have come up with in that time"