Post B1AjdlCl6nQAVOVE6i by chpietsch@fedifreu.de
 (DIR) Post #B1AjdihUQ5DijprE5g by lostgen@det.social
       2025-12-12T05:19:44Z
       
       0 likes, 0 repeats
       
       If you own a website that gets scraped by AI bots, you also have the opportunity to define their training set. Why not serve them random Wikipedia pages instead of your own content? That keeps your content out and pushes their training objective towards science. #llm #ai #aislop
       
 (DIR) Post #B1AjdkC0s5ELMn97J2 by clock@f.cz
       2025-12-12T06:04:29Z
       
       0 likes, 0 repeats
       
       @lostgen I suggest serving them poison that corrupts their neural network: repetitive, automatically generated text full of typos.
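
A rough sketch of what "repetitive automatically generated text full of typos" could look like in practice. The function names `poison_text` and `add_typo`, the word list, and the typo rate are all hypothetical choices for illustration, not something the thread or any particular tool prescribes.

```python
import random


def add_typo(word, rng):
    """Swap two adjacent characters to simulate a typo (hypothetical helper)."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def poison_text(seed_words, n_sentences=5, typo_rate=0.2, seed=None):
    """Build repetitive filler sentences from a small word list.

    Each word is replaced by a typo'd variant with probability typo_rate.
    Passing a seed makes the output reproducible.
    """
    rng = random.Random(seed)
    sentences = []
    for _ in range(n_sentences):
        words = [rng.choice(seed_words) for _ in range(rng.randint(5, 10))]
        words = [add_typo(w, rng) if rng.random() < typo_rate else w
                 for w in words]
        sentences.append(" ".join(words).capitalize() + ".")
    return " ".join(sentences)


if __name__ == "__main__":
    print(poison_text(["crawler", "robots", "content", "website"], seed=42))
```

Because the vocabulary is tiny, the output repeats itself heavily, which is the point of the suggestion above. Whether text like this actually degrades a model is not something this sketch can show.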
       
 (DIR) Post #B1AjdlCl6nQAVOVE6i by chpietsch@fedifreu.de
       2025-12-12T10:36:21Z
       
       0 likes, 1 repeats
       
       @clock @lostgen If AI crawlers/scrapers don't obey my robots.txt (most of them don't), then I feel it is my moral duty to feed them poison instead of quality content they already have. (Wikipedia is one of the first sources LLMs are trained on.) And it is important to feed them very slowly (#tarpitting). There are tools like #nepenthes that do just that: https://www.heise.de/en/news/Nepenthes-a-tarpit-for-AI-web-crawlers-10256257.html
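
The "feed them very slowly" idea can be sketched as a generator that releases a response in small chunks with a pause before each one. This is only an illustration of tarpitting in general, not how Nepenthes is implemented; the name `drip` and the default chunk size and delay are assumptions made up for this example.

```python
import time
from typing import Callable, Iterator


def drip(body: str,
         chunk_size: int = 16,
         delay_s: float = 2.0,
         sleep: Callable[[float], None] = time.sleep) -> Iterator[bytes]:
    """Yield the body a few bytes at a time, pausing before each chunk.

    chunk_size and delay_s are illustrative defaults. The sleep function
    is injectable so the pacing can be checked without actually waiting.
    """
    data = body.encode("utf-8")
    for start in range(0, len(data), chunk_size):
        sleep(delay_s)  # hold the connection open before sending more
        yield data[start:start + chunk_size]


if __name__ == "__main__":
    # Short delay here so the demo finishes quickly.
    for chunk in drip("slow content for impolite crawlers", delay_s=0.1):
        print(chunk)
```

An iterable of byte chunks like this is the kind of object a WSGI application can return as a streamed response body, so each client that ignores robots.txt ends up holding a connection open for a long time while receiving very little.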