Post ASbsFlcD9ZYH7PEUT2 by Structure7780@mat.random101.net
 (DIR) Post #ASbrnK6hpno5h9vw5w by lauren@mastodon.laurenweinstein.org
       2023-02-12T17:08:24Z
       
       0 likes, 0 repeats
       
       I am beginning work on a broad proposal to provide content creators, websites, and associated creator stakeholders with methodologies for controlling whether and how generative AI systems should be permitted to use these entities' content/data for training and other purposes. This will likely involve a mix of technical and legislative aspects.
       
 (DIR) Post #ASbsFlcD9ZYH7PEUT2 by Structure7780@mat.random101.net
       2023-02-12T17:13:27Z
       
       0 likes, 0 repeats
       
       @lauren I hope you are successful with this. If we ignore this humongous issue, it's going to hurt everyone, from content creators to those building the #AI systems. But if we come up with a way of getting permission to train these #AI systems, that would be great for everyone.
       
 (DIR) Post #ASbsJTpdDKwt7dHSJk by Structure7780@mat.random101.net
       2023-02-12T17:14:06Z
       
       0 likes, 0 repeats
       
       @lauren Also, I forgot: is there any way I can help with this project?
       
 (DIR) Post #ASbt2rP0QTgFFwSvUu by jarocats@mastodon.lol
       2023-02-12T17:22:18Z
       
       0 likes, 0 repeats
       
       @lauren Count me in, without question. I'm deep into a project on just this issue. I'm all ears!
       
 (DIR) Post #ASbtxLYJW1x8yvkoCm by CarlG314@universeodon.com
       2023-02-12T17:32:20Z
       
       0 likes, 0 repeats
       
       @lauren I'm not a technical guy, so forgive what may be a novice question, but what technical means can allow a website to deliver readable text but prevent a determined AI from scraping its content?
       
 (DIR) Post #ASbuCaf3cjqkBVB2Po by lauren@mastodon.laurenweinstein.org
       2023-02-12T17:35:17Z
       
       0 likes, 0 repeats
       
       @CarlG314 As I said, a mix of technical and legislative measures. For example, the Robots Exclusion Protocol provides the means to specify which content can be spidered/indexed by search engines. While not all entities abide by this, the majors generally do. If backed up by force of law, such mechanisms can become even more effective. Again, that's just an example.
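
       As a rough illustration of how such an exclusion rule and a compliant check might look (a minimal sketch in Python; the crawler token "ExampleAIBot" is hypothetical, not a real user agent):

           from urllib.robotparser import RobotFileParser

           # Sample robots.txt rules: block a hypothetical AI training
           # crawler ("ExampleAIBot"), allow everyone else.
           sample_rules = [
               "User-agent: ExampleAIBot",
               "Disallow: /",
               "",
               "User-agent: *",
               "Allow: /",
           ]

           rp = RobotFileParser()
           rp.parse(sample_rules)

           # A compliant AI crawler would skip the page; other agents may fetch it.
           print(rp.can_fetch("ExampleAIBot", "https://example.com/post.html"))   # False
           print(rp.can_fetch("SomeSearchBot", "https://example.com/post.html"))  # True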
       
 (DIR) Post #ASbulcKxNwKrW8TedM by CarlG314@universeodon.com
       2023-02-12T17:41:30Z
       
       0 likes, 0 repeats
       
       @lauren As the saying goes, those measures would keep an honest thief out. While reputable tech firms would be bound by them, the same entities that ignore the law and wage a daily battle against email spam detection systems would not feel that constrained by either legislation or exclusion protocols. Maybe that would be enough in practice, though, as long as people continue to use Google (or Bing?) for their search or AI needs, and don't sign on with a shadowy Russian hacker's site.
       
 (DIR) Post #ASbuqX84AwFYoTnAO0 by lauren@mastodon.laurenweinstein.org
       2023-02-12T17:42:31Z
       
       0 likes, 0 repeats
       
       @CarlG314 If the majors comply, the vast bulk of interactions are covered.
       
 (DIR) Post #ASc04qfGq52PE6y6C0 by dinosaur@seo.chat
       2023-02-12T18:41:07Z
       
       0 likes, 0 repeats
       
       @lauren CC BY-SA? :)
       
 (DIR) Post #ASc06z0QnUOdfxsrVA by lauren@mastodon.laurenweinstein.org
       2023-02-12T18:41:38Z
       
       0 likes, 0 repeats
       
       @dinosaur ?
       
 (DIR) Post #ASc0UeYWTwKfCW7IoK by dinosaur@seo.chat
       2023-02-12T18:45:47Z
       
       0 likes, 0 repeats
       
       @lauren If you want someone to behave nicely, set the rules (in server headers and meta tags): https://creativecommons.org/licenses/by-sa/4.0/
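
       For concreteness (illustrative snippets, not a prescription), the license can be declared in the page's head as a link element:

           <link rel="license" href="https://creativecommons.org/licenses/by-sa/4.0/">

       or sent with the response as an HTTP Link header:

           Link: <https://creativecommons.org/licenses/by-sa/4.0/>; rel="license"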
       
 (DIR) Post #ASc0g7IpwRFTXlFJZY by lauren@mastodon.laurenweinstein.org
       2023-02-12T18:47:53Z
       
       0 likes, 0 repeats
       
       @dinosaur It's gonna take a lot more than that, though; as I've noted, starting with robots.txt makes sense. But even that is far from trivial.
       
 (DIR) Post #ASc1hJdXfZom2Z4Ang by dinosaur@seo.chat
       2023-02-12T18:59:17Z
       
       0 likes, 0 repeats
       
       @lauren I would prefer using an HTTP Link header on every page; that makes sure the machine received it at crawl time. https://www.otsukare.info/2011/07/12/using-http-link-header-for-cc-licenses
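
       A crawler-side sketch of that idea in Python (minimal and deliberately naive about header parsing; it assumes the site sends a Link header such as the one shown above):

           from urllib.request import urlopen

           def page_license(url):
               # Check the HTTP Link header at fetch time for a rel="license"
               # target, so the decision can be made before the body is used.
               with urlopen(url) as resp:
                   link_header = resp.headers.get("Link", "")
               for part in link_header.split(","):
                   if 'rel="license"' in part:
                       return part.split(";")[0].strip().strip("<>")
               return None

           # e.g. page_license("https://example.com/") -> the CC BY-SA URL,
           # or None if the site does not send a license link.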
       
 (DIR) Post #ASc1oE1QwXEsebXISe by lauren@mastodon.laurenweinstein.org
       2023-02-12T19:00:31Z
       
       0 likes, 0 repeats
       
       @dinosaur That's much harder for many sites to set up, especially older ones. Defining new robots.txt stanzas is difficult enough. But again, these are first steps among hundreds that will probably be required.