Posts by rrwo@fosstodon.org
(DIR) Post #AQL2T9lkbfUTZYwwnA by rrwo@fosstodon.org
2022-12-06T18:33:53Z
0 likes, 0 repeats
@thenewoil the article says PRoot is used to deploy tools on systems that are already compromised. It's not the service used to compromise systems.
(DIR) Post #ARpeuWcrw11SPqG1OC by rrwo@fosstodon.org
2023-01-20T09:36:33Z
0 likes, 0 repeats
I don't like the semantics of Favouriting a post. It's as bad as Facebook likes or Twitter hearts. Sometimes you just want to ping it as a way of acknowledging it, providing support to the poster, etc. But that doesn't mean you "like" the post. (The "+1" that Google+ had for reacting to posts was probably the only thing Google got right.)
(DIR) Post #AVpSFmFKUQy3b3ogIy by rrwo@fosstodon.org
2023-05-18T08:11:49Z
0 likes, 0 repeats
At work, we've decided to block the Common Crawl bot from our websites, because their index is used to train generative #AI systems. We've also blocked or severely limited requests from IP ranges associated with various cloud providers, because they are usually from unidentified bots. 1/n
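As a rough illustration of the robots.txt side of such a block (a sketch only, not the configuration the post describes): Common Crawl's crawler announces itself with the user agent token CCBot, so a site can disallow it by name. The robots.txt content, the is_allowed helper, the example.com URL and the "ExampleSearchBot" agent below are all hypothetical. This only affects bots that honour robots.txt; the IP-range limits mentioned above would have to be enforced at the server or firewall instead.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: disallow Common Crawl's crawler (CCBot),
# allow every other user agent.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("CCBot/2.0", "ExampleSearchBot"):
        print(agent, is_allowed(agent, "https://example.com/page"))
```

A well-behaved crawler would run a check like this before each request; a bot that ignores robots.txt is unaffected by it.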
(DIR) Post #AVpSFnLOPNPb09f2OW by rrwo@fosstodon.org
2023-05-18T08:16:39Z
0 likes, 0 repeats
This is a crap solution. It means we are excluding ourselves from open web indexes because those indexes are being abused. It means we are excluding or limiting independent search engines that use cloud services, because other users of those services are running bad bots. 2/n
(DIR) Post #AVpSFo57fLzXHz3Yfo by rrwo@fosstodon.org
2023-05-18T08:21:42Z
0 likes, 0 repeats
This also makes it harder to create other open indexes of the web. Why should we let you index our site, if your index might be abused? This also makes it harder to compete with Google. How do we know you won't turn around and go from search to generative AI? 3/
(DIR) Post #AVpSFooqvKZTZoS4x6 by rrwo@fosstodon.org
2023-05-18T08:27:07Z
0 likes, 0 repeats
There are already so many bad robots that don't respect rate limits, don't properly handle errors (404, 410, 400), don't respect robots.txt, don't identify themselves, or are used for fake phishing/spam clone sites or dodgy SEO ranking. Now throw in bots that misuse content for generative AI. 4/
(DIR) Post #AVpSFpUgPo21fY1U9Y by rrwo@fosstodon.org
2023-05-18T08:39:41Z
0 likes, 1 repeats
And of course, now that Bing and Google have generative AIs, how do we differentiate between them indexing our websites for search vs indexing our websites to train their AIs? 5/
(DIR) Post #AVpSFrWAtEPfwkjhku by rrwo@fosstodon.org
2023-05-18T08:43:03Z
0 likes, 0 repeats
There's also the free newspaper problem. Notice how higher quality newspapers and scientific publications are often behind paywalls, and misinformation/conspiracy theory sites are not? Now consider how indexing for generative AI will work with this. Higher-quality sources will block AI training because they don't want their content stolen, but propaganda outlets will allow their sites to be used for training because they want their content spread. 6/
(DIR) Post #AVpSFtP9sHzNnZT864 by rrwo@fosstodon.org
2023-05-18T09:53:51Z
0 likes, 0 repeats
As an aside, there's a "Have I Been Trained" image search website to search for photos that have been used for AI training data. https://haveibeentrained.com 7/
(DIR) Post #AW0bL7N7KlNGie1BUO by rrwo@fosstodon.org
2023-05-25T09:07:55Z
0 likes, 0 repeats
@EU_Commission How will this square with attempts to ban end-to-end encryption to make it easier to scan messages for illegal content?
(DIR) Post #AW6sXvoQUHbGnlFNBY by rrwo@fosstodon.org
2023-05-28T09:49:26Z
0 likes, 0 repeats
@khoji Part of the problem is that the UI on devices seems to change every few years. Users learn where something on their phone is, and then the vendor arbitrarily changes icons and reorganises menus. It's like the Wizard (from The Wiz) deciding everyone should wear a new colour every few minutes. There's a reason people hate upgrading devices.
(DIR) Post #AWFbrCS6QCmm05xT4S by rrwo@fosstodon.org
2023-06-01T11:12:20Z
1 likes, 0 repeats
Here is a gentle reminder that our views of ownership of creative works evolve based on technology. Copyright was never an issue until printing and literacy were widespread. It makes sense that people are re-evaluating fair use now that works can be used for training generative software.