Post AWiA7Ks8szXwTwsovQ by AmericanChampion@poa.st
(DIR) Post #AWi6hRneCvAwZkkEXw by Aether@poa.st
2023-06-15T08:52:23.125404Z
8 likes, 5 repeats
Google is getting a lot worse because of the Reddit blackouts. theverge.com/2023/6/13/23759942/google-reddit-subreddit-blackout-protests It says a lot about the state of the internet when Reddit is considered one of the more valuable resources.
(DIR) Post #AWiA7Ks8szXwTwsovQ by AmericanChampion@poa.st
2023-06-15T09:30:40.821672Z
2 likes, 0 repeats
@Aether I've talked to several very skilled CS/AI people in my lifetime, and they all agree that search engines died around 2018. I don't know if they all knew why, but we know why - there was a huge push to get rid of the old algorithms that optimized, quite spectacularly, towards giving users what they wanted. People have talked about using LLMs instead. They can certainly do a few of the things that Old Google could, but there are some things that they can't manage, and probably won't ever manage. They can walk you through the trivial stuff, but falter whenever they face something that the meta-learning inherent to their training process isn't capable of grasping with limited data. Some people have tried to make their own, like Brave Search. I get the feeling that these are half-assed efforts, without a real team or any real resources behind them, but even so, I think you had to have scale to get the old algorithms to work. Might be like peak oil - no way to get back once the initial route has been run.
(DIR) Post #AWiAMRCjcqAoSriTyK by charliebrownau@poa.st
2023-06-15T09:33:24.586838Z
3 likes, 0 repeats
@AmericanChampion @Aether I recommend using SearX or SearXNG: github.com/searx/searx docs.searxng.org/
(DIR) Post #AWiBPsvL7bS3G2mtFY by AmericanChampion@poa.st
2023-06-15T09:45:13.821096Z
0 likes, 0 repeats
@charliebrownau @Aether I'll have a look. There are a number of pro-privacy search engines (Brave Search meets that criterion), but the main issue I've had is that their results are even worse than Google's, since they don't have the data to train a strong search algorithm, and they don't have the resources to manually patch their system to prune the SEO'd spam sites that will otherwise accumulate in the results of every query. Looking at their GitHub repo, they seem to work by aggregating search results from multiple services. I couldn't find anything related to machine learning, which suggests that they use some kind of heuristic that doesn't get tuned by user data. Will try it out for a few days and see if results are meaningfully better than Brave Search.
(DIR) Post #AWiIfL9h8gQKq5opOa by kamehamic@poa.st
2023-06-15T11:06:28.087780Z
1 likes, 0 repeats
@AmericanChampion @charliebrownau @Aether I really don't get it, it should be trivial to make a better search engine than Google's: just pull from the archive sites (the archive sites could make money off it), pull from the DMCA lists that aren't from your own search engine, pull from other search engines like Baidu... I understand not being able to because of not having the resources... but then WTF are you doing. Searx sucks, it gives WAY fewer results than Google; start searching Chinese stuff or anything non-English, it's empty. Also, Google is so bad that sometimes it's better to use YouTube to search videos, and sometimes Google is the better option to do the same, because Google has date filters while YouTube doesn't.
(DIR) Post #AWiJScoQHPEJn5MzWC by charliebrownau@poa.st
2023-06-15T11:15:22.373647Z
0 likes, 0 repeats
@kamehamic @AmericanChampion @Aether Keep using jewogle then
(DIR) Post #AWiKS35RCrUGwacLiq by AmericanChampion@poa.st
2023-06-15T11:26:28.503103Z
0 likes, 0 repeats
@kamehamic @charliebrownau @Aether The core issue for this, along with a lot of other things, is scale. When the algorithms that made good search engines possible came into being circa 2010, Google already had a monopoly, and could use hundreds of millions of users' data for A-B testing and training models of which search results were relevant to which users and which searches. You need a critical mass of training data before a large-scale machine learning model becomes useful, and you need to be the best in the business already to get the kind of usage required to train such a model. My fear is that this situation can't be replicated, and that those five or so years in which Google worked will never happen again.
(DIR) Post #AWiKbkwNWnwFYp0cBE by charliebrownau@poa.st
2023-06-15T11:28:13.799701Z
1 likes, 0 repeats
@AmericanChampion @kamehamic @Aether Considering the direction the Internet and realm is headed, we may not even have access to the Internet after 2030
(DIR) Post #AWiKshGuQEFj6YgpxQ by AmericanChampion@poa.st
2023-06-15T11:31:17.230588Z
0 likes, 0 repeats
@charliebrownau @kamehamic @Aether Internet itself is robust. You will always be able to achieve a large network of computers. You could have a solar flare and complete collapse tomorrow, and by next week some tech-savvy rednecks would've strung together an ad-hoc network spanning the entire West Coast. A month later, they'd link up with the Japanese, and we'd have the whole world talking again within a year. The nature of CS is that some things are relatively easy and can be recreated by a few kids with $100 once the tech is known, and other things are relatively hard and even massive corporations can't recreate them if they haven't hired one of the three or four people capable of understanding them, and it's very difficult to tell which is which at first glance.
(DIR) Post #AWiLJQXsiZOD32NHxg by kamehamic@poa.st
2023-06-15T11:36:07.400829Z
1 likes, 0 repeats
@AmericanChampion @charliebrownau @Aether The only obstacles IMO that don't permit it to happen are legislation that didn't exist back then but now must be taken into account, monopolies wielding power against any kind of opposition, and resources. On the resources side I'm not really convinced, because 201x computers are orders of magnitude slower than 202x computers; the same can be done with way less now.
(DIR) Post #AWiLfGS0vjDe6w2inY by charliebrownau@poa.st
2023-06-15T11:40:04.215402Z
0 likes, 0 repeats
@kamehamic @AmericanChampion @Aether Nations and Society and the realm would be better off without((( National Government )))((( Judaeo Islam and Judaeo Christianity )))((( Central Banks )))((( Taxation and Usury )))and ((( Police)))
(DIR) Post #AWiLjb87zG9BP9o8eG by AmericanChampion@poa.st
2023-06-15T11:40:51.170298Z
0 likes, 0 repeats
@kamehamic @charliebrownau @Aether Speed isn't data. You absolutely need data to train a useful model, and reinforcement learning (I think that's what most of the engagement algos of the day used) requires vast amounts of data. More compute power alone just lets you overfit faster.
(DIR) Post #AWiOikSa7483uif3iK by billiam@shitposter.club
2023-06-15T12:14:17.890256Z
1 likes, 0 repeats
@AmericanChampion @Aether @charliebrownau yes, searx merely aggregates results from existing search engines. It does not resolve the issue of poor results. I find yandex to be quite good for some topics. Brave Search also isn't too bad.
(DIR) Post #AWiUUlUA6RSilV7FOS by robbiedv@poa.st
2023-06-15T13:19:00.950678Z
1 likes, 0 repeats
@billiam @AmericanChampion @Aether @charliebrownau brave search is a pretty good daily driver, but I really hate how they put those reddit threads right at the top
(DIR) Post #AWjaCbkoL6HaFoaYK0 by wrongthink@cdrom.tokyo
2023-06-16T01:58:23.432324Z
2 likes, 1 repeats
@AmericanChampion @Aether "I’ve talked to several very skilled CS/AI people in my lifetime, and they all agree that search engines died around 2018." My experiences agree with this approximate timeframe. I think the last time I didn’t have to fight with a search query in order to get the results I was looking for even predated that.
(DIR) Post #AWkHbGDVqYlYj710To by Zerglingman@freespeechextremist.com
2023-06-16T10:03:57.222387Z
0 likes, 0 repeats
@Aether betterthanexpected