fsebugoutzone.org:9999

       Post 9wgWIeFYBoZYsuGLtQ by christianbundy@social.coop
 (DIR) Post #9wg3ZtKOU9XVtuOkbo by sir@cmpwn.com
       2020-07-02T13:47:20Z
       
       3 likes, 4 repeats
       
       We really need a FOSS search engine with, and this is important: its own in-house, FOSS crawler
       
 (DIR) Post #9wg3cJ4ziELgLaLZQm by penguin42@mastodon.org.uk
       2020-07-02T13:52:43Z
       
       0 likes, 1 repeats
       
       @sir Where do you store the crawler's data?
       
 (DIR) Post #9wg3h0Gl4lfLSL5ZfU by fakefred@mastodon.technology
       2020-07-02T13:53:35Z
       
       0 likes, 0 repeats
       
       @penguin42 @sir ... On federated servers?
       
 (DIR) Post #9wg4CZpUuhGkinOS12 by OTheB@mastodon.technology
       2020-07-02T13:52:14Z
       
       1 likes, 0 repeats
       
       @sir "SearchHut"?
       
 (DIR) Post #9wg4Vdq4EQ34eqkb0S by amael@social.linux.pizza
       2020-07-02T14:02:36Z
       
       0 likes, 1 repeats
       
       I 100% agree with you; I would be glad to work on it!
       
 (DIR) Post #9wg4X1ZjSu58Z5llU8 by simon@fosstodon.org
       2020-07-02T13:54:18Z
       
       0 likes, 0 repeats
       
       @sir a search engine that searches only (independent) blogs would be great too
       
 (DIR) Post #9wg4f8uM5QS5ciHpxo by sir@cmpwn.com
       2020-07-02T13:54:34Z
       
       0 likes, 0 repeats
       
       @fakefred @penguin42 federating a search engine would be pretty difficult, but I would be interested in seeing some research around community ownership of the crawled data.
       
 (DIR) Post #9wg4yyrj8LC6GkW6dM by aktivismoEstasMiaLuo@activism.openworlds.info
       2020-07-02T13:54:50Z
       
       0 likes, 0 repeats
       
       @sir #searx is a FOSS search engine & #yacy is a FOSS crawler, and they work together.
       
 (DIR) Post #9wg5Cyy7Xy3idwBT6m by amk@mastodon.amk.ie
       2020-07-02T13:55:25Z
       
       0 likes, 0 repeats
       
       @sir let's hope spider/ask.moe will free us from this search engine prison. I've been using qwant, which seems to make similar promises to ddg, but it's also not FOSS, which is a shame.
       
 (DIR) Post #9wg5rOaLiy47fuKrya by selea@social.linux.pizza
       2020-07-02T14:17:48Z
       
       0 likes, 1 repeats
       
       @aktivismoEstasMiaLuo How do they work together? @sir
       
 (DIR) Post #9wg61TBTTsX2pkVKTY by aktivismoEstasMiaLuo@activism.openworlds.info
       2020-07-02T14:19:36Z
       
       0 likes, 0 repeats
       
       @selea @sir I just know that I've encountered some searx instances that source indexes from a yacy instance running on the same host. I've not installed it myself.
       
 (DIR) Post #9wg6DzstjALtj8mxtY by selea@social.linux.pizza
       2020-07-02T14:21:53Z
       
       0 likes, 1 repeats
       
       @aktivismoEstasMiaLuo Oh, I did not know that! Thank you very much! @sir
       
 (DIR) Post #9wg7K3Yx43qapJucgi by penguin42@mastodon.org.uk
       2020-07-02T14:34:15Z
       
       0 likes, 1 repeats
       
       @sir @fakefred I think that storage and access are the challenge - and trust, if it's federated: you don't want all searches to get redirected to porn sites or other vendors' sites.
       
 (DIR) Post #9wgSX1BpIRrOGk7LZA by flewkey@layer8.space
       2020-07-02T18:31:54Z
       
       0 likes, 1 repeats
       
       @sir The Gigablast search engine published their source code to a git repository a while back, but it definitely needs an overhaul.
       
 (DIR) Post #9wgWIeFYBoZYsuGLtQ by christianbundy@social.coop
       2020-07-02T19:12:30Z
       
       0 likes, 0 repeats
       
       @sir I was literally just working on this! My use-case is that I've contributed lots on GitHub and I want to download all of the repos I've worked on... but I can't get a list of them. Currently fighting with their GraphQL API, but I'd kill for a "give me a list of all repos where a commit is authored by me" search query.
       
 (DIR) Post #9wgWS1j4H8P6IhXd44 by sir@cmpwn.com
       2020-07-02T19:14:28Z
       
       0 likes, 0 repeats
       
       @christianbundy that's not what I meant. I meant a FOSS search engine for searching the web at large
       
 (DIR) Post #9wgX8c29fqJV4Om9L6 by christianbundy@social.coop
       2020-07-02T19:22:07Z
       
       0 likes, 0 repeats
       
       @sir oh! I haven't looked into those in a while; last I saw, I think YaCy was state-of-the-art. If you find anything (or build anything) I'd be happy to test.
       
 (DIR) Post #9wgazJWJXMkErThhSq by _1751015@mastodon.host
       2020-07-02T20:06:34Z
       
       1 likes, 1 repeats
       
       @sir
       1) https://yacy.net/ - an implementation of a P2P (peer-to-peer) search engine
       2) https://commoncrawl.org/2020/06/may-june-2020-crawl-archive-now-available/ - they provide a public index and code: https://github.com/commoncrawl
       
 (DIR) Post #9wgbHO9N3tTGQwahpw by sir@cmpwn.com
       2020-07-02T20:07:35Z
       
       0 likes, 0 repeats
       
       @_1751015 where can I play with a search engine powered by this data?
       
 (DIR) Post #9wgnW4RVoZ8f6cqNcm by cuniculus@cmpwn.com
       2020-07-02T22:25:10Z
       
       0 likes, 0 repeats
       
       @sir @_1751015 https://yacy.eric.ovh/
       
 (DIR) Post #9wgng1b5FWPDAz4Ehl by sir@cmpwn.com
       2020-07-02T22:26:36Z
       
       0 likes, 0 repeats
       
       @cuniculus @_1751015 ooof the animations and javascript yikes
       
 (DIR) Post #9wgnmqoVjxU9l7bQrQ by sir@cmpwn.com
       2020-07-02T22:27:10Z
       
       0 likes, 0 repeats
       
       @cuniculus @_1751015 tbh I don't think a distributed search engine is the right approach
       
 (DIR) Post #9wgrW5ljHf3T2FZ9V2 by cuniculus@cmpwn.com
       2020-07-02T23:10:47Z
       
       0 likes, 0 repeats
       
       @sir @_1751015 Yeah, since it requires loads of storage and fat bandwidth
       
 (DIR) Post #9wh38Ibj665JLzJAnY by thatkiwiguy@coffeehouse.institute
       2020-07-03T01:18:18Z
       
       0 likes, 0 repeats
       
       @sir would https://yacy.net be appropriate? Self-hosted, DHT, P2P...
       
 (DIR) Post #9wiGSqCDQ1TvseSdyy by katie@mstdn.io
       2020-07-03T15:26:08Z
       
       0 likes, 0 repeats
       
       @aktivismoEstasMiaLuo @selea @sir https://yacy.everdot.org/ defaults to only sourcing the global + a private yacy network and https://searx.everdot.org/ includes a private yacy network by default. One major problem with using the global yacy network is that you have to decide a cut-off for how long you want to wait for global results and drop slower servers, because some take minutes to respond. That's just too slow. Also, a patch is needed to sort results; the default is first come, first shown.
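
       The cut-off and sorting described in the post above can be sketched roughly as follows. This is only an illustration, not YaCy's or searx's actual code: the peers are simulated with sleeps instead of network calls, and the names query_peer, federated_search, the cutoff parameter, and the numeric score attached to each hit are all hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def query_peer(name, delay, hits):
    """Simulated remote peer (assumption): sleep, then return its hits."""
    time.sleep(delay)
    # Each hit becomes (url, score, peer_name); the score is a made-up relevance value.
    return [(url, score, name) for url, score in hits]


def federated_search(peers, cutoff):
    """Query all peers in parallel, keep only answers that arrive within `cutoff` seconds."""
    pool = ThreadPoolExecutor(max_workers=len(peers))
    futures = [pool.submit(query_peer, *peer) for peer in peers]

    # Block until every peer has answered or the cut-off expires, whichever is first.
    done, not_done = wait(futures, timeout=cutoff)

    # Do not wait for stragglers; their threads finish in the background.
    pool.shutdown(wait=False)

    # Collect hits from the peers that answered in time, in submission order.
    results = [hit for f in futures if f in done for hit in f.result()]

    # Sort by score instead of showing results in arrival order.
    results.sort(key=lambda hit: hit[1], reverse=True)
    return results, len(not_done)


if __name__ == "__main__":
    demo_peers = [
        ("fast-a", 0.01, [("https://a.example/1", 0.4)]),
        ("slow", 0.6, [("https://s.example/1", 0.99)]),
        ("fast-b", 0.02, [("https://b.example/1", 0.9)]),
    ]
    hits, dropped = federated_search(demo_peers, cutoff=0.3)
    for url, score, peer in hits:
        print(f"{score:.2f} {url} (from {peer})")
    print(f"dropped {dropped} slow peer(s)")
```

       The trade-off is the one the post names: a short cut-off keeps the page responsive but silently discards whatever the slow peers would have contributed, however relevant.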
       
 (DIR) Post #9wiJKVtJKNsGS4L5v6 by katie@mstdn.io
       2020-07-03T15:58:14Z
       
       0 likes, 0 repeats
       
       @sir Even if you have a FOSS search engine with a FOSS crawler, like what's running on https://yacy.everdot.org/, you'll quickly run into performance issues and economic issues. Going FOSS won't automatically bring in advertisement revenue, and that's what Google/Bing/etc. actually do: they are advertisement agencies, not search engines. That's how they afford thousands of servers. There's free software but there's no such thing as free hardware.
       
 (DIR) Post #9wmMHtJoqSeCjl1G1w by _1751015@mastodon.host
       2020-07-05T14:50:11Z
       
       0 likes, 1 repeats
       
       @sir I don't have information about a search engine using the Common Crawl data. They have a compiled list with references to various small projects that use the data: https://commoncrawl.org/the-data/examples/
       
 (DIR) Post #9wmMVO4rnP19tlSdQe by _1751015@mastodon.host
       2020-07-05T14:52:38Z
       
       0 likes, 1 repeats
       
       @cuniculus @sir YaCy has some niche applications that are interesting. Check the writing here and the comments: https://www.susa.net/wordpress/2020/05/personal-search-engine/
       Personal index of curated URLs + eventually sharing the index - IMO it has advantages over a general purpose search engine.
       
 (DIR) Post #9wopbXXOp0y1xm0qUC by z428@social.tchncs.de
       2020-07-06T19:26:16Z
       
       0 likes, 0 repeats
       
       @sir Agree. But, once again: maybe this is not so much a F(L)OSS issue as an issue of handling a large, potentially decentralized / distributed search index at runtime, keeping things available, stable, performant 24x7. Maybe, finally, a situation to understand that our current focus on code and code licensing is important but not *all* it takes to have working technology available...? 🙂