[HN Gopher] YaCy, a distributed Web Search Engine, based on a pe...
___________________________________________________________________
YaCy, a distributed Web Search Engine, based on a peer-to-peer
network
Author : Timothee
Score : 248 points
Date : 2024-03-06 06:33 UTC (16 hours ago)
(HTM) web link (yacy.net)
(TXT) w3m dump (yacy.net)
| DrDroop wrote:
| I once went to a workshop on a Sunday morning at the local
| makerspace to listen to someone talk about some kind of
| distributed search engine or something like that. One of the
| developers came from (I think) Germany to explain this to us the
| centralized sheeple. He just gave a demonstration of the thing,
| like here is the box you type stuff and here are the results.
| When I started to ask questions about how it worked and all, he
| sort of acted annoyed, saying it was all too difficult to explain.
| This was more than ten years ago, and yes I am still angry about
| it.
| ssijak wrote:
| At the core it was probably based on peer-to-peer distributed
| hash tables, so here you go, read the source:
| https://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia...
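|
| A minimal sketch of the core idea from that paper (Kademlia; not
| necessarily YaCy's exact scheme): node and key IDs live in a
| 160-bit space, and the "distance" between two IDs is simply their
| XOR, so a lookup keeps hopping to peers whose IDs are XOR-closer
| to the key. Names here are illustrative only.
|
|     import java.math.BigInteger;
|     import java.security.MessageDigest;
|
|     // Toy Kademlia-style distance metric: 160-bit IDs (SHA-1 sized),
|     // closeness = XOR of the two IDs interpreted as an unsigned integer.
|     public class XorDistance {
|         static BigInteger id(String name) throws Exception {
|             byte[] h = MessageDigest.getInstance("SHA-1").digest(name.getBytes("UTF-8"));
|             return new BigInteger(1, h); // unsigned 160-bit ID
|         }
|
|         static BigInteger distance(BigInteger a, BigInteger b) {
|             return a.xor(b); // symmetric "distance" in ID space
|         }
|
|         public static void main(String[] args) throws Exception {
|             BigInteger key = id("some search term");
|             // A lookup repeatedly asks the peer whose ID minimizes distance(key, peerId).
|             System.out.println(distance(key, id("peer-a")).compareTo(distance(key, id("peer-b"))) < 0
|                     ? "peer-a is closer to the key" : "peer-b is closer to the key");
|         }
|     }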
| belter wrote:
| 160 bits ought to be enough for anybody :-)
| albert180 wrote:
| It's probably him; YaCy is made by a German dude.
| synctext wrote:
| Impressive 20 year project by one key developer.
|
| See the 20-year anniversary post (in German) by the YaCy founder:
| https://community.searchlab.eu/t/yacy-vor-20-jahren/1543
| ssijak wrote:
| A long time ago I worked for a startup called Wowd which built a
| distributed search engine. It was acquihired by Facebook.
|
| One of the biggest issues was how to entice people to download and
| run the client/node.
|
| I half wondered afterwards if slapping some crypto on top of it,
| which would be mined by running the node and providing resources,
| would help. My gut says easy yes, but my mind grimaces at the
| abomination.
| zoklet-enjoyer wrote:
| We have proof of stake now. The nodes could be run by the chain
| validators and they get a cut of the staking rewards. Look up
| how proof of stake works on the Cosmos chain. You could totally
| do this and I bet it would take off, at least in that section
| of the Internet that's into Cosmos/Tendermint chains. I'd use
| it.
| ssijak wrote:
| I was definitely thinking of some kind of proof of stake, not
| proof of work.
| zoklet-enjoyer wrote:
| Hahaha one downvote. I love to see it
| worksonmine wrote:
| > but my mind grimaces at the abomination
|
| Why would that be an abomination? It's a perfect use case. Like
| you noticed, people need incentives to volunteer their hardware.
| If you hate crypto because it's crypto you can just use fiat
| instead.
| komali2 wrote:
| > Like you noticed, people need incentives to volunteer their
| hardware.
|
| I wonder if this is because "volunteer your hardware"
| projects sometimes involve someone else making money, and if
| someone else is making money but not you, why should you
| donate your hardware?
|
| For the truly libre "hardware donation" projects, they seem
| to be doing ok without financial incentivization. What
| immediately comes to mind is the petabytes of data flying
| around on peer to peer systems through torrenting. I know
| people that spend thousands of dollars a year on upkeep and
| upgrades for what are essentially super seedbox homelabs (I'm
| one of them too :P )
|
| There's also communities like soulseek where people keep TBs
| of music up, often seeking out rare tracks to make available
| to the community for free.
|
| There's folding@home and seti@home, and I'm sure other
| similar projects I haven't heard of, where people donate
| cycles just for the common good.
|
| folding@home is a great example because we can directly
| compare the people that are "incentivized" to participate
| with bananocoin, a cryptocurrency rewarded based on work
| cycles in folding@home. You can see all bananocoin miners
| here under the banano.cc team:
| https://stats.foldingathome.org/ That team is in first place
| for work completed, but it is only just ahead of the Linus
| Tech Tips team, and compared to a bunch of other teams (and
| private "donors") it accounts for a very small % of the work
| completed for folding@home.
|
| So I disagree that people "need" incentives; there just need
| to be no, erm, disincentives, if that's a word.
| shinryuu wrote:
| > I know people that spend thousands of dollars a year on
| upkeep and upgrades for what are essentially super seedbox
| homelabs
|
| And then you end with "there just need to be no
| disincentives". If anything, spending thousands of dollars a
| year on upkeep should be a disincentive for most people.
| You are not most people though, since you do it
| voluntarily.
| komali2 wrote:
| I'm a maniac though. I used to run my stack just fine off
| a Raspberry Pi with a USB hard drive plugged in.
|
| Actually, before that, I used to run it off an old
| macbook.
|
| Do we need it to be where everyone hosts a node? I just
| had this conversation with a friend yesterday actually.
| We were in disagreement about the accessibility of self
| hosting and federation. He was of the opinion that we
| should push LLMs to where anyone can type "I want to host
| a video hosting platform" and chatgpt.exe will find and
| install jellyfin on their computer and set up a
| cloudflare tunnel, or whatever.
|
| I'm more of the opinion that we should increase the
| quality of documentation until the one person just weird
| and nerdy enough out of a group of 20 will be able to
| deploy things on leftover hardware, and share with their
| friends.
|
| What do you think?
| shinryuu wrote:
| In terms of accessibility I don't think it would be bad
| per se if chatgpt.exe were able to help you with that. Though
| both of us know that there is maintenance involved, and once
| something catches fire (which will happen at some point), you
| are kind of helpless.
|
| Something like pikapods.com certainly helps with
| accessibility, even if it isn't self-hosting per se.
|
| But all of that has little to do with incentives or
| disincentives. Even with very high accessibility there
| are disincentives to self-host. It will cost time and
| money in some way. For some people the intrinsic
| motivation will override those disincentives. But I think
| for the majority of people there will still not be enough
| motivation to do it.
|
| There are more important things to do for them.
| bawolff wrote:
| I mean, how do you verify nodes are being honest and not just
| sending fake data for the free crypto (like what happened
| with seti@home, and there wasn't even money involved)?
|
| Not to mention, where is the value of this coin going to come
| from? Will people pay to use this search engine? That seems
| unlikely.
|
| It doesn't sound like the perfect use case to me.
| px43 wrote:
| By ignoring cryptocurrencies, you've missed out on over 10
| years of progress in this space. We have things like zero
| knowledge notaries and data availability sampling proofs.
| Actively Validated Services are also a thing. Service
| providers stake some asset, and interested parties can
| challenge them at certain intervals to ensure that they are
| properly performing their duties. Through the magic of
| Merkle trees, and soon Verkle trees (basically Merkle
| trees, but using vector commitments for super fast proofs),
| challengers can demand that service providers generate
| a proof that some data they hold matches some criteria. The
| nice thing about it is that because it's a zero-knowledge
| proof, the challenger doesn't even need to know what that
| data is, and what they get back is a succinct proof that
| they can check very quickly, basically like checking an md5
| sum for execution correctness.
|
| It's cool shit, you should really look into it.
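|
| The zero-knowledge and Verkle parts are beyond a comment sketch,
| but the Merkle-tree piece is simple. A minimal sketch in Java,
| with invented field names and no particular chain's encoding:
| given a leaf, the sibling hashes along its path, and the
| published root, the verifier recomputes the root without ever
| seeing the rest of the data.
|
|     import java.security.MessageDigest;
|     import java.util.List;
|
|     // Verifies that a leaf is included under a committed Merkle root.
|     public class MerkleProof {
|         // One step of the proof: the sibling hash and which side it sits on.
|         record Step(byte[] siblingHash, boolean siblingOnLeft) {}
|
|         static byte[] sha256(byte[]... parts) throws Exception {
|             MessageDigest md = MessageDigest.getInstance("SHA-256");
|             for (byte[] p : parts) md.update(p);
|             return md.digest();
|         }
|
|         static boolean verify(byte[] leaf, List<Step> proof, byte[] expectedRoot) throws Exception {
|             byte[] h = sha256(leaf);
|             for (Step s : proof) {
|                 // Recombine with the sibling in the correct order at each level.
|                 h = s.siblingOnLeft() ? sha256(s.siblingHash(), h) : sha256(h, s.siblingHash());
|             }
|             return MessageDigest.isEqual(h, expectedRoot);
|         }
|     }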
| worksonmine wrote:
| That's exactly why blockchain is a good choice. You verify
| that whatever X sends matches what Y and Z would send
| before any reward is received. Based on the shared index
| every query should return the same results; kid's stuff,
| really.
|
| The monetization is a nut to crack, yes, but Kagi works as a
| paywalled search engine. Otherwise just serve ads like all
| the rest already do? It's a tried and proven model, and in
| this solution they could be very transparent, as there's no
| corporation behind it trying to dupe users into clicks to
| maximize profits. I even see the possibility for a hybrid
| model: don't like ads? Pay for the compute with your own
| coins.
|
| The value comes from the network, trust and use-case. It
| doesn't have to be a new coin.
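|
| A minimal sketch of that "compare X against Y and Z before
| paying" step in Java, assuming a deterministic ranking over the
| shared index so honest peers return identical result lists; the
| peer-fetching part is left out and the names are illustrative.
|
|     import java.security.MessageDigest;
|     import java.util.HexFormat;
|     import java.util.List;
|
|     // Rewards an answering peer only if independently chosen auditor peers
|     // return the same ranked result list for the same query.
|     public class QuorumCheck {
|         static String fingerprint(List<String> rankedUrls) throws Exception {
|             MessageDigest md = MessageDigest.getInstance("SHA-256");
|             for (String url : rankedUrls) md.update((url + "\n").getBytes("UTF-8"));
|             return HexFormat.of().formatHex(md.digest());
|         }
|
|         static boolean shouldReward(List<String> answer,
|                                     List<List<String>> auditorAnswers) throws Exception {
|             String expected = fingerprint(answer);
|             for (List<String> audit : auditorAnswers) {
|                 if (!fingerprint(audit).equals(expected)) return false; // disagreement: no reward
|             }
|             return true;
|         }
|     }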
| numpad0 wrote:
| Agreed; it feels to me that people here are underestimating
| malice on the Internet. A simple crypto-based search credit
| system will be overrun with fake queries and fake data.
|
| I'm not entirely convinced that crypto-like reward
| mechanisms for distributed search are fundamentally flawed
| and unusable, but both the problem and the solution need to
| be refined a bit more.
| worksonmine wrote:
| > Agreed; it feels to me that people here are underestimating
| malice on the Internet.
|
| I don't think we do. We just prefer to put our trust in
| algorithms and verifiable data sources. It's not like
| Google et al. are the pinnacle of altruism; there have
| been cases where promoted results were fake copies of the
| actual site you wanted to visit, fooling less computer-savvy
| users into installing malware.
|
| The trust is put into the code, same principle as
| reproducible builds. It doesn't matter where you get the
| source, as long as the checksum matches. This way the
| censor side of the problem is solved.
|
| That leaves the spam, which isn't really solved by the
| big corporations either. Last time I used google I got
| 2-3 pages of the same auto-generated bullshit on every
| technical search term I tried. This could be fixed by
| having the main index limited to trusted sites at the
| expense of discovering new content. The latter can be
| handled by opt-in indexes. If the goal is to index
| everything users could have their own filters for sites
| they don't want.
|
| If you really want to spice it up, allow me to maintain my
| own query function (dangerous and a potential exploit, yes)
| that I send to the nodes so I can handle my own ranking.
|
| There's nothing that makes a distributed index more
| unsafe than one run by Google. If every query picks 2
| random nodes and compares the results I would trust that
| query more than current Google execs' opinions of what I'm
| allowed to see.
| lifty wrote:
| Not sure why it would be an abomination. This is exactly the
| use case that cryptocurrency networks are a fit for.
| rakoo wrote:
| You have to look beyond the surface. Cryptocurrencies work
| specifically to address a system where no node can trust any
| other node. If I cannot trust any other node, why would I
| fetch anyone else's index, or ask them for the results of a
| query, or even talk to them?
|
| Unless there is a way to trivially verify what others
| tell you, cryptocurrencies are a dead end.
| mhluongo wrote:
| You have that issue without cryptocurrencies as well, you'd
| just be relying on the kindness of users rather than crypto
| incentives.
|
| You always need a way to hold nodes accountable in a system
| like this, or it'll be rife with manipulation -- because
| there's already a strong, innate incentive to manipulate
| results. Today, we call that industry "SEO".
| rakoo wrote:
| What you don't understand is that "I don't trust others"
| is not a terminal state. I'd rather build trust again,
| create human connections, or rather, put them first,
| because there are always connections; nothing works if
| you trust no one.
|
| Building a societal system where you know you can rely on
| your peers and build together is a more joyful, more
| resilient, more ecological and also more realistic way of
| building a thriving society than the distrust-by-default
| that cryptocurrencies live for.
| idiotsecant wrote:
| Your current fiat currency is not based on love and trust.
| It's proof-of-world-hegemony, which puts crypto-based
| consensus mechanisms to shame in terms of how little it is
| based on love and trust.
| rakoo wrote:
| My current fiat currency is absolutely based on trust
| that the State will uphold any disagreement, even though
| I know it is not benevolent.
|
| I also don't understand your point. The current world is
| not what I want, so let's make it worse according to my
| values?
| idiotsecant wrote:
| The value of, for example, the dollar is not based on
| your trust, at least not at the first order. It's based
| on the economic and military power backing it up.
| rakoo wrote:
| Absolutely it is: it is based on the trust we all have
| that the government will do whatever it takes to
| guarantee the value of a dollar. My being able to trade
| with you in dollars and not, say, in old Zimbabwean dollars
| rests on the shared assumption that the US State can and
| will be there.
| Brian_K_White wrote:
| Sure it is. When someone gives me a dollar, I have no
| idea if it's fake or stolen.
|
| That sort of thing only gets handled very indirectly and
| much later and after a bad actor does their bad thing
| enough times for the surrounding greater population of
| good actors to notice a pattern.
| Brian_K_White wrote:
| And I think this is not even stupid either.
|
| Bad actors exist and there must be some process for
| identifying and dealing with them, but they are not the
| majority of people and so probably don't have to be the
| first, last, primary, and only consideration at all
| times.
|
| I.e. living in a bomb shelter is not a life worth living,
| even though, yes, you will be safe from bombs and thieves.
| rakoo wrote:
| Exactly. If I'll have to depend on someone else anyway
| (and I will), I might as well build trust, because a life
| spent being cautious about everything and everyone is not
| worth living. Only those who already have vast amounts of
| money can afford it, because they trust (heh) other people
| working for them to take care of that, but to non-jokingly
| propose it as a standard for everyone is a dystopia.
| lifty wrote:
| You should be able to add incentives to the system so that
| people store the correct index. You can check the incentive
| design of Filecoin for an example of how to do that.
| Obviously, how the incentive mechanism should be built
| depends on the application.
| rakoo wrote:
| Filecoin is "easy": it is trivial to verify that the blob
| you stored is the one I wanted you to store. There is no
| trivial way to verify that you indexed what I wanted you
| to index, or that you reply what I wanted you to reply.
|
| I highly dislike monetary incentives because they
| perpetuate inequalities by design, so here's another
| incentive: if you store a correct index, I will keep
| working with you and we can build an awesome system
| together. We can coordinate by talking to each other
| rather than trying to get money from each other.
| zubairq wrote:
| Interesting comment about how cryptocurrencies can enable a
| system where no node can trust any other node. Something
| for me to think about as I am building a peer to peer
| system (not a search engine though)
| rakoo wrote:
| Cryptocurrencies only help where no one can trust anyone.
| But if that's the case, I claim that such a system is not
| viable in the long term.
| zubairq wrote:
| Good point. Does this mean that Bitcoin is not viable in
| the long term?
| rakoo wrote:
| Bitcoin as a speculative tool lives as long as
| speculation can live. Bitcoin, or any cryptocurrency, as
| an actual currency exchanged at large scale will not
| work, or at least not in a democracy.
| miohtama wrote:
| Cryptocurrencies solve the spam problem, not the trust
| problem. No one can spam the network with new write data
| (transactions) because spam would become expensive. Although
| people still do, and Ethereum is full of spam tokens, meaning
| the transaction cost is still too low. This was also the use
| case of Hashcash, the predecessor of proof-of-work, which was
| designed to solve email spam.
|
| You are paying either
|
| - Block space: your transaction to be included in a block
|
| - State: modifying the world state (EVM in Ethereum)
|
| The trust problem is solved by various other means, usually
| at the libp2p level, by banning nodes (IP addresses) that
| send you bad data, which you can verify by comparing it to
| data from other peers.
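|
| A minimal sketch of the Hashcash-style pricing idea in Java (the
| difficulty constant is arbitrary, just for illustration): a
| writer must find a nonce whose hash over the payload falls below
| a target, making bulk spam expensive while a single legitimate
| write stays cheap and verification stays a single hash.
|
|     import java.math.BigInteger;
|     import java.security.MessageDigest;
|
|     // Hashcash-style proof of work: find a nonce so that
|     // SHA-256(payload || nonce) has roughly DIFFICULTY_BITS leading zero bits.
|     public class ProofOfWork {
|         static final int DIFFICULTY_BITS = 20; // ~1 million hashes on average
|
|         static long mine(byte[] payload) throws Exception {
|             MessageDigest md = MessageDigest.getInstance("SHA-256");
|             BigInteger target = BigInteger.ONE.shiftLeft(256 - DIFFICULTY_BITS);
|             for (long nonce = 0; ; nonce++) {
|                 md.reset();
|                 md.update(payload);
|                 md.update(Long.toString(nonce).getBytes("UTF-8"));
|                 if (new BigInteger(1, md.digest()).compareTo(target) < 0) return nonce;
|             }
|         }
|
|         // Verification is a single hash, so receivers can check cheaply.
|         static boolean verify(byte[] payload, long nonce) throws Exception {
|             MessageDigest md = MessageDigest.getInstance("SHA-256");
|             md.update(payload);
|             md.update(Long.toString(nonce).getBytes("UTF-8"));
|             return new BigInteger(1, md.digest())
|                     .compareTo(BigInteger.ONE.shiftLeft(256 - DIFFICULTY_BITS)) < 0;
|         }
|     }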
| dumbfounder wrote:
| They also solve the trust problem through consensus using
| proof of stake. If there is enough financial skin in the
| game to behave correctly, then that should be enough to
| make sure that results are not tainted.
| rakoo wrote:
| Cryptocurrencies slow down the rate of data not because
| of spam but because a slower rate means a higher
| consistency across the network: cryptocurrencies' goal is
| to agree on a consistent state with peers who do not want
| to negotiate. If the consistent state is pure garbage
| then that is not a problem for blockchains, because from
| blockchains' point of view, everything is fine.
|
| Spam is not a function of rate but of content. Spam can
| absolutely be sent in a blockchain, as you say, and
| making the price higher only makes both spam and non-spam
| more difficult. Spam for me might be actual legit
| information for you.
|
| Hashcash is another beast: it only has the proof-of-work
| part, not the money part (contrary to its name), so it's
| not comparable.
| mattdesl wrote:
| This seems like something that could be verified through ZK
| proofs. The data to search could be represented by a public
| Merkle root, and the searching/indexing given the user
| query could be programmed in a ZKVM like RISC0[1].
|
| [1] https://www.risczero.com/zkvm
| notfed wrote:
| Most information is not a math equation.
| 6510 wrote:
| After that the issue becomes ranking. I should say became,
| since LLMs could both rank pages and generate them on
| "demand" to fit the query.
|
| YaCy has so many buttons I'm not even sure whether it lacks
| that, but playing around with it, it is very cool to crawl
| large amounts of pages and serve requests until you want to
| do other things with the computer and the background process
| is too bloated. Something like the turtle mode that torrent
| clients have would be useful.
|
| Long ago there was a Chinese p2p client with a rootkit that
| would seed at 1 kb. I haven't used it but was told it worked
| remarkably well.
| mdaniel wrote:
| Nothing new under the sun, as they say:
| https://www.presearch.io/engine and just as you said I was
| unwilling to run a closed-source node binary
| colinsane wrote:
| if the situation is really "nobody will run this software
| unless i pay them to", then you're doomed regardless. there's
| nothing wrong with the classic route: package your software for
| the stores/distros you're familiar with, make your software _as
| easy to package as humanly possible_ for anyone else who'll
| come around, document the hell out of it, submit it to the
| handful of top-level news feeds from which it'll percolate, and
| then wait. maybe you don't like waiting?
| rasulkireev wrote:
| Love it. Super easy to self host and use. Now I have a personal
| Google!
| maxloh wrote:
| See also Presearch, another decentralized search engine that
| claims it will be open source. No source code is available at
| the moment, though.
|
| https://presearch.com/
| b2bsaas00 wrote:
| Could this be used for a Torrent search engine?
| fddrdplktrew wrote:
| if it is not censored, probably?
| worksonmine wrote:
| Recently there was a distributed tracker on the front page.
| Probably more what you're looking for.
| BLKNSLVR wrote:
| Bit Magnet: https://bitmagnet.io
| rakoo wrote:
| Note that it's not a distributed tracker, it's an
| indexer/tracker/search engine that _uses_ distributed
| resources (the nodes in the dht)
| feverzsj wrote:
| btdig is still alive.
| qingcharles wrote:
| btdig has the data, but its search is subpar :(
| vGPU wrote:
| Has it gotten any better recently?
|
| I run a node but I haven't actually used it as a search engine in
| a while, as I found the result quality to be exceedingly poor.
| rahen wrote:
| I remember trying it for a while in 2012, but the results were
| essentially worthless, probably because there were so few
| nodes/crawlers back then. I guess the more users there are, the
| better the results.
| viraptor wrote:
| Alternatively, ignore the public network (it's still useless)
| and run it as your own crawler. Seed it with your browsing
| history, some aggregators like HN, your favourite RSS feeds,
| etc. and you'll be good.
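|
| A minimal sketch of that seeding step in Java, assuming an
| exported bookmarks.html (the path and format are assumptions):
| it just pulls out http(s) URLs and prints one per line, ready to
| paste into YaCy's crawl-start page or whatever crawler you run.
|
|     import java.nio.file.Files;
|     import java.nio.file.Path;
|     import java.util.Set;
|     import java.util.TreeSet;
|     import java.util.regex.Matcher;
|     import java.util.regex.Pattern;
|
|     // Extracts candidate crawl seeds from an exported bookmarks/history file.
|     public class SeedList {
|         public static void main(String[] args) throws Exception {
|             String html = Files.readString(Path.of(args.length > 0 ? args[0] : "bookmarks.html"));
|             Matcher m = Pattern.compile("https?://[^\"'<>\\s]+").matcher(html);
|             Set<String> seeds = new TreeSet<>();      // sorted, de-duplicated
|             while (m.find()) seeds.add(m.group());
|             seeds.forEach(System.out::println);       // one seed URL per line
|         }
|     }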
| WarOnPrivacy wrote:
| > I remember trying it for a while in 2012, but the results
| were essentially worthless,
|
| I had mine crawling gov, mil, etc. sites for pages that Google
| was starting to delist back then. Inbound requests were heavy
| with porn until I tweaked - IDK, something.
| Avamander wrote:
| No.
|
| Either it picks up too much garbage if you allow any P2P data
| exchange (can't allow only outgoing AFAIK) or it kinda only
| knows about the sites you know about. Which kinda defeats the
| purpose.
|
| Even assuming you just want a specific index of your own
| content, it struggles to display useful snippets about the
| results, which makes it really tedious to sift through the
| already poor results.
|
| If you try to proactively blacklist garbage, which is
| incredibly tedious because there's no quick "delete from index
| and blocklist" button under the index explorer, then you'll
| soon run into an unmanageable blocklist; the admin interface
| doesn't handle long lists well. At some point (around 160k
| blocked domains) YaCy just runs out of heap during startup
| trying to load it, which makes the instance unusable.
|
| It also can't really handle being reverse proxied (accessed
| securely by both the users and peers).
|
| It also likes to completely deplete disk space or memory, so
| both have to be forcefully constrained. But that ends up with a
| nonfunctional instance you can't really manage. It also doesn't
| separate functionality enough that you could manually delete a
| corrupt index for example.
|
| Running (z)grep on locally stored web archives works
| significantly better.
| bobajeff wrote:
| Those are pretty bad issues. I remember using it a long time
| ago and only remember the results being bad. I've heard that
| Yacy could be good for searching sites you've already visited
| but it sounds like even that might not be a good use case for
| it.
|
| I do understand the taking-up-disk-space thing. It's hard
| to store the text of all your sites without it taking up a lot
| of space unless you can intelligently determine which text is
| unique and desired. Unless you are just crawling static pages
| it becomes hard to know what needs to be saved or updated.
| RGBCube wrote:
| curl failed to verify the legitimacy of the server and therefore
| could not establish a secure connection to it. To learn
| more about this situation and how to fix it, please visit
| the web page mentioned above.
|
| Can't seem to access the page.
| gonesilent wrote:
| Infrasearch / Gonesilent was sold to Sun, turned into Project
| JXTA, and died.
| mdaniel wrote:
| While trying to read more about it, turns out there's an
| O'Reilly book, too: https://www.oreilly.com/library/view/jxta-
| in-a/059600236X/ch... and there's also this
| https://wiki.wireshark.org/JXTA _(I'm guessing those
| specification links are in wayback but I didn't chase them)_
| charcircuit wrote:
| Are the results still being gamed by sites using content keyword
| stuffing? The last time I used it, the searching and ranking
| technology felt like it was 40 years behind the state of the art.
| liotier wrote:
| In distributed indexing, spam management seems a much bigger
| problem than the indexing itself.
| boyter wrote:
| I actually half wrote an RFC of a spec and 2 implementations of
| a federated search last year, rather than the distributed hash
| table that YaCy does.
|
| I wanted results to be re-rankable by the peers by sharing the
| scores that went into them. The idea being that with a common
| protocol based on the ideas of ActivityPub you could get peers'
| searches working together to hopefully surface interesting
| things.
|
| Something I should probably finish and publish at some point. It
| worked up to the hundreds of peers I tested.
|
| The reason I mention this is that I also wanted to add a front
| end onto YaCy, which turned out to be harder than I expected.
| It's a wonderful project and you can find great stuff through
| it, but the way the peers return results, sometimes it's hard to
| find it again. It's also not quite as hackable as I would have
| hoped at the time, probably due to the project's age.
|
| I still think there is value in it though, and I'd love to see
| YaCy have its protocol explained as a spec so people could build
| implementations in other languages more easily.
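|
| A minimal sketch of that re-ranking idea in Java (the RFC isn't
| published, so the field names here are invented): each peer
| returns, per result, the raw score components it used, and the
| querying peer recombines them with its own weights instead of
| trusting the peer's ordering.
|
|     import java.util.ArrayList;
|     import java.util.Comparator;
|     import java.util.List;
|     import java.util.Map;
|
|     // Re-ranks peer results locally from the score components the peers shared.
|     public class Rerank {
|         record PeerResult(String url, Map<String, Double> scoreComponents) {}
|
|         static List<PeerResult> rerank(List<PeerResult> results, Map<String, Double> myWeights) {
|             List<PeerResult> out = new ArrayList<>(results);
|             out.sort(Comparator.comparingDouble((PeerResult r) ->
|                     r.scoreComponents().entrySet().stream()
|                             .mapToDouble(e -> myWeights.getOrDefault(e.getKey(), 0.0) * e.getValue())
|                             .sum()).reversed());
|             return out; // highest locally-weighted score first
|         }
|     }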
| detourdog wrote:
| I remember the first days of gopher browsing were like that.
| Gopher browsing to me was like swinging from vine to vine. The
| trick was remembering/documenting where each vine went.
| arboles wrote:
| Sort of hijacking the thread to ask: can YaCy, or something
| similar, be an alternative to Google's Programmable Search
| Engine? All I use it for is to limit a search to a medium-sized
| list of domains. The aspect that makes running a search engine
| difficult on your own is lack of resources for crawling, I
| expect. But since I only care about a small list of domains,
| could I ditch Google's and run my own crawler like YaCy?
| gtirloni wrote:
| Is that the deceased code search tool?
|
| You could run Sourcegraph and import/sync those repositories.
|
| Or you could run your own ElasticSearch/Melisearch and crawl
| the websites yourself (if you're interested in things other
| than git repositories).
| arboles wrote:
| > Is that the deceased code search tool?
|
| No, it's _Programmable_. Though it's not actually
| programmable. I should've written Custom Search Engine
| instead; that's also a name for it.
|
| cse.google.com - It's quaint that past the modern landing
| page, when using the search portal today, you still get some
| outdated iteration of Google UI design.
|
| It's used, for example, for making OSINT searches.[0] Or at
| some point by at least one Wikipedia editor for a custom list
| of Reliable Sources for Anime & Manga.[1]
|
| [0] https://www.osintme.com/index.php/2020/09/28/
|
| [1] https://gwern.net/me#wikis
| anthk wrote:
| Ugh, Java. I'll wait for something like what i2pd does for I2P:
| something called yacyd, in either C, C++ or Golang.
| ravenstine wrote:
| What's your objection to Java?
| anthk wrote:
| High CPU and RAM usage.
| WarOnPrivacy wrote:
| Yacy's still around. Nice.
|
| After a year or two of hosting a Yacy instance (2014?) I started
| winding up on some general (probes, etc) blacklists.
|
| I also host a small mail server and I was getting mail returned.
| I'd force an IP swap and a few weeks later it'd be the same. I
| had to let Yacy go.
| 1oooqooq wrote:
| So that is how they block a people's search/crawler. I didn't
| think they would use the most complicated method.
|
| They also use block lists to add every single Tor node (even if
| not an exit) and every VPN under the sun (except for streaming,
| because why would they - that's why they exist).
| renegat0x0 wrote:
| There are already many projects about search:
|
| - https://www.marginalia.nu/
|
| - https://searchmysite.net/
|
| - https://lucene.apache.org/
|
| - elastic search
|
| - https://presearch.com/
|
| - https://stract.com/
|
| - https://wiby.me/
|
| I think all these projects are fun. I would like to see one
| succeed at reaching a mainstream level of attention.
|
| I have also been gathering link metadata for some time. Maybe I
| will use it to feed an eventual self-hosted search engine, or a
| language model, if I decide to experiment with that.
|
| - domains for seed https://github.com/rumca-js/Internet-Places-
| Database
|
| - bookmarks seed https://github.com/rumca-js/RSS-Link-Database
|
| - links for year https://github.com/rumca-js/RSS-Link-
| Database-2024
| fsflover wrote:
| But which of those projects are distributed and FLOSS?
| legrande wrote:
| Also these:
|
| https://swisscows.com/en
|
| https://search.disconnect.me/
|
| https://www.ecosia.org/
|
| https://metager.org/
|
| https://searx.space/
| ColinHayhurst wrote:
| https://www.mojeek.com/ self-disclosure, mojeek team member
| wongarsu wrote:
| To be fair, of those only Apache Lucene predates YaCy. YaCy is
| very mature, but in terms of relative popularity for general
| web search probably peaked around 15 years ago.
| buffalobuffalo wrote:
| I ran YaCy for a while, but not as a node on their distributed
| search index. I just ran it as a search engine for all my own
| bookmarks. Unfortunately I never found a particularly good way of
| getting bookmarks into the system. So eventually I shut it down.
| Cool idea in theory though.
| fortran77 wrote:
| Related to this -- I'd love to see individuals making web pages
| again, and federated search engines indexing them. People don't
| make their own hobby or fan or art websites anymore, and I think
| that's partly because nobody will ever find them with the big
| search engines.
| emrah wrote:
| I think it would be nice if the search results were
| "distributed" rather than deterministic.
|
| So when I enter the same keywords, let's say there are 50 pages,
| each of which would be an equivalently good result for the
| search; rather than one page "winning", the search engine would
| alternate the winner among the many possibilities.
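|
| A minimal sketch of that in Java (the epsilon is arbitrary):
| treat every page within a small margin of the top score as a
| tie, and shuffle or rotate that tier per query instead of always
| showing the same winner first.
|
|     import java.util.ArrayList;
|     import java.util.Collections;
|     import java.util.List;
|     import java.util.Random;
|
|     // Rotates the "winner" among results whose scores are effectively tied.
|     public class RotatingWinner {
|         record Hit(String url, double score) {}
|
|         // Expects `ranked` sorted by score, highest first.
|         static List<Hit> shuffleTopTier(List<Hit> ranked, double epsilon, Random rng) {
|             if (ranked.isEmpty()) return ranked;
|             double best = ranked.get(0).score();
|             int tierEnd = 0;
|             while (tierEnd < ranked.size() && ranked.get(tierEnd).score() >= best - epsilon) tierEnd++;
|             List<Hit> tier = new ArrayList<>(ranked.subList(0, tierEnd));
|             Collections.shuffle(tier, rng);               // a different "winner" each time
|             List<Hit> out = new ArrayList<>(tier);
|             out.addAll(ranked.subList(tierEnd, ranked.size()));
|             return out;
|         }
|     }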
| jrussbowman wrote:
| Nice to see search projects are still popping up. After a move,
| family life taking over and me getting more interested in Unreal
| Engine, my poor search engine is now more of an experiment in
| seeing how well it runs on basically just the life-support
| maintenance updates I do. I'm starting to think I honestly
| should just take it down and save the $50 a month I spend
| maintaining it.
|
| But I'll post it in a hacker news comment and maybe you all will
| give it enough traffic I can get excited about it again, lol
|
| https://www.unscatter.com
| jrussbowman wrote:
| And for my immature moment of the day, the above comment was
| comment #69
| fho wrote:
| I've used it several times over the last decades and never got
| good results. I think one instance is still running on my old
| computer at uni :-)
| dredmorbius wrote:
| Previously:
|
| YaCy - your own search engine |
| https://news.ycombinator.com/item?id=32597309 | 2 years ago | 93
| comments
|
| YaCy: Decentralized Web Search |
| https://news.ycombinator.com/item?id=22246732 | 4 years ago | 41
| comments
|
| YaCy - The Peer to Peer Search Engine |
| https://news.ycombinator.com/item?id=17089240 | 6 years ago | 3
| comments
|
| YaCy: a free distributed search engine |
| https://news.ycombinator.com/item?id=12433010 | 8 years ago | 24
| comments
|
| YaCy: Decentralized Web Search |
| https://news.ycombinator.com/item?id=8746883 | 9 years ago | 29
| comments
|
| YaCy takes on Google with open source search engine |
| https://news.ycombinator.com/item?id=3288586 | 12 years ago | 17
| comments
| treprinum wrote:
| Is it worth dedicating 1-2 low-power NUCs (4-8 cores) to this on
| a 250 Mbit/s connection? Or does it need beefier CPUs/network?
| nairboon wrote:
| If you run YaCy with docker and it is still a junior peer, does
| the search return results from the global index or just the one
| that appears to be 'preinstalled'?
___________________________________________________________________
(page generated 2024-03-06 23:01 UTC)