[HN Gopher] Show HN: Nebula - A network agnostic DHT crawler
       ___________________________________________________________________
        
       Show HN: Nebula - A network agnostic DHT crawler
        
       Author : dennis-tra
       Score  : 50 points
       Date   : 2024-03-20 10:54 UTC (12 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | mikae1 wrote:
       | Unlucky naming collision with Slack's networking tool Nebula:
       | https://github.com/slackhq/nebula
        
         | rad_gruchalski wrote:
         | Oh no, what an unfortunate event. The Slack tool uses the
         | name already used by OpenNebula. /s
        
           | Nux wrote:
           | Not to mention they jinxed Slack(ware) for many.
        
         | sph wrote:
         | And the Nebula streaming platform. Which is unfortunate because
         | I'm using both.
         | 
         | I get it, nebulae are cool.
        
           | doublerabbit wrote:
           | Not forgetting the awesome Nebula game engine
        
       | pdabbadabba wrote:
       | I'm sure that this is just because I'm not the target audience, so I
       | intend only the very gentlest criticism. But I literally LOLed at
       | how completely incomprehensible this README was for me. It has
       | really been a while since I've read a paragraph and had literally
       | no idea what it was talking about. But here's the winner:
       | 
       | > A network agnostic DHT crawler and monitor. The crawler
       | connects to DHT bootstrappers and then recursively follows all
       | entries in their k-buckets until all peers have been visited.
       | 
       | Following the Wikipedia link for "DHT" yielded some clues. (Ah.
       | Distributed hash table.) But I've still been looking at this for
       | several minutes now and am basically just puzzled. But the graphs
       | are pretty! Reading the word "amino" a little further down threw
       | me off the scent for a bit. But I gather that is actually a
       | proper noun, and we aren't really talking about proteins here.
       | 
       | Maybe an initial sentence that makes fewer assumptions about the
       | reader's familiarity with the jargon would be helpful.
        
         | Chabsff wrote:
         | This is not a particularly egregious example, but it's kind of
         | spectacular how everything crypto adjacent revels in
         | technobabble.
         | 
         | The detractors of the ecosystem (myself included, to be honest)
         | will be quick to point out that obfuscating the tech as magic
         | as much as possible, as well as creating an inside group lingo,
         | is key to onboarding and retaining people into it. But it's
         | fascinating how that percolates throughout the dev community
         | behind it as well.
        
           | DanAtC wrote:
           | I feel the same way about AI/LLM lingo
        
         | PedroBatista wrote:
         | While I agree, there is a whole generation of people who know
         | what a DHT is, it's not really that obscure.
         | 
         | I'm talking, of course, about very late '90s and '00s P2P file
         | sharing, Kademlia, torrents, but also later "eventually
         | consistent" databases (remember those?)
         | 
         | The crypto twenty-something hype beasts came way later.
        
         | nodja wrote:
         | A DHT is a decentralized key-value database; its most famous
         | use is in the BitTorrent protocols. It uses a routing
         | algorithm to guarantee that you can find the peers that can
         | retrieve the value of a known key, provided that you know at
         | least one peer in the network (even if that peer doesn't know
         | the value). Essentially, the network is split into buckets, and
         | the algorithm guarantees that you are either already connected
         | to a peer that knows the value for the key, or connected to a
         | peer that knows a peer whose bucket is closer to the key. You
         | can then recursively ask for peers that are closer and closer
         | until you find one that knows the value. As you do this search,
         | you keep track of the peers you meet, so the next time you ask
         | for another key you're more likely to know a peer that is
         | closer to it. A typical DHT implementation has you keep track
         | of hundreds of peers to guarantee the robustness of the network.
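
A minimal sketch of the "ask for closer and closer peers" lookup described above. It assumes a Kademlia-style XOR distance between integer IDs (the comment names no specific metric), and `Peer`, `distance` and `lookup` are hypothetical names for a toy in-memory model, not any real DHT library's API.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical in-memory peer: an integer ID, known peers, stored values."""
    peer_id: int
    known: list = field(default_factory=list)  # other Peer objects this peer knows
    store: dict = field(default_factory=dict)  # key -> value held by this peer


def distance(a: int, b: int) -> int:
    # XOR metric as used by Kademlia (an assumption, see the lead-in).
    return a ^ b


def lookup(start: Peer, key: int, max_hops: int = 32):
    """Walk towards `key`; return (value or None, IDs of the peers visited)."""
    current, visited = start, [start.peer_id]
    for _ in range(max_hops):
        if key in current.store:
            return current.store[key], visited
        # Ask the current peer for the peer it knows that is closest to the key.
        closer = min(current.known,
                     key=lambda p: distance(p.peer_id, key),
                     default=None)
        if closer is None or distance(closer.peer_id, key) >= distance(current.peer_id, key):
            return None, visited  # nobody closer is known: give up
        current = closer
        visited.append(current.peer_id)
    return None, visited


# Toy network: a knows b, b knows c, and only c stores the value.
a, b, c = Peer(0b0001), Peer(0b0100), Peer(0b0111)
a.known, b.known = [b], [c]
c.store[0b0110] = "hello"
value, path = lookup(a, 0b0110)
```

Starting from `a`, which does not hold the key, each hop lands on a peer whose ID is closer to the key until the value is found. A real client would also add the peers it meets to its own routing table, which this sketch leaves out.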
         | 
         | One issue is that peers go offline and online all the time, so
         | the network is ever changing. If you turn off your client for a
         | week and then come back, your only hope is that at least one of
         | the peers you know is still online. If that's the case, you're
         | fine; if not, or if you're starting the client for the first
         | time, there's no way for you to connect to the network and
         | query for keys. In BitTorrent this is not an issue, as most
         | torrents include trackers, the original centralized way of
         | finding peers on the network. But it seems that each project
         | listed on this page has its own separate DHT network that
         | doesn't connect to the main network (the one used by
         | BitTorrent), so to connect to one of these networks for the
         | first time you need to use a bootstrap peer. This is just a
         | normal peer on the network that is known to be always online,
         | usually hosted by the owner of the project, and it gives you a
         | starting point to find other peers in the network.
         | 
         | What this project does, in essence, is connect to a bootstrap
         | peer, then use the properties of the routing algorithm to
         | efficiently find out all the peers that are currently online.
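
A rough sketch of that crawl as a breadth-first traversal starting from the bootstrap peers. This is not Nebula's code: `crawl` and `get_neighbors` are hypothetical names, and the network is a plain dictionary standing in for real connections. It also assumes one call returns a peer's whole routing table, which is a simplification.

```python
from collections import deque


def crawl(bootstrap, get_neighbors):
    """Breadth-first crawl of a peer-to-peer network.

    `get_neighbors(peer)` is a hypothetical callable that returns the peers
    in that peer's routing table, or raises ConnectionError if the peer
    cannot be reached.
    """
    online, offline = set(), set()
    queue, seen = deque(bootstrap), set(bootstrap)
    while queue:
        peer = queue.popleft()
        try:
            neighbors = get_neighbors(peer)
        except ConnectionError:
            offline.add(peer)  # listed by someone, but not reachable now
            continue
        online.add(peer)
        for neighbor in neighbors:
            if neighbor not in seen:  # visit every peer at most once
                seen.add(neighbor)
                queue.append(neighbor)
    return online, offline


# Toy network: routing tables as a dict; "gone" is listed but offline.
network = {
    "boot": ["a", "b"],
    "a": ["b", "c", "gone"],
    "b": ["boot"],
    "c": [],
}


def toy_neighbors(peer):
    if peer not in network:
        raise ConnectionError(peer)
    return network[peer]


online, offline = crawl(["boot"], toy_neighbors)
```

The crawl ends once the queue is empty, that is, once every peer mentioned in any routing table has been tried. Peers that other nodes still list but that no longer answer end up in `offline`.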
        
       | dTal wrote:
       | Why is BitTorrent not supported? Perhaps I'm misunderstanding
       | this thing but it seems like application #1.
        
         | DanielVZ wrote:
         | Because this seems to cater to the cryptocurrency/blockchain
         | culture.
        
         | ogurechny wrote:
         | My guesses:
         | 
         | a) Many other tools exist for that.
         | 
         | b) BitTorrent DHT nodes are simple and interchangeable. They
         | can give you a list of peer addresses associated with a certain
         | (torrent) hash -- and only if you know the exact hash. Even
         | client versions can't generally be collected (apart from some
         | protocol extensions). The only thing you learn about a DHT
         | member is that it exists. In contrast, this project is for
         | heterogeneous networks in which peers announce various
         | services.
         | 
         | c) Number of Bittorrent DHT nodes is... bigger.
         | 
         | d) To collect interesting data from the BitTorrent DHT, one needs
         | to observe as many third-party torrent hash requests as
         | possible. To do that, multiple nodes are needed. Moreover, they
         | need to run for a long time, not just because it takes time to
         | make a lot of requests to a lot of nodes, but also because of
         | external preference for long-running nodes. Not sure how
         | important it is, but, anecdotally, a fresh DHT node sees twice
         | as many requests after a week as after a day.
        
         | jzm2k wrote:
         | Looks like Nebula uses go-libp2p, and all of the supported
         | networks listed in the README use libp2p for their p2p
         | networking. Mainline DHT doesn't support the same transport
         | protocols that libp2p supports (such as TCP+Yamux+Noise), which
         | is probably why Nebula doesn't support BitTorrent.
        
         | crotchfire wrote:
         | Because it isn't really network-agnostic.
         | 
         | It only supports IPFS and derivatives thereof.
        
       | ogurechny wrote:
       | /me remembers various DHT views, traffic flows, client stats,
       | graphs and other data decorations in Azureus. Now that's what I
       | call a dashboard.
        
       | crotchfire wrote:
       | It isn't really network-agnostic... in fact it doesn't support
       | the (by far) largest DHT out there, the Mainline DHT that
       | bittorrent uses.
       | 
       | This is just a crawler for DHTs that use IPFS's implementation,
       | or at least smell very similar to it.
        
       | pedalpete wrote:
       | Can someone explain why we want to crawl and/or monitor? What is
       | this used for?
       | 
       | When I think of a crawler, I think of a non-homogeneous network
       | (if that is the right term).
       | 
       | But with the blockchain, isn't it the case that each node has an
       | entire copy of the blockchain, so you don't need to "crawl" it,
       | and it works more like a database?
       | 
       | What am I not understanding about this?
        
       ___________________________________________________________________
       (page generated 2024-03-20 23:01 UTC)