[HN Gopher] Show HN: Every single torrent is on this website
___________________________________________________________________
Author : tdjsnelling
Score : 103 points
Date : 2025-09-29 16:14 UTC (6 hours ago)
(HTM) web link (infohash.lol)
| avidiax wrote:
| > Many crawlers and indexers continuously pick random or
| sequential infohashes and announce themselves so they can later
| detect other announcers
|
| I can't follow the logic here. How does this detect other
| announcers?
| aspenmayer wrote:
| The way I understand it, these extraneous infohashes are
| functional honeytokens.
|
| https://en.wikipedia.org/wiki/Honeytoken
|
| > In the field of computer security, honeytokens are honeypots
| that are not computer systems. Their value lies not in their
| use, but in their abuse.
| avidiax wrote:
| So they are basically detecting bots that indiscriminately
| try to download any detected infohash, right?
|
| That's not detecting "announcers", but maybe more like
| detecting "indexers".
| aspenmayer wrote:
| > That's not detecting "announcers", but maybe more like
| detecting "indexers".
|
| I think you're correct, as the secondary freebooting
| indexers are adding their tracker(s) after the fact of the
| private torrent's creation/origination to the original
| prefilled list of trackers, and inserting their tracker(s)
| to the reuploaded, usually public, torrent, and sometimes
| even removing the original private trackers so as to not
| phone home and tell on themselves.
|
| I'm happy to be corrected, but private trackers typically
| bind the downloading IP of the torrent to the announcing
| tracker to validate legitimate clients. Private trackers
| don't consider any extra trackers (announcers in this
| context) as valid or authorized. I have heard that modded
| BitTorrent clients can intentionally misreport upload stats
| to fudge the numbers for gaming your quota, as many private
| trackers/torrent sites enforce a minimum ratio of 1.0 or
| higher.
|
| I've heard of ways that folks with legitimate access to the
| private torrent tracker and torrents clone the IPs of other
| clients and then use a secondary torrent client to request
| blocks, bypassing the tracker entirely and not reporting
| any downloads (or uploads, for that matter), so the quota
| of the first legit client is not affected positively or
| negatively.
| tdjsnelling wrote:
| By announcing itself, the indexer makes itself more likely to
| be handed out as a peer to anyone else interested in that
| infohash. Every connection attempt it subsequently receives is
| evidence of another peer announcing or joining that torrent. In
| effect, it "baits" peers into revealing themselves
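
The baiting idea described above can be illustrated with a toy, in-memory model. This is only a sketch: the real DHT is a distributed Kademlia network queried over UDP, and the names `ToyDHT`, `BaitIndexer` and `join_swarm` are hypothetical, not part of any BitTorrent library.

```python
from collections import defaultdict


class ToyDHT:
    """In-memory stand-in for the DHT: infohash -> set of announced peers."""

    def __init__(self):
        self.peers = defaultdict(set)

    def announce(self, infohash, peer):
        self.peers[infohash].add(peer)

    def get_peers(self, infohash):
        return set(self.peers[infohash])


class BaitIndexer:
    """Announces itself for an infohash, then records whoever connects."""

    def __init__(self, address):
        self.address = address
        self.seen = defaultdict(set)  # infohash -> addresses that contacted us

    def bait(self, dht, infohash):
        dht.announce(infohash, self.address)

    def on_incoming_connection(self, infohash, remote):
        self.seen[infohash].add(remote)


def join_swarm(dht, infohash, me, indexers):
    """A regular client: look up peers, try to connect to each, then announce."""
    for peer in dht.get_peers(infohash):
        if peer in indexers:
            indexers[peer].on_incoming_connection(infohash, me)
    dht.announce(infohash, me)


dht = ToyDHT()
indexer = BaitIndexer("indexer:6881")
infohash = "aa" * 20  # arbitrary 160-bit value written as hex
indexer.bait(dht, infohash)
join_swarm(dht, infohash, "client-1:51413", {indexer.address: indexer})
join_swarm(dht, infohash, "client-2:51413", {indexer.address: indexer})
print(sorted(indexer.seen[infohash]))
```

The indexer never downloads anything here; it learns about the two clients only because they were handed its address and tried to connect.
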
| gwbas1c wrote:
| I think this would be an even better joke if the site was a setup
| for plausible deniability for piracy.
|
| "I didn't share that! It was on infohash.lol first!"
| bArray wrote:
| Does anybody know what they are using in the browser to perform
| DHT?
|
| In theory this could be used to share torrent links by a
| different reference (ideally you could also add an anchor too).
| Somebody else could have a page that takes keywords and points
| you to pages hosted on the site.
| tdjsnelling wrote:
| https://www.npmjs.com/package/bittorrent-dht is used on the
| server.
|
| DHT crawlers/indexers already exist to perform that function;
| they crawl and store infohashes (+ metadata when they receive
| it) and allow users to search that metadata to return relevant
| infohashes
| crumpled wrote:
| The page is making a WebSocket connection to the server and
| getting the peer info through the WebSocket connection. I think
| the magic happens on the server.
|
| This is a sample of the client-side code I found handling that:
| https://infohash.lol/_next/static/chunks/pages/p/%5Bpage%5D-...
| hackingonempty wrote:
| > There is no validation that an infohash corresponds to a real
| torrent--any client can announce anything. Many crawlers and
| indexers continuously pick random or sequential infohashes and
| announce themselves so they can later detect other announcers,
| and malicious clients or poorly written bots can spam the network
| with anything they like.
|
| There are also valid clients for completely unrelated protocols
| using the BitTorrent DHT to find each other.
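
For context on why any 160-bit value is a syntactically valid infohash: in BitTorrent v1 the infohash is the SHA-1 digest of the bencoded `info` dictionary, so the DHT only ever sees 20 opaque bytes. The sketch below uses a minimal hand-written bencoder and a made-up single-file torrent; the file name and contents are hypothetical.

```python
import hashlib


def bencode(value):
    """Encode ints, byte strings, lists and dicts in bencoding."""
    if isinstance(value, bool):
        raise TypeError("bencoding has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Keys must be byte strings and are written in sorted order.
        items = sorted(value.items())
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def infohash(info):
    """SHA-1 of the bencoded info dictionary, as 40 hex characters."""
    return hashlib.sha1(bencode(info)).hexdigest()


# Hypothetical single-file torrent with one piece.
info = {
    b"name": b"example.txt",
    b"length": 11,
    b"piece length": 16384,
    b"pieces": hashlib.sha1(b"hello world").digest(),
}
print(infohash(info))
```

Nothing in the resulting 40 hex characters reveals whether a real torrent sits behind them, which is why arbitrary values can be announced.
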
| sneak wrote:
| Which? I'm always fascinated by the use of public p2p nets to
| serve other protocols. The first complete standalone program I
| wrote was a gnutella p2p client.
| 1dom wrote:
| I have the same fascination. You might find
| https://github.com/dmotz/trystero quite interesting - it's
| fun to play around with, also can use torrent DHT for
| discovery.
| pluto_modadic wrote:
| https://github.com/pubky/pkarr is another one
| recursive wrote:
| I don't understand why so many people seem so fascinated by
| constructions like the library of Babel. Yes it contains the
| answers to all your questions, but there are some significant
| drawbacks.
|
| * It has more wrong information than right information, with no
| way to tell the difference.
|
| * If you had an oracle that could tell you how to get to the book
| you need, the navigation instructions to _get to_ the book will
| be at least as long as the book, on average.
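
The second bullet is a counting argument, and it can be checked numerically. The sketch below assumes an idealized library holding every possible n-bit book and hands out the shortest distinct binary strings as keys (not even requiring them to be prefix-free); `average_shortest_key_bits` is a hypothetical helper name.

```python
def average_shortest_key_bits(n):
    """Average length, in bits, of the best possible distinct binary keys
    for all 2**n books of n bits each.

    The best assignment hands out the shortest strings first: one empty
    string, two 1-bit strings, four 2-bit strings, and so on.
    """
    books = 2 ** n
    total_bits = 0
    assigned = 0
    length = 0
    while assigned < books:
        take = min(2 ** length, books - assigned)
        total_bits += take * length
        assigned += take
        length += 1
    return total_bits / books


for n in (8, 16, 32):
    print(n, round(average_shortest_key_bits(n), 3))
```

Even under these generous assumptions the average key comes out within about two bits of the book's own length, so the directions save essentially nothing over carrying the book.
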
| Llamamoe wrote:
| I wonder if there is some way to create a latent-space Library
| of Babel in which you only find incoherent gibberish with
| extremely long keys, with the shortest ones pointing
| specifically to the most common/likely strings of text, in
| manageable computational complexity.
| recursive wrote:
| Reproducing the text of a book in the library is a synonym
| for identifying the book. So this is really called "text
| compression", which is a well-studied field.
| samsartor wrote:
| In a library of all possible strings, this is just text
| compression (as the other comment observes). But in a finite
| library it gets even simpler, in a cool way! We can treat
| each text as a unique symbol and use an entropy encoding (eg
| Huffman) to assign a length-optimized key to each based on
| likelihood (eg from an LLM). Building the library is
| something like O(n log n), which isn't terrible. But adding
| new texts would change the IDs for existing texts (which is
| annoying). There might be a good way to reserve space for
| future entries probabilistically? Out of my depth at this
| point!
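
A minimal sketch of the finite-library scheme described above, using Huffman coding over whole texts. The likelihood values are made up for illustration (the comment suggests taking them from an LLM), and this dictionary-merging version favors brevity over the O(n log n) bound mentioned above.

```python
import heapq
from itertools import count


def huffman_keys(likelihoods):
    """Map each text to a prefix-free bit string; likelier texts get shorter keys."""
    if len(likelihoods) == 1:
        return {text: "0" for text in likelihoods}
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    heap = [(w, next(tiebreak), {text: ""}) for text, w in likelihoods.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)
        w1, _, right = heapq.heappop(heap)
        # Merge the two least likely subtrees, extending every key by one bit.
        merged = {t: "0" + k for t, k in left.items()}
        merged.update({t: "1" + k for t, k in right.items()})
        heapq.heappush(heap, (w0 + w1, next(tiebreak), merged))
    return heap[0][2]


# Hypothetical likelihoods, standing in for scores from a language model.
library = {"Hamlet": 0.5, "Moby-Dick": 0.25, "Ulysses": 0.15, "xq zv kkp": 0.10}
keys = huffman_keys(library)
for text, key in sorted(keys.items(), key=lambda kv: len(kv[1])):
    print(f"{key:>4}  {text}")
```

Adding a fifth text and rebuilding can change the existing keys, which is the instability the comment points out.
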
| lxgr wrote:
| That's arguably just a regular library :)
| cryzinger wrote:
| To your first bullet, I believe this is one of the central
| points of the original Borges story :)
| cantor_S_drug wrote:
| I think Library of Babel by Borges is a static manifestation
| of Turing complete behaviour via the fact that some L-systems
| are Turing complete. Or, put another way: where in the Library
| of Babel does the real Hamlet reside? If we consider finding
| and replacing names with other names, is it still a Hamlet?
| And if we bring the full force of edit operations and do
| these in a reversible manner, then where does the actual
| Hamlet reside? An equivalence class of Hamlet?
| a_shovel wrote:
| Another way of looking at it is that the library of Babel would
| be less useful than an equivalent quantity of blank paper. For
| example, you could use it to print books in English instead of
| gibberish. Multiple copies of those books, even.
| bonoboTP wrote:
| The Library of Babel made me aware that choosing/finding is not
| super distinct from making/creating. Or discovery and
| invention. In math, there is distinction between "there exists"
| and "we can construct", but "we can construct" is similar to
| "we can find".
| matheusmoreira wrote:
| I don't think they're equivalent. I think invention and
| creation aren't actually real. There is no "making" or
| "creating" when it comes to intellectual work.
|
| All computer files are sequences of bits. All sequences of
| bits are integers. All integers already exist in the infinite
| set of natural numbers. I can even calculate how big those
| numbers are given their bit count.
|     digits(bits) = ceil(bits * log10(2))
|
|     digits(32) = 10
|     digits(64) = 20
|     digits(128) = 39
|     digits(256) = 78
|     digits(512) = 155
|     digits(1024) = 309
|     digits(20 KiB) = 49,321
|     digits(2 GiB) = 5,171,655,946
|
| We are merely discovering numbers through convoluted mental
| and technological processes. All our mental exertions result
| in the discovery of a number. This comment is a number.
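
The digit counts in the comment above can be reproduced with a few lines. This sketch assumes KiB and GiB mean 1024-based units and that `bits` is at least 1; it uses floating-point `log10`, which is adequate at these sizes but is not exact arithmetic.

```python
import math


def digits(bits):
    """Decimal digits needed to write the largest unsigned integer of `bits` bits."""
    return math.ceil(bits * math.log10(2))


KIB = 1024 * 8        # bits in one KiB
GIB = 1024 ** 3 * 8   # bits in one GiB

for label, bits in [("32", 32), ("256", 256), ("20 KiB", 20 * KIB), ("2 GiB", 2 * GIB)]:
    print(f"digits({label}) = {digits(bits):,}")
```

For small sizes the result can be cross-checked exactly with `len(str(2**bits - 1))`.
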
| synctext wrote:
| How to find a nice SHA1 hash? How do you do keyword search in this
| list? Search and discovery of quality are unsolved
| scientific challenges. Fascinating stuff.
|
| At our university lab we've been working on this for 25
| years. Building a search engine is the easy part. Keeping a
| federated server with a billion users running is unsolved.
| Creating a fully -serverless- decentralised search engine
| is possible, but you also need a self-funding economy. Seems
| we're one of the few labs worldwide to still make actual
| operational prototypes of this stuff. More shameless self
| promotion:
|
| "SwarmSearch: Decentralized Search Engine with Self-Funding
| Economy" [0]
|
| Really handy to have a search engine to search this webpage
| with
| 45,671,926,166,590,716,193,865,151,022,383,844,364,247,891,968
| pages and the rest of the web (no spyware, no tracking).
|
| [0] https://arxiv.org/abs/2505.07452
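
The page count quoted above equals 2^155, which is what you get if all 2^160 possible infohashes are listed 32 to a page. The page size and the plain ascending order in this sketch are assumptions inferred from that number, not something the thread states, and `page_of` is a hypothetical helper.

```python
INFOHASH_BITS = 160   # a BitTorrent v1 infohash is a SHA-1 digest
PER_PAGE = 32         # assumption, inferred from the quoted page count

total_infohashes = 2 ** INFOHASH_BITS
total_pages = total_infohashes // PER_PAGE


def page_of(infohash_hex, per_page=PER_PAGE):
    """1-based page holding an infohash, assuming plain ascending order."""
    return int(infohash_hex, 16) // per_page + 1


print(f"{total_pages:,}")
print(page_of("00" * 20), page_of("ff" * 20) == total_pages)
```

Under those assumptions, "searching" for a known infohash is a single integer division rather than a lookup.
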
| lurk2 wrote:
| If you're interested in mass market adoption rather than
| just proving the theory, you will need to change the
| name. "LimeWire" is fun. "SwarmSearch" sounds like a
| biblical plague.
| jama211 wrote:
| I would say that that's a valid _model_ we can use to
| describe creation, much like how maths is a model we use to
| describe the universe. However, whether maths IS the
| universe or creation IS discovery are more of a
| philosophical question, possibly an unanswerable one, that
| people will have many varying opinions on.
|
| And that's without me asking you to define "real", which
| would be another rabbit hole.
| bonoboTP wrote:
| Yes, I mean exactly this type of insight. Basically taking
| a digital photo with a camera technically also just picks
| out the "address" of your current environment within the
| space of all images. Any 4K 2-hour-length feature film in a
| digital format is also just an address in the space of all
| possible videos. The director, the actors, the whole crew
| did all that work in order to select that point from the
| space of possibilities, they didn't "create" anything. That
| movie already existed.
|
| Of course this is silly, but interesting nonetheless. And
| we routinely speak about such high-dimensional spaces in
| research and engineering. Or we can imagine optimization as
| traversing a pre-existing search space. It may be
| structured as a graph or perhaps a Euclidean space. And in
| that space we can imagine a loss surface, that sits there
| in peace all along, with its global minimum somewhere. And
| instead of "constructing" a solution, we are simply hiking
| in this space and trying to spot that valley. But this is a
| bit fictional. We never physically "instantiate" this
| surface. It's an imagined abstraction. In reality we just
| have a vector and some rules as to how we change that
| vector. But we can imagine those changes to be movements in
| an imagined space.
|
| It's like the idea that the sculptor doesn't create the
| sculpture, the sculpture was there all along, he just had
| to remove the superfluous matter to reveal what was already
| there (i.e. the atoms belonging to the final sculpture).
|
| The most interesting thing is kind of on the border,
| between these absurdly large spaces and the more manageable
| ones that are feasible to enumerate.
|
| Another similar mindblow thing was when I forgot the
| password to a file that I encrypted. It's a fascinating
| thing that the bit pattern on the disk is functionally
| random now, and cracking it would take longer than the age
| of the universe. But if only I knew the password, it would
| only take just a second. There is a definite sequence of
| keystrokes I can execute to bring the universe in a state
| where the content will appear on my screen, it's so close,
| yet it's so-so far if you don't remember the password. Just
| a little difference in your brain state and it flips from
| trivial to hopeless.
|
| PS, if you like thinking about such things, I recommend
| _Meta-Math_ by Gregory Chaitin, it's very fun (providing
| an address VS constructing the thing is basically the gist
| of algorithmic information theory).
| matheusmoreira wrote:
| Yeah I agree with you.
|
| > It's like the idea that the sculptor doesn't create the
| sculpture, the sculpture was there all along, he just had
| to remove the superfluous matter to reveal what was
| already there (i.e. the atoms belonging to the final
| sculpture).
|
| I understand this argument but I have far more trouble
| applying this logic to real things. I'm not sure the same
| logic applies once the information is instantiated in the
| real world as a physical object. I haven't thought very
| deeply about it. I think the true sculpture exists only
| in the ideal world and the real world object is merely an
| approximation of it.
|
| > Of course this is silly
|
| It's an existential issue for me. At some point it became
| a political issue. I became a copyright abolitionist
| because of this insight. Copyright is logically reducible
| to monopolistic ownership of numbers. The sheer absurdity
| of it led me to reject the very idea of intellectual
| property as delusional nonsense.
| saghm wrote:
| I'm not sure the law has ever been concerned with logical
| reducibility. Context that can't easily be defined
| objectively has always been a part of legal systems, and
| arguably is a feature rather than a bug. Stuff like the
| "reasonable person" standard are intentionally flawed
| concepts that allow laws to exist without needing to
| define every possible permutation of human behavior up
| front. This obviously doesn't mean that you won't
| necessarily look at everything and decide to be an
| anarchist because of how convoluted it all is, but I
| don't think that being mathematically inconsistent is
| particularly unique to copyright in the legal system.
| bonoboTP wrote:
| Exactly, it's a common failure mode for math/programming-
| minded people when encountering the law. But the law is
| not like a compiler, mechanically following some fully-
| specified set of rules.
|
| The legal system is rather the spiritual successor of the
| original "system" where a wise Solomon-like elder would
| adjudicate the issue based on their best judgment and
| intuition and customs, ideally seeking peace and social
| satisfaction and future harmony. Codified law channels
| this into some more pre-shaped form, but the fuel of the
| legal system is still the human judgment and common sense
| at the core. Often the law basically just prompts and
| nudges the judgment of the jurors or judge to a certain
| direction, but it can't account for all corner cases. The
| nerd mind asks ok ok but what if X, where do you draw the
| sharp line between X and Y? It doesn't matter. If it
| comes up, a court will decide it based on all available
| common sense and the implicit values of the culture.
|
| In the cases where someone seemingly gets away with
| "rules-lawyering", then it's not purely their genius
| logic-brain that wins, but there is some kind of slanted
| playing field that's not really available to you. Of
| course, the line between "annoying rules-lawyering based
| on literal interpretation of technicalities that
| obviously nobody intended to be interpreted so" and
| something that was not anticipated initially but does fit
| within the rules is blurry. This decision itself is based on
| judgment and intuition. In life, sometimes coming up with
| a "technically works" thing is rewarded and lauded (math
| proofs, pathological counterexamples, cracking an
| encryption library via side-channel attacks), other times
| you get an eye-roll and that's obviously cheating and
| wasn't meant (e.g. courts of law and fun at parties).
| BobbyTables2 wrote:
| Reminds me of the DeCSS t-shirts from back in the day...
| skydhash wrote:
| I'm close to you on that opinion, but there's another
| factor: Life and its sustenance. There's a lot of
| mechanisms in the body to ensure that life continues,
| including pain and desire. But the fact is resources that
| sustain life are finite. There's a lot of proxies for the
| act of acquiring such resources and laws like copyright
| are the legal framework for these proxies.
|
| It's basically creating value out of nowhere in lieu of
| resources that are truly valuable, but inconvenient to
| trade directly. But then, like a metric that got
| corrupted (I forgot the name of the law for that), there
| are others who are trying to game the system (and
| succeeding) so that they can maximize their share.
| bonoboTP wrote:
| Copyright is not "ownership of numbers". "Intellectual
| property" is a misnomer. Copyright is an instrumental
| tool to achieve specific socially desirable things,
| namely the flourishing of scientific and artistic
| activity. It's a relatively modern creation, born of
| enlightenment-style principles in the 18th century. If it
| were still used according to that spirit, we'd have fewer
| problems.
| ghc wrote:
| I admit thinking this way is tempting, but in your model
| the number represents some kind of language, whether human-
| readable or machine-readable. If we accept the number is a
| non-lossy encoding of some language, we reach an
| equivalency stating there is no creating, just discovering
| language "through convoluted mental and technological
| processes". But can we really equate language and
| knowledge? I believe Godel proved that we cannot, in the
| sense that there is no "perfect" way to encode knowledge in
| a system of consistent axioms. Ergo, no matter how
| eloquently you describe your invention of "the wheel", it
| is by its nature incomplete and imperfect. Some part of the
| knowledge will always be tacit.
| bonoboTP wrote:
| > Some part of the knowledge will always be tacit
|
| See also
| https://en.wikipedia.org/wiki/What_the_Tortoise_Said_to_Achi...
| jimbo808 wrote:
| This conflates mathematical existence with actual
| instantiation. A 2gb integer might be definable, but until
| someone encodes a particular arrangement of bits and gives
| it context, it doesn't exist in any practical sense. We
| don't treat all future novels as "already written" just
| because their ASCII codes can be mapped to integers.
| matheusmoreira wrote:
| I said all novels already exist. That's different from
| claiming all novels have already been written.
|
| The claim is that humans are not "creators" but
| _generators_, very much in the random number generator
| sense. We are interesting number generators.
| AnthonyMouse wrote:
| > If you had an oracle that could tell you how to get to the
| book you need, the navigation instructions to get _to_ the book
| will be at least as long as the book, on average.
|
| This isn't quite true. Natural language text compresses
| extremely well and you would only need length equivalent to the
| compressed form, not the original form. And if you wanted to go
| further, you could use a mapping where extremely short strings
| map to known popular books and only unknown works have longer
| encodings.
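
The compression point above is easy to demonstrate with a general-purpose compressor. This is a rough sketch: `zlib` is far from the best possible text model, the sample sentence is just a conveniently repetitive public-domain passage, and random bytes stand in for a typical gibberish book.

```python
import os
import zlib

text = (
    "It was the best of times, it was the worst of times, it was the age of "
    "wisdom, it was the age of foolishness, it was the epoch of belief, it "
    "was the epoch of incredulity, it was the season of Light, it was the "
    "season of Darkness, it was the spring of hope, it was the winter of "
    "despair."
).encode("utf-8")
noise = os.urandom(len(text))  # stands in for a typical gibberish book

for label, data in [("prose", text), ("noise", noise)]:
    packed = zlib.compress(data, 9)
    print(f"{label}: {len(data)} bytes -> {len(packed)} bytes")
```

The prose shrinks while the noise does not, so short addresses are only available for the small fraction of books that have structure to exploit.
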
| recursive wrote:
| I suppose this would work if the library was arranged such
| that comprehensible books were closer to the "origin". The
| workings of the "real" library of babel are supposed to be
| more inscrutable though.
|
| But if _I_ built one, it would totally work that way.
| variadix wrote:
| Kolmogorov's library
| 0cf8612b2e1e wrote:
| I am reminded of this SMBC comic
|
| https://www.smbc-comics.com/comic/the-library-of-heaven
| megablast wrote:
| Thank you captain obvious.
| recursive wrote:
| At your service.
| Chinjut wrote:
| Everyone is aware of this. Sites like this aren't created to be
| useful. They are created to be an amusement, a joke.
| wongarsu wrote:
| For a more practical version (containing only infohashes that are
| observed on the dht) there is bitmagnet [1]. No public instances
| though, you have to self-host
|
| 1: https://github.com/bitmagnet-io/bitmagnet
| skoll43 wrote:
| how to go straight to jail 101
| wongarsu wrote:
| You are only downloading metadata, and csam content is
| filtered. But yes, I would also rate it as a legally risky
| activity
| IlikeKitties wrote:
| > csam content is filtered
|
| Filtered how? By some keywords I don't want to know? What
| about encrypted zips of CSAM? There's no way to filter that
| in reality.
|
| If you want to learn more about why, and you either speak
| German or can handle YouTube's auto-translate, I recommend
| this documentary on the matter[0]. The pedo criminals are
| using scene methods to share their illegal content.
|
| [0] https://www.youtube.com/watch?v=Ndk0nfppc_k
| wongarsu wrote:
| Yes, a simple keyword list in the classifier, matched on
| the torrent name and file names. Easy enough to find in
| the source if you look for it. That filter won't help
| against people uploading CSAM as documents.7z. But any
| filter that would want to do something against that would
| require downloading the content, which would be even more
| illegal (in addition to being wildly impractical)
| knowaveragejoe wrote:
| Would it matter if it's metadata-only until you download?
| jasonfarnon wrote:
| why not just exclude encrypted zips?
| wongarsu wrote:
| bitmagnet only has the info you get by looking up the
| infohash in the dht, which is basically the same info
| that's stored in a .torrent file: a name, a list of files
| with offsets and paths, and a bunch of block hashes.
| That's not a lot to go on, and e.g. doesn't tell you if
| the zip is encrypted
|
| I guess you could filter all torrents that include just
| zips/rars/7zips. That would exclude a lot of harmless
| content. Probably too much harmless content to make it a
| default, but if you only care about hollywood releases it
| would be a useful filter
|
| If there was a public list of hashes of (8/18KiB blocks
| of) CSAM content that would be useful for a filter, but I
| don't think such a thing exists
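
The archive-only filter suggested above needs nothing beyond the file list that comes with the torrent metadata. The sketch below is a standalone illustration, not bitmagnet's classifier or API; `is_archive_only` and the sample file lists are hypothetical.

```python
from pathlib import PurePosixPath

ARCHIVE_SUFFIXES = {".zip", ".rar", ".7z"}


def is_archive_only(file_paths):
    """True when every file in the torrent's file list is a zip/rar/7z archive."""
    suffixes = [PurePosixPath(p).suffix.lower() for p in file_paths]
    return bool(suffixes) and all(s in ARCHIVE_SUFFIXES for s in suffixes)


# Hypothetical file lists, as they might appear in crawled metadata.
print(is_archive_only(["documents.7z"]))
print(is_archive_only(["Movie/movie.mkv", "Movie/subs.zip"]))
```

As the comment notes, a rule like this would also drop plenty of harmless archive-only torrents, so it is better suited as an opt-in filter than a default.
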
| IlikeKitties wrote:
| > If there was a public list of hashes of (8/18KiB blocks
| of) CSAM content that would be useful for a filter, but I
| don't think such a thing exists
|
| But wouldn't that just be a list of CSAM to look up?
| sorenjan wrote:
| Does running an indexer and crawler help make the content
| available to others, or why would this be legally risky? Why
| would anyone care about what kind of Docker container I run
| on my home server?
| throwaway894345 wrote:
| Is this legal? I'm of the impression that publishing infohashes
| to copyrighted content is illegal under DMCA?
| pessimizer wrote:
| The site doesn't publish any, except the two legal torrents
| that are on the front page. Any others you have to either
| request specifically, or are simply randomly generated.
| freetonik wrote:
| Assuming the web server does not actually store and serve pages
| in a conventional sense, but rather acts like an application
| that can render the results of parsing and processing user's
| input, I wonder what are legal implications.
|
| I can generate a Google link with an infohash in the same
| fashion:
| https://www.google.com/search?q=1548262051907755713575797913...
| reorder9695 wrote:
| I wonder how hosting a torrent is different to google showing a
| link to a pirated movie, both are just holding data that tells
| you where to find the content, not the content itself
| akimbostrawman wrote:
| neither "hosts" the content. they both just point to the
| destination with the content.
| throwaway894345 wrote:
| I think Google is expected to abide by DMCA takedowns in such
| cases, but IANAL. My understanding is that even an indirect
| reference (such as a link or infohash) is a DMCA violation.
| weberer wrote:
| That was The Pirate Bay's defense and... they're still
| around.
| lxgr wrote:
| It's probably as illegal as any other random number generator.
| akimbostrawman wrote:
| it is. same as with URLs the infringement is the actual
| copyrighted content not the pointing to it.
| ratelimitsteve wrote:
| the infohash isn't copyrighted, so it's not illegal information
| in and of itself. serving the infohash isn't serving the
| torrent, and serving the torrent is also not serving
| copyrighted material. I believe that downloading is still
| illegal absent a fair use exemption but it's rarely prosecuted
| because you have to prove the absence of the exemption. It's
| uploading copyrighted content that's actually illegal and also
| easy to prosecute, so it's seeders that usually get bopped.
| freetonik wrote:
| Love this idea of generating pages based on some strictly defined
| enumeration. Reminds me of https://everyuuid.com/
| tdjsnelling wrote:
| Me too. That's listed as an inspiration on the index page!
| zikduruqe wrote:
| Or every bitcoin public and private address.
|
| https://keys.lol
| mikepurvis wrote:
| I wonder how many times on average you'd need to click the
| "random" button in order to stumble on a page that contains a
| real torrent.
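
A back-of-the-envelope answer to the question above, under loudly stated assumptions: the "random" button picks pages uniformly, each page lists 32 infohashes, real infohashes are spread uniformly over the 160-bit space, and the figure of one billion live torrents is purely hypothetical.

```python
from fractions import Fraction

SPACE = 2 ** 160   # all possible 160-bit infohashes
PER_PAGE = 32      # assumption: infohashes listed per page


def expected_clicks(real_torrents, per_page=PER_PAGE, space=SPACE):
    """Mean number of uniformly random pages opened before one holds a real torrent.

    Uses p ~= per_page * real_torrents / space, which is accurate when real
    torrents are vanishingly rare; the mean of a geometric distribution is 1/p.
    """
    p = Fraction(per_page * real_torrents, space)
    return 1 / p


clicks = expected_clicks(10 ** 9)  # hypothetical: one billion live torrents
print(f"about {float(clicks):.1e} clicks")
```

Changing the torrent count by a few orders of magnitude barely dents the result, since the space itself has roughly 48 decimal digits.
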
| lxe wrote:
| So there is almost zero chance that opening up a particular page
| is going to land on an actual torrent.
| ratelimitsteve wrote:
| shades of my younger days on kazaa, excitedly downloading a file
| called 'hacking-tool-every-possible-ip-address.txt'
| mk12345 wrote:
| Very cool, reminds me of the library of Babel (of which you also
| made a version! [1]).
|
| I made something similar a while ago, the Hdd of Babel [2], which
| contains all possible files(*) , and wrote down some thoughts on
| it [3].
|
| I really like how it makes us think about the nature of
| information.
|
| [1] https://libraryofbabel.app/
|
| [2] https://mkaandorp.github.io/hdd-of-babel/
|
| [3] https://dev.to/mkaandorp/this-website-contains-pictures-of-y...
___________________________________________________________________
(page generated 2025-09-29 23:01 UTC)