[HN Gopher] Textfiles
       ___________________________________________________________________
        
       Textfiles
        
       Author : the-mitr
       Score  : 120 points
       Date   : 2024-01-20 13:45 UTC (1 day ago)
        
 (HTM) web link (www.textfiles.com)
 (TXT) w3m dump (www.textfiles.com)
        
       | ChrisMarshallNY wrote:
       | Great site. Jason Scott is a pretty influential chap.
       | 
       | My fave is the Goatse Files:
       | https://news.ycombinator.com/item?id=22997724
        
         | kleiba wrote:
         | He also made some great documentaries, as I'm sure a lot of you
         | guys will know, amongst which:
         | - Get Lamp: https://en.wikipedia.org/wiki/Get_Lamp
         | - BBS: The Documentary https://en.wikipedia.org/wiki/BBS:_The_Documentary
        
           | ghaff wrote:
           | They're both great. Get Lamp also has an Infocom-specific cut
           | and BBS really captures a lot of interviews with often pre-
           | Web creators and other information that would otherwise be
           | pretty obscure at this point. BBSs were sort of a niche at
           | the time and, while there are a couple memoir-type things
           | (and the textfiles archive of course) there aren't a lot of
           | authoritative sources for the history especially if you don't
           | have a lot of context to make sense of what is out there.
        
         | shpx wrote:
         | This guy is just a bad person. It's a classic case of someone
         | going so deep into understanding something that it detaches
         | them from reality and their humanity. You understand the
         | general structure of the Web (hotlinking) and how to "mv" which
         | took you months or years of concentrated study to learn and set
         | up, and you forget that under normal circumstances you wouldn't
         | give in to your sadistic urge to show children a picture of a
         | man's asshole.
         | 
         | And then all the other psychos, like you, come to the retelling
         | of the story to fantasize about doing it too.
        
           | ChrisMarshallNY wrote:
           | _> And then all the other psychos, like you, come to the
           | retelling of the story to fantasize about doing it too._
           | 
           | Well, isn't that special?
           | 
           | I guess you were one of the folks that got goatsed. Not my
           | fault.
        
       | inciampati wrote:
       | These files remind me of my personal nightmare of unawareness
       | learning about the world through the early web. It was awash in
       | these, many of which I read in many forms, including text and web
       | conversions. It was so hard to know what was real and imagined.
       | You could say that that's still true, but I'd argue we have more
       | power than ever to know. The information content of our
       | communication systems is just fantastically higher. From 8 bits
       | to the moon.
        
         | jszymborski wrote:
         | Kinda makes me think of applications of Deep Learning research
         | atm.
        
       | notRobot wrote:
       | Ahaha, I discovered this website as a teen and actually learnt a
       | lot about sex and sexuality from text files on it. Fun stuff!
        
         | 6c696e7578 wrote:
         | There are certainly worse places on the internet
        
       | keepamovin wrote:
       | Just want to note that Google Chrome headless recently landed
       | support for correctly rendering text files. Prior to this, when
       | in headless the file was not rendered, and IIRC was simply
       | downloaded, unlike what happens in regular headful chrome.
       | 
       | As a workaround before this fix, I created a secure document
       | viewer that used pandoc and latex to template layout the text
       | file in correct form. We intercepted the navigation request /
       | download request to a text file and passed it to the secure
       | viewer to be rendered using latex and pandoc -- most complex
       | pipeline ever for displaying ASCII!! hahaha :)
       | 
       | Interested readers can find the latex template here^0 and the
       | overall project this was part of, here^1 :) haha :)
       | 
       | 0:
       | https://github.com/BrowserBox/BrowserBox/blob/e437155fb582cc...
       | 
       | 1: https://github.com/BrowserBox/BrowserBox
        
       | bicx wrote:
       | Hah, found this for the content of armstech.txt [Jane's Fighting
       | Ships 1990-1991 (Specifications of Warships)]:
       | 
       | ```
       | 
       | From: Janes Copyright <copyright@janes.com>
       | 
       | To: "jason@textfiles.com" <jason@textfiles.com>
       | 
       | Sender: "Ward, David" <David.Ward@janes.com>
       | 
       | Date: Wed, 21 Jul 2010 03:38:46 -0600
       | 
       | Subject: Unauthorised hosting of Jane's Fighting Ships data
       | 
       | Dear Mr Scott,
       | 
       | I bring to your attention that your website (www.textfiles.com)
       | is hosting information that is the copyright of IHS Global
       | Limited.
       | 
       | Whilst I understand that the textfiles.com site is not hosting
       | this data for any monetary gain you will, I am sure, understand
       | that copyright exists over this data and that IHS Global Limited
       | has a strong interest in ensuring that the data is available only
       | through its own channels and through its own brands.
       | 
       | The data in question is from the 1990-1991 edition of Jane's
       | Fighting Ships and can be found in the text file 'armstech.txt',
       | which is located at the following location on the textfiles.com
       | site, http://www.textfiles.com/fun/armstech.txt.
       | 
       | As stated earlier, the data held within this file is the
       | copyright of IHS Global Limited, owner of the Jane's Fighting
       | Ships publication and brand, and is available only to its
       | subscribers; by hosting this data and making it free to download
       | you are in breach of international copyright laws.
       | 
       | I therefore ask you, as proprietor of textfiles.com, to remove
       | this data from the www.textfiles.com site and any associated
       | mirror sites within the next 7 days from the date of this email
       | and confirm your action by reply. Failure to take action may
       | result in this matter being placed in the hands of the IHS Global
       | Limited legal team and further action being taken against you.
       | 
       | Yours sincerely, David Ward
       | 
       | David Ward
       | 
       | Head of Production Operations
       | 
       | IHS Jane's
       | 
       | IHS Global Limited, Sentinel House, 163 Brighton Road, Coulsdon,
       | Surrey CR5 2YH, United Kingdom
       | 
       | Phone: +44 (0)20 8700 3874
       | 
       | Email: david.ward@ihsjanes.com
       | 
       | Web: www.janes.com and www.ihs.com
       | 
       | ```
        
         | xenophonf wrote:
         | I thought data itself couldn't be copyrighted. I wonder what
         | the original file contained.
        
           | ghaff wrote:
           | It probably depends what was in the file other than just
           | specifications. In any case, probably not worth fighting
           | about.
        
           | 0xEF wrote:
           | Many think otherwise and will pay a lot of money to make that
           | true.
        
         | jsploit wrote:
         | Original content:
         | https://web.archive.org/web/20090502094347/http://www.textfi...
        
       | arch-choot wrote:
       | Related: https://news.ycombinator.com/item?id=22995008
        
       | Kwpolska wrote:
       | > TEXTFILES.COM has been online for nearly 25 years with no ads
       | or clickthroughs. If you feel like donating to its roughly $1200
       | yearly upkeep: Paypal or Venmo.
       | 
       | Performance-wise, this doesn't seem like a $100/month site,
       | considering it's only simple HTML and text files.
        
         | TehCorwiz wrote:
         | It's probably outgoing bandwidth costs.
        
           | mmcdermott wrote:
           | It probably gets mirrored a lot too, given the target
           | audience.
        
             | ghaff wrote:
             | As I recall, Jason asks people not to do wholesale site
             | copies in general, but I imagine a great number do.
        
               | 0xEF wrote:
               | You imagine correctly. I cruise open directories
               | regularly, usually looking for books to add to my growing
               | collection (note: books I will actually read/use),
               | generally taking only 2 or 3 books with me when I leave
               | the site. At least once a month, I head to an OD I like
               | to grab a new book or two only to find the host shut the
               | gates because it got posted on r/opendirectories and a
               | bunch of people did pointless site rips, as though
               | they'll read 40k pdf files or whatever. I often wonder if
               | they do it because data caches like that are small
               | currency on other parts of the Internet, sort of the way
               | you had to maintain an ul/dl ratio to stay on a well-
               | maintained BBS back in the day. Who knows.
               | 
               | They're certainly not rehosting the libraries in any
               | useful way, I can say that much.
        
               | ghaff wrote:
               | I think a lot of it is just that many people have a
               | pretty deeply ingrained data hoarding impulse where
               | collecting files--any files--for themselves is the end
               | goal.
        
               | crazygringo wrote:
               | I don't know if it's always hoarding. A lot of stuff on
               | the internet simply disappears. Especially when it's
               | hosted by a single person, or an org that might not be
               | around in 3 years, or they get takedowns (whether legal
               | or not).
               | 
               | If you find something you'd like to keep accessing,
               | downloading it all in advance can be a smart move.
        
               | ghaff wrote:
               | So download specific items as the parent says (and as you
               | suggest). But reflexive "download it all just in case"
               | probably isn't helpful especially if it's just for you.
               | 
               | I'm not disagreeing with downloading stuff that you want,
               | even maybe site sub-sections. But a lot of the time it
               | turns into just doing a mass download.
        
           | marginalia_nu wrote:
           | Given it's mostly text, it's curious there's no content-
           | encoding applied to HTTP responses. Would reduce bandwidth by
           | something like 70-90% in most cases.
           | 
           | Though it's hard to say what's the best configuration without
           | understanding the hardware context.
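
A rough way to sanity-check the savings estimate above is to gzip some text and compare sizes. This is only a sketch: the sample text is made up and far more repetitive than real files, the helper name `compression_savings` is hypothetical, and actual savings depend on the content and on how the server is configured.

```python
import gzip


def compression_savings(text: str, level: int = 6) -> float:
    """Return the fraction of bytes saved by gzip-compressing `text`."""
    raw = text.encode("utf-8")
    if not raw:
        return 0.0  # nothing to compress, so nothing saved
    compressed = gzip.compress(raw, compresslevel=level)
    return 1 - len(compressed) / len(raw)


# Hypothetical sample: synthetic, highly repetitive lines of plain text.
# Real text files will usually compress less well than this.
sample = "\n".join(
    f"{i:04d}  The quick brown fox jumps over the lazy dog near node {i % 7}."
    for i in range(2000)
)

savings = compression_savings(sample)
print(f"gzip saved {savings:.0%} of {len(sample.encode('utf-8'))} bytes")
```

Running the same function over a handful of real files from the archive would give a better idea of where in the quoted range the site would land.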
        
             | bombcar wrote:
             | Above it's noted as being north of 300gb, most cheap cloud
             | providers will give you quite a bit of bandwidth for a
             | 300gb server.
        
               | marginalia_nu wrote:
               | Might be a point to self-host something like this. The
               | value of the project is its extreme longevity, and the
               | half-life of external hosting services is probably
               | something like 5 years.
        
         | surteen wrote:
         | That site hosts a lot of historical material that many "budget"
         | providers would probably balk at and kick you off their network
         | without much recourse. Having a good relationship with a
         | provider that knows you and will go to bat for you can cost
         | some money. That said, $100/month is not terribly expensive for
         | dedicated hosting.
        
           | Solvency wrote:
           | Couldn't you just host all of these files on Github under the
           | pretense of readme files? Would they throw a fit at this?
        
         | LukeShu wrote:
         | $ rsync --dry-run -a --stats
         | rsync://rsync.textfiles.com/textfiles ./textfiles.com
         | ...
         | Number of files: 866,935 (reg: 826,831, dir: 40,083, link: 21)
         | ...
         | Total file size: 319,889,987,473 bytes
         | ...
        
           | crazygringo wrote:
           | Ah ha, that explains it. Thanks.
           | 
           | Even on Digital Ocean, that would be $50 for the 500 GB of
           | block storage, another $20 for keeping a backup, and ~$6 for
           | the actual droplet for hosting. And then some extra bandwidth
           | costs, if people try to bulk-download. Adds up.
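
The arithmetic in the comment above can be written out as a short sketch. The dollar figures are the ones quoted in the thread (the droplet price is given only as approximate), not values from a current price list, and the byte count is the "Total file size" from the rsync output.

```python
# Monthly figures as quoted in the thread (USD); assumptions, not a price list.
BLOCK_STORAGE = 50  # 500 GB of block storage
BACKUP = 20         # keeping a backup
DROPLET = 6         # approximate price of the droplet itself

# "Total file size" reported by rsync --stats in the thread.
TOTAL_BYTES = 319_889_987_473

monthly = BLOCK_STORAGE + BACKUP + DROPLET
yearly = monthly * 12

size_gb = TOTAL_BYTES / 10**9   # decimal gigabytes
size_gib = TOTAL_BYTES / 2**30  # binary gibibytes

print(f"archive size: {size_gb:.1f} GB ({size_gib:.1f} GiB)")
print(f"estimated hosting: ${monthly}/month, ${yearly}/year before bandwidth")
```

On these numbers the estimate comes to $76 a month, or $912 a year, which leaves a few hundred dollars of the roughly $1200 yearly upkeep for bandwidth and anything else.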
        
       | Solvency wrote:
       | 38 year old. I used to go on here as a teenager and even then I
       | felt like Gandalf pouring over a mysterious, powerful, and
       | ancient text resource. I was enamored by the "hacking McDonalds"
       | file and wished it weren't obsolete even at the time of reading.
        
         | blooalien wrote:
         | https://www.grammarly.com/blog/pore-over-pour-over/
        
       | a_gnostic wrote:
       | Search for Michio Kaku
        
       | washadjeffmad wrote:
       | Between bash.org and textfiles, a big chunk of my youth was
       | archived through our logs and zines. Because of Jason, I'd hung
       | onto everything - logs, pictures from meetups, web designs,
       | source, and plenty of dumps, emails, news stories, and recordings
       | that covered a bit of the history behind some events around the
       | turn of the millennium with the intent of turning them over one
       | day, but lost pretty much everything in an unfortunate beer
       | accident in my 20s.
       | 
       | It didn't feel like counterculture, just that everyone else was
       | wrong, you know?
        
       | joshcsimmons wrote:
       | This is beautiful - thank you for posting this. I was jettisoned
       | back to my past immediately. I instantly recognize familiar
       | filenames browsing the site.
       | 
       | Seriously this made my day. I had completely forgotten about this
       | (it was a big part of my life...)
        
       | botto wrote:
       | I really like this talk Jason Scott did at defcon about being
       | sued for 2 billion dollars
       | https://www.youtube.com/watch?v=74g7wSTYUso
        
       ___________________________________________________________________
       (page generated 2024-01-21 23:01 UTC)