[HN Gopher] Link rot and content drift are endemic to the web
       ___________________________________________________________________
        
       Link rot and content drift are endemic to the web
        
       Author : timmytokyo
       Score  : 199 points
       Date   : 2021-06-30 13:15 UTC (9 hours ago)
        
 (HTM) web link (www.theatlantic.com)
 (TXT) w3m dump (www.theatlantic.com)
        
       | developeron29 wrote:
       | "The benefits of the internet far outweigh its shortcomings"
       | 
       | Stop being pessimistic
        
       | kylestlb wrote:
       | It's an Atlantic article so it's long, and several of the
       | comments here show that people aren't actually reading the whole
       | thing... but I did and it's worth the time. It's not only about
       | links being dead, it's about the lack of transparency & audit
       | when content is changed via takedown requests, it's about dead
       | links showing up in decades-old supreme court decisions, it's
       | about private industry's lack of incentive for wanting to improve
       | any of these issues, etc.
        
         | ergot_vacation wrote:
         | Having read the whole thing, it seems partially good, partially
         | misguided, and partially terrible.
         | 
         | The overall bent is a hand-wringing about link rot, which I
         | thought we mostly got over a decade ago. The Internet is
         | fundamentally ephemeral. If you see something you like, save it
         | so you can repost it later. If you rely on someone else to keep
         | it up indefinitely, you're being foolish.
         | 
         | Around the edges of that main discussion, the Atlantic also
         | touches on censorship in all the wrong ways, re-iterating the
         | too-common view that censorship is good as long as the good
         | guys do it. They at least argue that this censorship should be
         | transparent and censored works still accessible in some way,
         | but they seem to not understand the nature of what they're
         | talking about.
         | 
         | Censored works aren't censored to protect the public. They're
         | censored to protect the rich and powerful. That's why "right to
         | be forgotten" really exists. That's why Google and Youtube and
         | Twitter and Facebook quash anything that goes against the
         | accepted narrative in any given field. They aren't protecting
         | the public from dangerous misinformation. No one gives a shit
         | about the public. They're protecting the financial and
         | political interests of some very powerful people.
         | 
         | Given this, talk of a "poison cabinet" only illustrates
         | ignorance of the issue. The Memory Hole cannot be divorced from
         | the censorship process, it's a core part of it. If people can
         | still find the information in some form, it's not censored
         | enough to make the people it threatened happy.
         | 
         | And this leads to the final point, which is that the real
         | reason the web is "rotting" isn't link rot, it's censorship on
         | the part of tech monopolies, due to their joined-at-the-hip
         | relationship with every large corporation and industry
         | imaginable due to advertising and other deals. The fact that
         | links die doesn't matter much: you can just repost the
         | material. The fact that links ARE ACTIVELY KILLED to suppress
         | their information is a much more serious problem, and one that
         | doesn't have an easy solution besides full breakup of the tech
         | monopolies.
        
           | pfraze wrote:
           | I'm not 100% convinced by your assertion that censorship only
           | favors the rich and powerful. It can and often does, but it
           | can also help people without power or society at large. For
           | instance, taking down a dox for a niche YouTuber is clearly
           | not helping a powerful person, but it's still arguably
           | censorship.
           | 
           | The misinformation area is somewhat stickier, but here's a
           | decent example: if somebody decided to hurt you by spreading
           | rumors (let's say that you watch CP) and spends time and
           | money to get that rumor top of any SEO and forum thread,
           | what's the right course of action? How good are you going to
           | feel about using speech to counteract that when the result is
           | a Google search giving your denial in spot 1 and the
           | accusation in spot 2?
           | 
           | We all have to grapple with power and the ability to abuse
           | it, but I don't think it's effective to say power is
           | fundamentally wrong. The conversation is more nuanced than
           | that and has to be viewed as systems with checks on power,
           | which means specific design-thinking.
        
             | throwawayboise wrote:
             | > if somebody decided to hurt you by spreading rumors
             | 
             | We have libel laws to address that.
             | 
             | GP's point is that Google, Facebook, et al. are preemptively
             | censoring non-mainstream content just to protect
             | themselves. They don't really care about the public.
        
               | toiletfuneral wrote:
               | You're definitely right, but I was under the impression
               | they exist to generate profit for shareholders, not act
               | as a social service. Capitalism is a system for
               | maximizing value extraction, not altruistic endeavors.
        
               | BatFastard wrote:
               | Have you ever had to deal with the court system? It is
               | VERY slow, expensive as hell, and utterly frustrating.
        
               | NoboruWataya wrote:
               | > just to protect themselves
               | 
               | To protect themselves _from the public_. Whether it's
               | because consumers might take their business (and their
               | data) elsewhere in disgust at what a particular platform
               | is turning into, or because democratically elected
               | lawmakers could start imposing sanctions or new
               | regulations.
               | 
               | Companies are always looking out for themselves, that's a
               | given. But that doesn't mean their actions are completely
               | divorced from public opinion.
        
         | [deleted]
        
         | ballenf wrote:
         | Maybe the reason it feels impossible to stem the tide of link
         | rot is that it takes tremendous energy to constantly increase
         | the entropy of a system. And that energy has opportunity cost
         | that no one really wants to talk about.
         | 
         | The article has the unstated assumption that eternal
         | preservation of all writing ever is a net benefit. I think it's
         | worth having a discussion on that point.
        
           | kylestlb wrote:
           | I disagree that the unstated assumption is eternal
           | preservation of all writing. IMO the author is clearly
           | focusing on official or semi-official published information,
           | not necessarily what you and I write on places like HN.
        
             | bentcorner wrote:
             | And there's the irony that's pointed out in the article as
             | well - official documentation on government websites may
             | not last past an elected official's term, yet blithe
             | comments on a social media site that are later regretted
             | may last forever.
        
           | Sr_developer wrote:
           | > Maybe the reason it feels impossible to stem the tide of
           | link rot is that it takes tremendous energy to constantly
           | increase the entropy of a system.
           | 
           | I suppose you meant "decrease"
        
       | makomk wrote:
       | It's amazing how little some trusted institutions care about
       | this. For example, the BBC has been bragging about how many
       | people rely on their coverage of the pandemic, but have an
       | obnoxious habit of repeatedly overwriting old articles with new
       | ones on similar topics and not keeping the old versions
       | available. The history of a once-in-a-century pandemic with huge
       | local and global impacts is literally being overwritten day by
       | day.
       | 
       | Sometimes this helps them whitewash their screw-ups which have
       | led to widespread false beliefs. For example, after the UK
       | government targeted and hit 100,000 Covid-19 tests in a day, the
       | BBC ran an article falsely claiming Germany had achieved this a
       | month earlier and linked it prominently on their news front page
       | for about a month. A large proportion of the population probably
       | saw this and now falsely believe it, it got brought up all the
       | time as part of the narrative that the government's big "world-
       | leading" achievements were just playing catch up badly, but it
       | was memory-holed from the article in a rewrite and they used that
       | as an excuse for not publishing any correction - so unless
       | historians dig deep in third-party archives, they'd never
       | understand where that belief came from. (Apparently a previous
       | version of the article also wrongly claimed France was carrying
       | out more Covid-19 tests due to mistaking their weekly numbers for
       | daily ones, according to a correction which disappeared from the
       | article after a few days and only exists in the Internet Archive
       | now. I haven't been able to find the original version of that
       | claim.)
        
         | tda wrote:
         | On the bright side, using a tool like the Internet Archive it
         | should be easy to filter out which articles were removed and/or
         | edited by the BBC, in a way highlighting the most historically
         | important articles.
        
           | deadalus wrote:
           | Wayback Machine censors many websites (like 4chan) from being
           | 'saved'. Wayback Machine also removes previously archived
           | videos/websites in certain cases. They are not neutral.
        
             | thrashh wrote:
             | Don't spread misinformation. The Wayback Machine is not
             | censoring 4chan. 4chan is 'censoring' the Wayback Machine.
             | 
             | https://www.4chan.org/robots.txt
             | 
             | Also, I let a domain of mine expire and the new domain
             | owner (which just plastered ads) had a robots.txt that
             | retroactively removed my "previously archived website" from
             | the Wayback Machine.
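
             The robots.txt mechanism described above can be sketched with
             Python's standard library. This is only an illustration: the
             rules below are hypothetical and are not the contents of any
             real site's robots.txt, and the crawler name `ia_archiver` is
             an assumed user-agent string chosen for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, NOT the actual file served by any real site.
EXAMPLE_ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.org/board/thread/1"
    for agent in ("ia_archiver", "SomeOtherBot"):
        print(agent, is_allowed(EXAMPLE_ROBOTS_TXT, agent, page))
```

             Whether a crawler obeys such rules is the crawler operator's
             decision; the file itself only expresses the site owner's
             request.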
        
               | a1369209993 wrote:
               | > Don't spread misinformation.
               | 
               | No you.
               | 
               | > The Wayback Machine is not censoring 4chan. 4chan is
               | 'censoring' the Wayback Machine.
               | 
               | 4chan is not the operator of the server at
               | web.archive.org. Compliance with robots.txt restrictions
               | is 100% the Internet Archive's fault.
        
             | Zababa wrote:
             | Fortunately 4chan has (unofficial) archives, but some
             | content was probably lost.
        
           | throwawayboise wrote:
           | I mean yes, if you are extremely motivated. But the wayback
           | machine is pretty klunky and slow, honestly. And there's no
           | good "diff" view that summarizes the changes to a URL over
           | time AFAIK.
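
             For anyone who already has two saved copies of a page, a rough
             "diff" view of the kind wished for above can be approximated
             locally. The sketch below assumes the two snapshots have
             already been fetched and reduced to plain text; the sample
             strings and the function name `summarize_changes` are made up
             for illustration.

```python
import difflib
from typing import List


def summarize_changes(old_text: str, new_text: str,
                      old_label: str = "old", new_label: str = "new") -> List[str]:
    """Return unified-diff lines describing how new_text differs from old_text."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",  # inputs have no trailing newlines, so emit none
        n=0,          # no context lines: show only what changed
    ))


if __name__ == "__main__":
    # Placeholder snapshot text, not taken from any real article.
    first = "Title\nFirst claim.\nSecond claim."
    second = "Title\nFirst claim.\nRevised claim."
    for line in summarize_changes(first, second, "snapshot-1", "snapshot-2"):
        print(line)
```

             Lines starting with "-" were present only in the older
             snapshot and lines starting with "+" only in the newer one.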
        
             | bobsmooth wrote:
             | Slow, yes. Klunky? Absolutely not. When you make a request
             | to the Internet Archive, you're searching through a massive
             | amount of data. The fact that it only takes a few seconds
             | to pull up a decade-old webpage is amazing.
        
             | jandrese wrote:
             | This is true, but they're also taking on an absolutely
             | monumental task on a shoestring budget. I continue to be
             | amazed at what they are able to accomplish. There aren't
             | many heroes on the internet, but the Internet Archive team
             | qualifies.
        
             | nitrogen wrote:
             | _But the wayback machine is pretty klunky and slow,
             | honestly._
             | 
             | This is very true. Sometimes it takes 5-10 seconds to load
             | the calendar view for an archived page, and another 5-10,
             | or more, to load a snapshot.
             | 
             | They have a _ton_ of data to manage with limited resources,
             | but it still seems it should be possible to go faster than
             | this. If there's just not enough budget for I/O, maybe
             | they could offer a donate-for-data-dump option, where you
             | can donate in exchange for loading data of interest (say,
             | BBC archives) into a storage medium or query engine of
             | one's choice, so one could do research at a much faster
             | pace.
        
         | mountainb wrote:
         | The BBC publishes nothing but garbage and there are other
         | extant sources that are more durable. It's fine to forget
         | things. We're missing entire libraries of classical literature
         | from great authors which would be nice to have. Missing
         | documentary sewage isn't a tragedy.
         | 
         | This should show us that most of the web isn't worth preserving
         | anyway, much like McDonald's burger wrappers aren't worth
         | preserving like sacred artifacts. Most web and social media
         | content is worth less than said greasy burger wrappers.
        
           | undfg wrote:
           | I remember how calling the BBC garbage a few years ago got
           | your comment heavily downvoted here. They'd tell you that
           | they were the best thing since sliced bread and that they
           | were good because both the left and the right hated them, as
           | if that meant something. Now it seems everybody is
           | recognising the BBC for what they are: utter shite.
        
             | idrios wrote:
             | A few years ago, any comment that didn't add new insight to
             | a topic would get downvoted. I remember once reading a
             | comment where the response was a quip, and someone replied
             | "this response was funny but we don't want this site to
             | become Reddit so I downvoted you".
        
             | lez wrote:
             | I see this as a more general pattern on HN: Opinions not-
             | yet-adopted by academia are often downvoted instead of
             | being argued with. This stifles innovation because
             | alternative opinions do not even show up in the casual
             | reader's screen.
        
               | SuoDuanDao wrote:
               | Absent an explanation of why I've annoyed people, I get
               | as much of a dopamine hit from downvotes as upvotes. I'd
               | rather be polarising than forgettable.
               | 
               | Whenever I take an unpopular stance I remind myself of
               | Rick Sanchez's wise words, "Your boos mean nothing, I've
               | seen what makes you cheer".
        
               | Smithalicious wrote:
               | Me too, I think a lot of my most upvoted comments are
               | just truisms and preaching to the choir, whereas a lot of
               | the more insightful things I've said quickly get greyed
               | out.
        
               | skybrian wrote:
               | "Someone said it on Hacker News" carries no weight. Why
               | should anyone take our comments seriously if they don't
               | recognize the username? I don't see this as a bug.
               | 
               | Better to post links to trusted sources and let people
               | judge for themselves.
        
           | tooltower wrote:
           | What other more durable sources do you recommend?
        
           | CodeGlitch wrote:
           | Reading one of the BBC's technical articles, a cyber security
           | news item, they had 3 errors in the first paragraph. I didn't
           | bother reading to the end of the article.
           | 
           | I'm glad I no longer pay for a TV license.
        
             | dotBen wrote:
             | The BBC (News's) tech section isn't aimed at you.
             | Inaccuracies shouldn't be there but often they will dumb
             | down or gloss over stuff for the mainstream audience they
             | are aiming at.
             | 
             | You notice it cos you are in tech, but the same happens in
             | financial news, science and even sport. Go read a tech
             | publication.
             | 
             | For shits and giggles I did once try to get a technical
             | story on how to copy DVDs published - it got very heavily
             | edited!
             | http://news.bbc.co.uk/2/hi/science/nature/1987665.stm
             | 
             |  _(I'm a former + early BBC News website employee)_
        
               | throwaways885 wrote:
               | Aside: That old version of BBC News is an absolute gem of
               | history. Especially looking at some of the recommended
               | sidebar stories:
               | 
               | > Britons 'baffled over euro rate'
               | 
               | > Wireless internet arrives in China
               | 
               | > Mobile spam on the rise
               | 
               | Fascinating to see how much our problems have stayed the
               | same, despite the changing context.
               | 
               | I hope this is considered 'archived' and not 'forgotten'.
        
             | jandrese wrote:
             | It has always been a constant of journalism that you read
             | an article in your field and go "Wow, this is terrible,
             | they got all of the details wrong". But then you turn
             | around and trust the reporting on everything outside of
             | your field of expertise.
        
         | throwaways885 wrote:
         | > so unless historians dig deep in third-party archives, they'd
         | never understand where that belief came from
         | 
         | I expect future historical tooling will exist to solve exactly
         | this problem. Assuming Archive.org and the like nabbed it, the
         | evidence is all there for future generations to see.
        
           | pessimizer wrote:
           | Assuming archive.org isn't shut down and deleted by court
           | order in some future lawsuit.
        
       | FartyMcFarter wrote:
       | Donate money to archive.org! Some of the FAANG companies match
       | donations to it too.
        
       | wintermutestwin wrote:
       | I am increasingly worried about the valuable content on YouTube.
       | There are so many old live concerts, useful how-to videos and
       | other cultural treasures amidst all the junk. I suspect that one
       | day, they will make their ads unblockable by embedding them in
       | the video files. I sure hope that some people are downloading the
       | valuable stuff and stashing it away to load onto YouTube's
       | successor.
       | 
       | <and please skip the tired argument that I should just pay a
       | subscription fee to avoid their ad crap - we are already paying
       | them with our data. My data is worth far more to me than the
       | value they provide for it. Plus - I won't give money to a company
       | who forced this Faustian bargain on me.>
        
         | ChrisArchitect wrote:
         | you're worried about....the ads? Having ads around doesn't make
         | the content any less valuable. We're talking about the content
         | itself still being available, who cares if there's some ads
         | keeping the system up if all those live concerts and how-to
         | vids are preserved forever
        
           | wintermutestwin wrote:
           | Who is going to watch live concert footage with ads jammed
           | into it? For those of us who grew up in the age of TV,
           | advertising was clearly a slippery slope where they
           | constantly increased the ad content until it was beyond
           | unbearable (they even deleted parts of shows to make room for
           | more ads!) YouTube will likely do the same when they decide
           | to force all of us to watch unblockable and unskippable ads to
           | access content that they didn't even create or curate. All
           | they have done is provide a network effect monopoly that
           | hoovered up the majority of content.
        
           | gentleman11 wrote:
           | The ads make content less valuable. They distract and mislead
           | and manipulate emotions, especially when the youtuber
           | sponsors something during the video itself. The content isn't
           | abstract and isolated, the content and the ads come as a
           | bundle. The digital procedural product placement on the way
           | will make this 10x worse
        
         | redisman wrote:
         | It's really up to whoever cares about something to archive it.
         | I managed to find a torrent of early days video games from my
         | region that has almost been lost to time. Luckily I found a
         | discord and could coax someone to hop on to seed it. If I had
         | waited 10 more years they might have been gone for good.
         | 
         | Everyone's assuming that data now stays on the internet forever
         | because it's so massive. It's usually one or two people who
         | keep the flame alive
        
         | robotnikman wrote:
         | I set up a server specifically for downloading Youtube videos
         | from all my playlists on a daily basis just for this reason. At
         | some point I got fed up at seeing all the missing videos on my
         | playlists (and not even knowing what was removed)
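
           The "not even knowing what was removed" problem can be reduced by
           keeping a dated listing of each playlist and comparing listings
           over time. This is a minimal sketch that assumes each listing has
           already been saved as a mapping from video ID to title; the IDs,
           titles, and the function name `removed_videos` are hypothetical.

```python
from typing import Dict


def removed_videos(previous: Dict[str, str], current: Dict[str, str]) -> Dict[str, str]:
    """Return {video_id: title} for entries in the earlier listing but not the later one."""
    return {vid: title for vid, title in previous.items() if vid not in current}


if __name__ == "__main__":
    # Made-up playlist listings taken on two different days.
    yesterday = {"a1": "Live concert", "b2": "How-to: fix a tap"}
    today = {"b2": "How-to: fix a tap"}
    for vid, title in removed_videos(yesterday, today).items():
        print(f"missing since last check: {vid} ({title})")
```

           Keeping the titles alongside the IDs is the point of the design:
           once a video is gone, the saved title is often the only clue to
           what it was.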
        
         | majormajor wrote:
         | How many how-to videos, memes, concerts, etc, are really that
         | important?
         | 
         | In fifty years how many people will care? How many people
         | _should_ care because it would mean ignoring the huge volume of
         | newer stuff? A hundred years? Two hundred?
         | 
         | I haven't even read or seen many of the existing cultural
         | artifacts we have from past decades and centuries, what would I
         | do if orders of magnitude more of them had been preserved?
         | 
         | In fact, I'm incredibly grateful that I grew up _before_ all
         | the random shit that I threw out there as part of my youth was
         | subjected to obsessive cataloging and archival efforts.
        
           | codingdave wrote:
           | If you grew up before the internet, surely you remember all
           | the DIY manuals that lived in everyone's home. Everyone had
           | that same big hardbound book of how to do basic home repairs.
           | Many people had sewing books, electrical books, Chilton
           | manuals for their cars. Cookbooks, too. We have valued how-to
           | information for decades, and that interest and need far pre-
           | dates the internet.
           | 
           | So while I get what you are saying that much of the pop
           | culture videos do not hold long-term value (which is also
           | questionable considering how many of us older folk still have
           | collections of vinyl)... there absolutely is valuable content
           | out there that deserves preservation.
        
             | throwawayboise wrote:
             | There will probably always be niche sources for specific
             | "how to" information. YouTube is currently a low-effort way
             | to make that generally available in video format, but it's
             | certainly not the only viable solution, and the information
             | you can get from a specific enthusiast site or forum is
             | often better and more detailed.
        
             | majormajor wrote:
             | Sure, and nobody is going to the library or other archives
             | to check out those old how-to books, they're using
             | contemporary sources on Youtube instead.
             | 
             | And the same will be true of old vs contemporary sources in
             | fifty years.
             | 
             | I don't expect my own collection of books - which includes
             | some that are valuable to me primarily for nostalgia - to
             | have much value past the death of myself and the rest of my
             | generation. It might temporarily have a lot of monetary
             | value near my death - when other copies have already been
             | lost - but to someone born fifty years after me? What use
             | would pulp fiction from the 80s be to many of them?
             | 
             | My parents and uncles are in a bit of disbelief of how
             | _little_ even I care about the Beatles already, after all.
        
               | BatFastard wrote:
               | Well not caring for the Beatles is like not caring for
               | Mozart. Bad taste is always an option in a free society.
        
               | majormajor wrote:
               | Even people who care about Mozart's music largely have no
               | idea how much other stuff from the time period they might
               | have enjoyed that was lost, and that hypothetical loss is
               | not ruining their life at all.
               | 
               | We don't live dramatically longer or have dramatically
               | larger memories than our ancestors, so things necessarily
               | have to get lost and replaced by the new things that have
               | been created since then.
        
           | deeblering4 wrote:
           | These subjective questions are pointless to try and answer.
           | 
           | The idea is that a future person could freely deep dive
           | through a rich, well-indexed history of media about whatever
           | specifically interests _them_.
           | 
           | I wish people would stop trying to assess the value of a
           | given piece of media and just tag and archive the stuff.
           | 
           | For instance, high quality footage of live music from 100
           | years ago would be very interesting to some.
        
           | NoboruWataya wrote:
           | Each individual meme or video might not be important, but
           | then I don't think future historians are going to spend much
           | time studying individual artifacts in detail in the same way
           | that current historians do. We live in the age of big data
           | and I think future historians will be focused on aggregating
           | and automatically analysing that data. They probably won't be
           | reading your comment or mine but they might analyse large
           | sets of HN comments with a view to drawing conclusions about
           | how particular demographics act, think and feel today. And if
           | large swathes of those comments are lost the conclusions will
           | be skewed, particularly if the loss is not random.
           | 
           | Our society is characterised by the constant generation and
           | exchange of massive amounts of information. It's one of the
           | things that sets us apart from previous generations.
           | Preserving only a small subset of that data that we deem
           | worthy or important will not allow future generations to
           | fully understand today's society.
        
           | krapp wrote:
           | >I haven't even read or seen many of the existing cultural
           | artifacts we have from past decades and centuries, what would
           | I do if orders of magnitude more of them had been preserved?
           | 
           | You might have a better understanding of the culture that
           | produced them. You might appreciate a work of art that would
           | otherwise not exist. We have graffiti from Pompeii, we know
           | Ea-nasir sold cheap copper in ancient Ur 3700-odd years ago,
           | but we've lost countless works of literature, music and film,
           | some by the greatest masters of their age. What artifacts of
           | culture survive the scouring sands of time is often a matter
           | of happenstance, rather than quality.
           | 
           | Chances are almost everything our species has produced
           | culturally, scientifically and artistically - the whole
           | corpus of our knowledge output over the last century - is
           | going to vanish within a generation or two anyway, simply
           | because the digital foundation into which we've transferred
           | so much of it is brittle and ephemeral. If we want to leave
           | anything behind for future generations at all besides climate
           | change, pollution and nuclear waste, we should save as much
           | as possible rather than only what we consider to be relevant.
        
             | Forbo wrote:
             | Minor nitpick, it was ~1750 BC, so that would be around
             | 3700 years ago.
        
               | krapp wrote:
               | Oops, fair enough.
        
           | mdoms wrote:
           | You don't know what's important until much later. That's what
           | makes archiving difficult.
           | 
           | Also, important to whom?
        
         | toomuchtodo wrote:
         | https://github.com/bibanon/tubeup
        
         | gentleman11 wrote:
         | They won't stop collecting your data even if you pay them
        
         | tenebrisalietum wrote:
         | > I am increasingly worried about the valuable content on
         | YouTube.
         | 
         | Download it? This continues to be not difficult for YouTube.
         | 
         | > I suspect that one day, they will make their ads unblockable
         | by embedding them in the video files.
         | 
         | That's fine. I honestly wish they would because most of the
         | hangs I experience in YouTube happen when the stream changes to
         | an ad, and then changes back. If it was embedded in the video
         | then the stream wouldn't be interrupted.
         | 
         | If I hate the ads that much I can edit them out after it's
         | downloaded.
        
         | blooalien wrote:
         | > "and please skip the tired argument that I should just pay a
         | subscription fee to avoid their ad crap - we are already paying
         | them with our data. My data is worth far more to me than the
         | value they provide for it. Plus - I won't give money to a
         | company who forced this Faustian bargain on me."
         | 
         | Not only this, but many of the largest companies these days
         | would _never_ remove advertising even from a paid service.
         | It's like cable TV. They wanna charge you _and_ advertise at you
         | for _more_ money.
        
           | anoncake wrote:
           | Paying signals that you have disposable income, making
           | advertising to you more profitable.
        
         | [deleted]
        
         | varispeed wrote:
         | I download everything I like. Storage is cheap now.
         | 
         | Plus smaller sites will start disappearing because of
         | regulatory capture. It will not be possible to run a forum or
         | similar site in a few years.
        
           | pessimizer wrote:
           | We say that a lot, but it's not that cheap if you're using
           | redundancy and backups. It's cheap if you don't care too much
           | about the data.
        
             | Thrymr wrote:
             | Cataloging and indexing and searchability are also not
             | cheap if you are doing all of that on your own time.
        
         | sp332 wrote:
         | The 2014 Vulture interview with David Milch, where he reads
         | from an unreleased Boss Tweed script, may be gone for good.
         | https://twitter.com/mattzollerseitz/status/14096229692828753...
        
         | a1369209993 wrote:
         | > that I should just pay a subscription fee to avoid their ad
         | crap
         | 
         | Fuck no. _Do not do this_. Use youtube-dl[0], and maintain
         | local copies of anything useful you can find.
         | 
         | 0: http://youtube-dl.org/
        
       | bluedays wrote:
       | Project Xanadu was an attempt to fix this, but unfortunately it
       | suffered from the creator wanting to capitalize on its success.
       | Which is, I think, why it ultimately failed.
        
       | okareaman wrote:
       | Buddhists chuckle at the notion of permanence and go back to
       | constructing sand mandalas
        
         | kragen wrote:
         | Buddhists have preserved most of the Tripitaka for 2500 years,
         | and for the first 500 years it was memorized and transmitted
         | orally from generation to generation. Buddhist monks today
         | spend significant amounts of their time memorizing parts of it.
         | Printed, it's about 12000 pages; it's been translated into many
         | languages, but not all of it has been translated into English
         | yet. Thanissaro Bhikkhu has been working on it for 20 years,
         | publishing his translations under CC-BY, and may finish the job
         | before he dies. Aside from its value to devotees, the Tripitaka
         | is one of our best historical sources about everyday life in
         | South Asia 2500 years ago.
         | 
         | The invention of wood block printing 1300 years ago in the Tang
         | was apparently specifically motivated by the desire to preserve
         | and reproduce Buddhist sutras; the oldest surviving documents
         | printed with movable type, from 900 years ago, are also
         | Buddhist texts.
         | 
         | Of course the Tripitaka is not permanent; it will be lost some
         | day. But you seem to be implicitly claiming that Buddhists do
         | not apply effort to preserving information and in particular
         | textual records, because they know that ultimately they will be
         | lost. In fact, the truth is quite the opposite, and believing
         | your implicit claim would require almost complete ignorance of
         | Buddhism, printing technology, and South Asian classical
         | studies.
        
         | datameta wrote:
         | I know your comment is probably tongue-in-cheek but I must say
         | (without disagreeing, regardless) that perhaps modern day
         | digital infrastructure can transcend the considerations early
         | buddhists may have had for the natural world and then-
         | contemporary human-built structures.
        
           | okareaman wrote:
           | I used to be a data hoarder but I learned to let it go. I
           | save the important stuff, just like the Buddhist monks do.
           | How much of the internet is really worth saving? What will
           | Geocities mean to anyone 50 years from now? The internet is a
           | dynamic process evolving in real time.
           | 
           |  _No man ever steps in the same river twice, for it's not
           | the same river and he's not the same man_
           | 
           | Heraclitus
        
             | datameta wrote:
             | I am a fan of that quote and I now personally eschew
             | clutter. However I'm referring to the examples in the
             | article such as the supreme court justice referencing links
             | that no longer existed. Paraphrasing the article, >75% of
             | links from the 90s are defunct. Sure Geocities may not have
             | value to many, but an astonishing number of links in court
             | rulings and law documents are leading to dead ends. I can
             | see how this could lead to shaky ground upon which it would
             | be more difficult to defend certain internet freedoms.
        
               | okareaman wrote:
               | Google's recent invention of text links should help this:
               | 
               | https://en.wikipedia.org/wiki/Filler_text#:~:text=%22Now%
               | 20i....
               | 
               | # signifies an anchor
               | 
               | :~:text= signifies a text link
               | 
               | %22Now%20is%20the%20time%20for,21%20(1918). says show me
               | the text between "Now is the time for" and "21 (1918)."
        
           | beaconstudios wrote:
           | buddhists fundamentally reject clinging to the idea of
           | permanence as a source of inevitable misery when your wishes
           | go unfulfilled.
        
             | okareaman wrote:
             | I think technology will be invented to keep track of
             | everything and their associations even if the link becomes
             | dangling. It's an obvious problem to work on.
             | 
             |  _You only lose what you cling to_
             | 
             | Gautama Buddha
        
               | beaconstudios wrote:
               | the problem is that doing so would require storing a copy
               | of the entire subset of the internet that you choose to
               | persist, which requires you to either choose a small
               | subset, or pay huge storage fees!
        
               | okareaman wrote:
               | I just bought a 2 terabyte drive for $55, which is mind
               | boggling to an old salt like me. The need to store
               | enormous amounts of real time data will keep driving
               | storage advances. Text and image data may turn out to be
               | a trivial percentage of the overall storage needs in the
               | longer term.
        
         | jl6 wrote:
         | Accepting the impermanence of all things is an important
         | lesson, but it is not an excuse for nihilism, because even
         | impermanent things can have value, however finite, and the
         | impossibility of true permanence doesn't have to distract from
         | realising finite value while it lasts.
        
         | Santosh83 wrote:
         | Indeed yes, however much we may dislike it, change is the only
         | constant. Of course that doesn't mean we shouldn't bother
         | archiving, but there is no need to fret over saving every byte
         | out there on the web.
         | 
         | Entropy is king. Eventually all information loses its
         | coherency.
        
       | JKCalhoun wrote:
       | I expected the article to deal more with the rotting _landscape_
       | of the internet, the rotting of our choices of content, the poor
       | selection of links on the 1st page of any search engine.... I am
       | less concerned with the fact that a link breaks than I am with
       | what it says about the content that was there, is no longer
       | there.
        
         | rchaud wrote:
         | Links are the backbone of the internet. Archive.org is a huge
         | asset, but it relies on individuals being prescient enough to
         | archive pages that might be lost to time. That's not scalable.
         | Plenty of people will visit Wayback Machine to pull up an old
         | page that's gone to the big 404 in the sky, but they won't
         | actively submit links to archive themselves.
         | 
         | The bad design and low quality content is a symptom of the
         | Internet's broken underlying economics. That's a human problem,
         | not a tech problem.
        
       | bullen wrote:
       | The article does not scroll on an older Chrome... the internet is
       | indeed rotten.
        
         | bencollier49 wrote:
         | Or on Firefox.
        
       | fleddr wrote:
       | It's not a technology problem, it's an incentive problem.
       | 
       | Had the web somehow been centralized (I have no idea what that
       | would even look like), content still would not be archived, it
       | would be constantly changed, and subject to censorship. Just like
       | in a decentralized web, perhaps even more so.
       | 
       | Archiving costs lots of money (and costs keep growing if you only
       | add and never take away), can be highly challenging (in the case
       | of web apps or complex dependencies), whilst providing zero
       | immediate reward for the organization carrying this heavy load.
       | Not only is there no incentive, many couldn't even afford to if
       | they wanted to.
       | 
       | And it gets worse still. Digital archiving means paying forever.
       | Imagine paying for 100 years of electricity, hardware replacements,
       | migrations. The entity (business, person) is long gone before
       | that.
       | 
       | As a ridiculous example of this: Facebook has several very large
       | idle content data centers. Mega scale buildings full of servers
       | storing photos of Facebook users they haven't accessed in years,
       | and likely never will again. Yet should a user do this, they
       | expect the photo to still be there.
       | 
       | That's why I believe the problem should be addressed with more
       | pragmatism. Focus on things of unquestionable long term value,
       | and think of a good solution for this smaller scope.
        
       | CountDrewku wrote:
       | This is why there are sites like WayBackMachine. I suggest people
       | keep donating to them as well if they want to preserve internet
       | history.
        
       | MomoXenosaga wrote:
       | Visit an old forum and all the pictures will be gone because,
       | surprise, free image hosting doesn't make economic sense.
        
         | Robotbeat wrote:
         | The web forum that I frequent most,
         | https://forum.nasaspaceflight.com , has a policy of not
         | allowing embedded images but requiring them to be attached on
         | each post. The forum has been active for well over a decade now
         | (site was founded in 2004) and has a thriving community that
         | continues to grow. It is the community that keeps the forum
         | alive instead of just one random company (although it
         | technically is a company). This helps prevent link rot. Forums
         | seem anachronistic in 2021, but they have massive benefits
         | versus the gigantic platforms like Facebook or Reddit or
         | Twitter, and the quality of the discourse and analysis is far
         | higher. There are a few ads, but they're very unobtrusive.
         | 
         | (Also, Twitter posts are often linked, but usually the text is
         | copied for archival purposes.)
         | 
         | My experience with Wikipedia, forums such as those, and Wayback
         | Machine and arxiv.org make me think that people will do a lot
         | of stuff basically for free and that by building communities,
         | you don't need extremely clever trustless incentive systems
         | like Blockchain or major paywalls (although granted, the forum
         | does have something like a paywall for unverified pre-public
         | info) or massive platforms with multi billion dollar companies
         | in order to disseminate information, analysis, news, etc. Best
         | practices of web forums from the 2000s (active moderation, a
         | sense of common purpose, expectations of non-toxicness, etc),
         | are a really good solution.
        
           | [deleted]
        
       | uniqueuid wrote:
       | By the way, the technical side of this is very interesting. If
       | you look at the tools mentioned (the wayback machine, but also
       | perma.cc and other archival solutions), almost all of them rely
       | on a single semi-modern tech stack that produces WARCs (web
       | archives, ISO 28500:2017:
       | https://iipc.github.io/warc-specifications/specifications/wa...).
       | 
       | The main crawler still seems to be heritrix3
       | (https://github.com/internetarchive/heritrix3), but there's a
       | great little ecosystem with tools such as webrecorder and
       | warcprox.
       | 
       | Still, I've read through the code of these tools and am feeling
       | that they are failing in the face of the modern web with single
       | page apps, mobile phone apps and walled gardens. Even newer
       | iterations with browser automation are getting increasingly
       | throttled and blocked and excluded from walled gardens.
       | 
       | Perhaps the time has come for a coordinated, decentralized but
       | omnipresent approach to archival.
        
         | twobitshifter wrote:
         | I think "right to be forgotten" is important and I'm generally
         | against everlasting social media posts, but for copyrighted
         | works, we really need a centralized Library of Congress that
         | acts to archive these. In order for that to happen there needs
         | to be an equivalent "publishing" mechanism for the web - where
         | the user says - I created something and I want it to be
         | archived. This would cover things that exist behind a paywall
         | or are only delivered as newsletters.
        
         | black_puppydog wrote:
         | > increasingly throttled and blocked and excluded from walled
         | garden
         | 
         | I keep thinking back to Jacob Applebaum's stance of "facebook
         | and the other walled gardens are the real dark web."
        
         | PaulHoule wrote:
         | WARC can record and replay single-page apps, but it struggles
         | with knowing where a "page" begins and ends.
         | 
         | There was a time when I was furious with the web going to hell
         | and I investigated the possibility of "web without browsers"
         | that started with making a WARC capture of page and putting
         | pages through extensive filtering and classification before the
         | user sees anything.
         | 
         | With interactive capturing you can push a button to indicate
         | that a page is done "loading" but with automated capturing you
         | can't really know that the page is done or that you got a good
         | capture. That ended the project right there.
        
           | glasss wrote:
           | I'm completely out of the loop on something like this, but
           | could you in theory apply some kind of ML to identify the end
           | of pages to assist with good page captures?
        
             | PaulHoule wrote:
             | Probably. Certainly the more you spent on it the better you
             | could do.
             | 
             | At the time I was most bothered by the slow load times of
             | web pages and blaming this phenomenon:
             | 
             | https://www.sjsu.edu/faculty/watkins/samplemax4.htm
             | 
             | particularly that if you take the max of N random
             | variables, the expectation value you get gets worse as N
             | increases -- that is, the page isn't done loading until the
             | slowest http request completes.
             | 
             | So I saw the "knowing when the page is done" problem as
             | being particularly core, and it would be if the goal was to
             | "win the race" against a conventional web browser.
             | 
             | If you were (say) preloading all the links submitted to
             | hacker news you might be able to tolerate the system taking
             | 5 minutes to process an incoming page. (See archive.is)
             | 
             | Today I've noticed that sites like Wired are giving up on
             | complaining about my anti-track and ad-blocker and they
             | just load the page partially which would drive me crazy if
             | I was serious about debugging.
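
The max-of-N point above can be checked with a small simulation. This is only a sketch under loud assumptions: every request is independent, all requests start at once, and latencies are exponentially distributed with the same mean. None of that comes from the comment itself. Under those assumptions the expected maximum of N latencies is the mean times the N-th harmonic number, so it keeps growing as a page pulls in more resources.

```python
import random


def harmonic(n: int) -> float:
    """N-th harmonic number: 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))


def simulated_page_load(n_requests: int, trials: int = 20000,
                        mean_latency: float = 1.0, seed: int = 0) -> float:
    """Average time until the slowest of n_requests parallel requests finishes.

    Latencies are drawn from an exponential distribution (an assumption).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max(rng.expovariate(1.0 / mean_latency)
                     for _ in range(n_requests))
    return total / trials


if __name__ == "__main__":
    for n in (1, 5, 20, 50):
        print(n, round(simulated_page_load(n), 2), round(harmonic(n), 2))
```

The simulated and analytic columns should track each other closely; real request latencies are heavier-tailed than this, which would make the effect worse rather than better.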
        
         | shadowgovt wrote:
         | Honestly, it would be a better use of surplus resources than
         | crypto mining.
         | 
         | If only there were a way to algorithmically tie a proof of work
         | for a new cryptocurrency to archival of the internet in a way
         | that wouldn't be easily gamed (by people archiving easy to
         | access content or highly redundant archival of trivia).
        
           | meowkit wrote:
           | https://spec.filecoin.io/algorithms/pos/
        
       | have_faith wrote:
       | Knowing that it decays is what prompts us to try and save the
       | bits worth saving.
       | 
       | I don't sit in the camp that everything digital must be preserved
       | and that it's a disaster if it isn't. I try not to fight entropy
       | in its many manifestations. It's a shame when content disappears
       | but I think it's also healthy to just accept it. We tend to only
       | frame information disappearing in a negative light because we can
       | always imagine a scenario where that information could have been
       | valuable to someone, and that is a valid concern, I just don't
       | think it's helpful to view it as the internet going into some
       | downward rotting spiral and therefore every single 0 and 1 must
       | be preserved.
       | 
       | The major problems of the internet seem to be almost entirely
       | cultural currently.
        
         | Robotbeat wrote:
         | Ironically, deleting stuff on the web is technically REDUCING
         | its entropy.
        
       | debt wrote:
       | I always thought something like Ethereum could solve this type of
       | thing; that is, if the content itself lived inside the
       | blockchain. Obviously for larger formats that wouldn't work, but
       | for many text based or lower resolution image formats, it
       | wouldn't be too much overhead to just inject it all into the
       | blockchain.
        
       | fridif wrote:
       | Here's an idea, try to make things that need to survive into
       | static downloadable content.
       | 
       | "But I'm the consumer of a service!"
       | 
       | --Ask the service provider to open source their work.
        
       | jessehattabaugh wrote:
       | I think the publishing media that predate the web all had the
       | same problem; posters get torn down, newspapers burn, even stone
       | carvings weather. Hardly a new, or possibly even unnecessary
       | phenomenon.
        
       | Zamicol wrote:
       | Hashes are the answer.
       | 
       | How they get implemented in solving this problem is the question
        
       | neonate wrote:
       | https://archive.is/Trpkg
        
       | platz wrote:
       | Fungi serve an important role in nature.
        
       | canadianwriter wrote:
       | I actually implemented a rule for my website: anytime I write
       | anything and cite a link, I always also include the internet
       | archive url as well just in case. If it's not been archived yet I
       | submit it to be.
       | 
       | as an example:
       | 
       | "You don't have to trust me on this one, here's an article with
       | [a bunch of data] | [*Archive link in case of link rot]"
       | 
       | from: https://kolemcrae.com/notebook/virtue.html
       | 
       | It's not perfect, but it helps reduce some of the issue.
       | 
       | Other than that solutions are incredibly hard to come by - you
       | need institutions to preserve urls - through tech changes and the
       | like, when they have very little incentive to do so. Eg. making
       | sure they implement a redirect from the http to https sounds
       | simple enough, but not everyone did it. Also if they switch CMSs
       | and the like.
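
The citation rule described above could be automated roughly like this. It is a sketch, not the commenter's actual setup: `wayback_url` and `cite` are hypothetical helpers, and the only outside detail relied on is the Wayback Machine's `https://web.archive.org/web/<timestamp>/<url>` address pattern. No network request is made, so it does not check that a snapshot really exists.

```python
def wayback_url(url: str, timestamp: str) -> str:
    """Build a Wayback Machine address for url near the given timestamp.

    timestamp is a digit string such as "20210630" (YYYYMMDD) or a longer
    YYYYMMDDhhmmss form.
    """
    if not (timestamp.isdigit() and 4 <= len(timestamp) <= 14):
        raise ValueError("timestamp must be 4 to 14 digits, e.g. 20210630")
    return f"https://web.archive.org/web/{timestamp}/{url}"


def cite(text: str, url: str, timestamp: str) -> str:
    """Format a citation with the live link and an archive fallback."""
    archive = wayback_url(url, timestamp)
    return f"{text} <{url}> | *Archive link in case of link rot <{archive}>"


if __name__ == "__main__":
    print(cite("a bunch of data", "https://example.com/report", "20210630"))
```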
        
         | a1369209993 wrote:
         | > a rule for my website
         | 
         | Note that you should also have a rule to save the link content
         | locally, to avoid single-point-of-failure problems in the
         | unlikely-but-catastrophic case that archive.org itself goes
         | down. (Cf the attempts to attack them over their National
         | Emergency Library programme last year.)
        
       | dang wrote:
       | I've changed the baity title to a representative phrase from the
       | article body [1], but it is maybe a bit too narrow now, relative
       | to what the article is really about. Suggestions for a better
       | title are welcome. Sometimes we use the HTML doc title but "The
       | Rotting Internet Is a Collective Hallucination" is worse!
       | 
       | [1] That's the best way to get a better title.
       | https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
        
       | FridayoLeary wrote:
       | But at least it's apparent that an end must be in sight for the
       | internet as we know it, and that there are still other forms of
       | internet waiting to be discovered.
        
       | ericra wrote:
       | For those hitting the "monthly article limit" paywall, this can
       | always be bypassed on The Atlantic by opening the link in a
       | private window.
        
       | abeppu wrote:
       | I feel like it's been quite a while since I saw anyone talking
       | about IPFS, but I used to hear about it not infrequently. I don't
       | know if this is because it was too nerdy, or too misaligned with
       | the incentives of most organizations (it's scary for some to not
       | be able to unpublish), or because it comes with some privacy
       | sacrifices, or perhaps it just matters less when hardly any page
       | is actually static any longer.
       | 
       | But, for guarding against "published", supposedly static material
       | disappearing, or changing silently, and for removing a short list
       | of organizations from being responsible for preserving content,
       | IPFS or something like it seems well-suited. Anyone who cares to
       | preserve something can. Any change is noticeable.
        
       | budibase wrote:
       | I think it's important to support tools that are driven with
       | people in mind, rather than money. I like where Brave is going:
       | https://news.ycombinator.com/item?id=27593360
        
       | phendrenad2 wrote:
       | I think our obsession with copyright and attribution of ideas
       | isn't helping. Many times you'll see people reference a page or
       | PDF, which long ago became a broken link. Not one person bothers
       | to paraphrase or copy relevant sections from it. And the wayback
       | machine can't cover everything.
        
       | tgbugs wrote:
       | The whole architecture of the internet is inside out. People have
       | become numb to the insanity of encountering null pointers
       | multiple times per day. This is understandable since the inside
       | out structure is what allowed the web to grow quickly, but it
       | will also be what ultimately dooms it as a real lasting store for
       | knowledge.
       | 
       | The problem is that the foundations are shifting sands, and we
       | need something that has significantly more integrity at the
       | bottom layer, we can't just bolt URNs on as an afterthought. Some
       | organizations are able to maintain persistent data over time, but
       | it is in spite of the technology, not because of it.
       | 
       | I will also note that a world where it is possible to delete
       | things is a world where individuals can be made to have written
       | anything in the past. On the internet, at a certain point the
       | past can be fabricated from whole cloth.
       | 
       | edit: and ironically, the issue is that this is because the
       | internet wasn't actually academic enough in its original design.
        
         | ganzuul wrote:
         | We should have kept developing Usenet. Handing control over to
         | web browser providers was a mistake.
        
       | samatman wrote:
       | This is why I think Twitter made a mistake in using the normal
       | suspension mechanism to ban @realDonaldTrump.
       | 
       | Not passing judgement on the decision to take away his posting
       | privileges. But by suspending POTUS, everything he posted during
       | his term in office is just... gone. Every hot link to anything he
       | said, on any website, is broken.
       | 
       | This is an enormous loss to any historian of the era. He was
       | using Twitter as his main microphone to speak to the world, and
       | all that content is, while not _lost_ lost, thoroughly and
       | permanently scrambled.
       | 
       | It would have been better to just lock him out of the account,
       | publish a statement that the @realDonaldTrump account is now
       | permanently archived, and that any new account he tried to open
       | will be suspended.
        
       | RGamma wrote:
       | This is why I'm building/curating my personal archive with stuff
       | that I think may be worthwhile saving (not only for myself
       | necessarily).
       | 
       | Perhaps there will be many personal archives like mine that one
       | day can be shared in a similar vein to copy parties.
       | 
       | We will need to treat the information we find online with its
       | impermanence in mind (as authors, making things easy to copy, and
       | consumers, copying stuff).
       | 
       | Perhaps it is this mindset that, when sufficiently prevalent,
       | could make the internet more like a library again; weed out the
       | garbage and curate the nuggets.
       | 
       | Btw I think archive.org is doing God's work but I don't believe
       | any amount of coding and crawling will be able to save everything
       | (nor should it). It can capture some raw data for (future AI?)
       | historians to sift through though.
        
       | ladyattis wrote:
       | I feel that the Internet as an archive isn't really feasible. At
       | best, it can augment existing archival efforts such as public
       | libraries. The fact people keep pushing off to webhosting what
       | should be put into a library is a grave misunderstanding of the
       | use cases for the Internet.
        
       | overgard wrote:
       | I doubt people will ever care about this enough for it to have
       | momentum, but there are well known technological solutions:
       | content addressable file storage. If you do that the url is
       | always tied to the file content itself. Of course this requires
       | documents to actually be documents. So I don't think it works for
       | any modern business model.
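
A toy illustration of content-addressable storage as described above: the address of a document is a hash of its bytes, so the same content always gets the same address and any edit produces a different one. `ContentStore` is a made-up in-memory class using plain SHA-256 hex digests; real systems such as IPFS use their own identifier formats and a distributed network, which this sketch does not attempt.

```python
import hashlib


class ContentStore:
    """In-memory store where each blob is keyed by the SHA-256 of its bytes."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        """Store data and return its content address."""
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        """Fetch data and verify it still matches its address."""
        data = self._blobs[address]
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


if __name__ == "__main__":
    store = ContentStore()
    original = store.put(b"The ruling cites this page.")
    edited = store.put(b"The ruling cites this page (updated).")
    print(original != edited)                      # a silent edit is impossible
    print(store.get(original).decode("utf-8"))
```

The trade-off the comment points at is visible here: an address can only ever name one fixed document, so pages that change per visitor or per day have no stable address.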
        
       | Justin_K wrote:
       | I would simplify it as SEO rot has ruined the internet.
        
       | djoldman wrote:
       | > People tend to overlook the decay of the modern web, when in
       | fact these numbers are extraordinary--they represent a
       | comprehensive breakdown in the chain of custody for facts.
       | 
       | This is a particularly good quote to sum up the article. The
       | internet is not a repository of facts, it is a repository of
       | facts, spam, junk, and _things_. Moreover, it is not the only
       | repository of these.
       | 
       | Link rot happens. Content is subject to the will of the publisher
       | to spend the time and/or money to continue to host it.
       | 
       | Depending on links to work eternally is a mistake. The problem is
       | not the link rot, it is the bad assumption.
        
         | tenebrisalietum wrote:
         | This is why when I see something I like on a website, I might
         | bookmark it, but I'm also saving it locally.
        
         | PaulHoule wrote:
         | It's depressing.
         | 
         | I know somebody who started a business that was successful for
         | a while and then failed. Spammers got control of the domain and
         | now it is full of ads for a dangerous diet drug.
         | 
         | What makes my blood boil is that it impugns the integrity of
         | the founder who is a decent person who has nothing to do with
         | that scam.
        
           | stickfigure wrote:
           | This actually happened to me:
           | 
           | https://www.voo.st/
           | 
           | We shut down the business and abandoned the domain. Someone
           | registered the domain, created a similar-looking website by
           | hand (recycling a lot of the text and images), and added
           | spam. It even has my old company address.
           | 
           | This is a .st domain, about $35/yr. The web design work cost
           | something too. More than I would have expected the link juice
           | from a single website to be worth.
           | 
           | What we really need is some sort of DNS record or meta
           | content we can add that tells search engines "this domain is
           | being abandoned, destroy all link juice".
        
             | PaulHoule wrote:
             | The traffic you get from people following links is
             | measurable (and real!) The traffic you get from "link
             | juice" is imagined.
             | 
             | The original PageRank paper assumed that PageRank
             | approximated the distribution of views on web pages
             | assuming that people followed links at random.
             | 
             | If Google wanted to know what people are viewing today,
             | they don't need to collect a link graph and do matrix math.
             | They can measure it directly with Chrome, Google Analytics
             | and data exhaust from the advertising platform.
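
For readers who have not seen it, the "people following links at random" model mentioned above can be sketched in a few lines. This is a simplified power iteration over a tiny hand-made link graph, not Google's production algorithm. The damping value 0.85 is the commonly quoted default, and spreading the rank of pages with no outgoing links evenly over all pages is one of several possible conventions.

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Estimate where a random surfer ends up on a small link graph.

    links maps each page to the pages it links to. With probability
    `damping` the surfer follows a random outgoing link, otherwise the
    surfer jumps to a page chosen uniformly at random.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Page with no outgoing links: treat it as linking everywhere
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
    for page, score in sorted(pagerank(graph).items()):
        print(page, round(score, 3))
```

In this graph page "d" has no inbound links, so its score stays at the random-jump floor, while "a" collects rank from both "c" and "d".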
        
           | JKCalhoun wrote:
           | Domains, if not outright sold by the owner, should die then.
        
             | junon wrote:
             | Or ICANN can create policies about domain squatting like GP
             | described.
        
               | PaulHoule wrote:
               | Or the FDA could put alternative-health scammers in jail.
               | 
               | Back in the 1950s they put Wilhelm Reich in jail, where he
               | died. L. Ron Hubbard got the hint and left the country
               | and when no country was safe he went to sea.
               | 
               | Today people like Dr. Oz run alt-health scams
               | continuously and nobody seems to go to jail or even get a
               | fine.
        
             | Wowfunhappy wrote:
             | And never be possible to register ever again? I feel like
             | most easy-to-type domains would have been permanently
             | expended in the early days of the web.
        
               | PaulHoule wrote:
               | You wish there was some way you could make the links go
               | away...
        
           | jl6 wrote:
           | Nodes in keywordspace don't die with the businesses that
           | created them. The popularity of domains and links and words
           | and phrases are permanently altered by the existence of the
           | business. It's a digital footprint like how a real-world
           | business leaves a physical footprint. Some footprints are
           | harmless - just a memory of activity that once happened.
           | Other footprints cause lasting harm, like contaminated soil.
           | 
           | Abandoned formerly-popular domains create a kind of long-tail
           | info-environmental impact, just like an abandoned warehouse
           | can become a real-world hazard.
           | 
           | Maybe we need a digital superfund process.
        
         | ergot_vacation wrote:
         | Exactly. The Atlantic author seems to be laboring under the
         | misguided assumption that the web is somehow the same sort of
         | thing as a library of books. Even libraries often have some
         | degree of garbage information in them, and represent a survival
         | story: the vast majority of books ever written are no longer in
         | print, or even discoverable anywhere.
         | 
         | Good stuff should be preserved, but it's not the Internet's job
         | to somehow magically do it. It's OUR job, and the nature of
         | digital information (DRM notwithstanding) makes this easier
         | than ever.
        
           | pfraze wrote:
           | Somehow I'm replying to you twice, but this time I agree and
           | wanted to note that libraries are curated spaces as well.
        
       | chovybizzass wrote:
       | Yes. I am down to HN and Reddit, Google Calendar and Telegram. I
       | don't even know how to find cool stuff online anymore. Google
       | SERPs are all business-driven now unless you're researching news.
        
         | throwaway_egbs wrote:
         | The article is about link rot, not cultural rot.
        
           | shadowgovt wrote:
           | And it seems to overlook a pretty straightforward question:
           | in the era of the search engine, how much of an issue is link
           | rot?
           | 
           | I've hit bad links before. Four out of five times, I can do a
           | general search for the title of the document that should have
           | been at the link or the quoted excerpt that the document I'm
           | reading pulled from the link, and I get a clone of the
           | document posted somewhere else.
        
         | ct0 wrote:
         | All the cool stuff has gone back to IRL. Once people stopped
         | making potato gun websites, the internet really stopped
         | blossoming into an amazingly vibrant space. I would recommend
         | hackaday.com because it hasn't changed in quite some time.
        
           | jabl wrote:
           | FWIW, Siemens recently bought Hackaday (or well, Siemens
           | bought a company called Supplyframe which was the owner of
           | Hackaday), so let's see how long that will last..
        
           | krapp wrote:
           | >Once people stopped making potato gun websites, the internet
           | really stopped blossoming into an amazingly vibrant space.
           | 
           | Entire new genres of creative output - music, fiction,
           | fandom, films, cosplay, hobbyist and enthusiast communities
           | have been spawned by the modern web. It's never _been_ more
           | vibrant.
           | 
           | I'll never understand why people on Hacker News seem to
           | believe the internet stopped evolving as an expressive space
           | just because services replaced the need to design websites by
           | hand. That's like believing literature ended once scribes
           | were replaced by the printing press.
        
             | ulber wrote:
             | Yeah, the internet is bigger now - all of the weird fringe
             | stuff from enthusiasts is still there and there's even more
             | of it. One thing that has happened is that the mainstream
             | is there too now, so if you don't venture outside that it's
             | easier to not see the more interesting stuff. In the old
             | days, when just having a "homepage" placed you a bit
             | outside the norm, stumbling on something non-mainstream was
             | more of a given.
        
               | krapp wrote:
               | And the mainstream has become more interesting as a
               | result. It's not uncommon to watch anime now, D&D and
               | video games are no longer niche, people's interests are
               | becoming more diverse as media is no longer being
               | gatekept by communities, geography or publishers, and
               | everything becomes available to everyone. People might
               | see that as a negative, the Eternal September effect
               | eating their favorite thing, but I see it as a positive.
        
           | Avamander wrote:
           | > Once people stopped making potato gun websites, the
           | internet really stopped blossoming into an amazingly vibrant
           | space.
           | 
           | Have they or have you stopped looking? I'd say the former.
        
           | agent008t wrote:
           | I used to go online to escape the boring, unimaginative
           | people and the world they create. Now I have to disconnect to
           | escape them.
        
             | TheFreim wrote:
             | You have to know where to look online, very rare these
             | days.
        
         | Avamander wrote:
         | You either have to visit aggregating sites that have like-minded
         | people (HN/Reddit/Obscure FB groups/Group chats/Forums) or know
         | exactly what you're looking for.
        
         | mycall wrote:
         | Google still works, you just need to be very specific on your
         | search criteria (e.g. site:). I do agree on the premise that
         | good quality data, art and content has been lost to the winds
         | of time.
        
           | rchaud wrote:
           | Needing to use 'site:______.com' in the search query defeats
           | the purpose of an internet search engine.
           | 
           | I agree that it is necessary today, due to the sheer amount
           | of useless sites that pop up on page 1 of the search. I wish
           | Reddit invested more into making their internal site search
           | better. If people did their searches directly on websites,
           | Google would have an incentive to improve search results so
           | it wasn't always the same 10-20 websites topping the list for
           | nearly every query.
        
         | AnIdiotOnTheNet wrote:
         | I wish there was something like Reddit that could be organized
         | by topic, but had the simplicity of HN's design instead of the
         | monstrosity Reddit has become. My guess is it would still
         | succumb to the Reddit Hive Mind effect without a reasonably
         | benevolent moderation team though. For all the times I've said
         | that HN basically does the same thing, I have to admit that it
         | is much better about keeping it in check.
        
           | api wrote:
           | Yahoo did this in the 1990s and it was cool for a while, but
           | the net got too big to maintain the directory. There was an
           | open alternative but it got inundated by spam of course.
        
           | nishparadox wrote:
           | I had exactly the same thought a few weeks back on Twitter.
           | Since I have ditched the whole 'social media bubble' for my
           | mental health, it seems sometimes I wish there was some sort
           | of HN-like aggregator for Tweets from my favorite topics and
           | people.
        
           | zanderwohl wrote:
           | Reddit has been pushing to be like other social media sites
           | now. They've added profile pictures and avatars, not to
           | mention they're pushing video content like nothing else.
           | They've got livestreaming and try to saturate your front page
           | with as much video as possible. They've also changed the way
           | their app handles video links to be more like TikTok or
           | YouTube.
           | 
           | It used to be a lot like HN - discussions around links to
           | articles. I wish there was a community with the feel of HN
           | with the wide net of Reddit.
           | 
           | It feels like all social media is converging; Snapchat,
           | Instagram, Facebook, Reddit, TikTok, YouTube, all an endless
           | stream of AI-curated short videos that you can swipe through
           | over and over.
        
             | Avamander wrote:
             | > It feels like all social media is converging; Snapchat,
             | Instagram, Facebook, Reddit, TikTok, YouTube, all an
             | endless stream of AI-curated short videos that you can
             | swipe through over and over.
             | 
             | It's likely that AI curation is the future because no
             | humans can sift through the vast amounts of data and
             | content being made. Things that can't keep up without
             | curation or with just human curation have died or will die.
             | 
             | Good AI curation can bring you the exact content you're
             | looking for; it can, but it won't always. I've seen it work
             | time and time again, but I've also noticed that you have to
             | be aware of the flaws of the tool to be able to use it, or
             | it gets really bad really quickly.
             | 
             | You can't let the AI take control, if it derails to content
             | you don't like you must know what it uses as a quality
             | signal and give it a thumbs down, if it is intentionally
             | derailed, you must stop using the platform.
             | 
             | TikTok recently released an update to their algorithm, it
             | ruined my FYP and replaced my content with inane videos
             | made by people nearby - hyperlocal garbage. The feedback
             | mechanisms given no longer work, before that update they
             | did.
             | 
             | I do think that even people being nostalgic here about the
             | "old internet" should try and learn how to turn AI curation
             | to their own advantage instead of just being sad and
             | nostalgic.
        
             | rchaud wrote:
             | I use third party Reddit apps that largely resemble RSS
             | feeds, and pull directly from the API. So you don't see any
             | of the new social media features. You don't even see ads!
             | 
             | The same is true for desktop, with the Reddit
             | Enhancement Suite browser extension. My Reddit has looked
             | largely the same for nearly 10 years!
        
             | dkarl wrote:
             | Some subreddits are decent. Use old reddit and go straight
             | to your subreddits so you never see the home page.
        
           | throwawayboise wrote:
           | Gemini protocol might be something here. But it's probably
           | too "techy" to get widespread traction.
        
             | krapp wrote:
             | The Gemini protocol will never gain widespread traction
             | because it's designed to appeal to a specific
             | tech-contrarian anti-modernist mindset with restrictions
             | that most people won't find appealing or useful.
             | 
             | And as soon as the mainstream knows about it, and it no
             | longer feels quirky and niche, it will be declared dead and
             | abandoned anyway.
        
           | mrunseen wrote:
           | I don't know about userbase but old.reddit with disabled
           | subreddit CSS is quite usable for me.
        
       | Seattle3503 wrote:
       | This reminded me of a story that hit the front page a few years
       | ago. Even if content sticks around and isn't modified, Google
       | will eventually forget it and you won't be able to find it
       | without a bookmark.
       | 
       | https://news.ycombinator.com/item?id=16153840
        
       | bobsmooth wrote:
       | For anyone that wants to help with this, check out the Archive
       | Team Warrior project. You can donate bandwidth and some CPU
       | cycles to archiving different parts of the web. There's a VM
       | image you can download that makes it really easy.
       | 
       | https://wiki.archiveteam.org/index.php/ArchiveTeam_Warrior
       | 
       | You can choose to help archive Reddit, Pastebin, URL shorteners
       | and other ephemeral parts of the internet.
       | 
       | https://wiki.archiveteam.org/index.php/Warrior_projects
       | 
       | I've also taken to updating the citations in Wikipedia articles
       | with archive links.
        
       | shoto_io wrote:
       | TL;DR
       | 
       | The second law of thermodynamics applies to the Internet as well.
        
       ___________________________________________________________________
       (page generated 2021-06-30 23:01 UTC)