[HN Gopher] How Websites Die
___________________________________________________________________
How Websites Die
Author : herbertl
Score : 75 points
Date : 2022-06-27 08:00 UTC (2 days ago)
(HTM) web link (notebook.wesleyac.com)
(TXT) w3m dump (notebook.wesleyac.com)
| sparcpile wrote:
| There have been some attempts to capture sites that became
| ghosts or disappeared completely. From the dotcom bubble
| through 2008, Steve Baldwin (co-author of Netslaves) had that as a
| side project.
|
| https://www.disobey.com/ghostsites/
|
| Our Incredible Journey still does this, with more snark and
| humor.
|
| https://ourincrediblejourney.tumblr.com/
| nonrandomstring wrote:
| A very nice read with some poignant thoughts and quotes on
| digital impermanence. Life on the Internet is nasty, brutish and
| short.
| paulgb wrote:
| > you may attempt to archive it, but should you wish to avoid
| sadness down the line, you should accept now in your heart that
| all archives will eventually succumb to the sands of time.
|
| Enjoyed this.
| bombcar wrote:
| You could etch your data and website onto copper plates and
| launch them into the depths of space; this will likely last
| nearly until the heat-death of the universe!
| asddubs wrote:
| and simultaneously it will be lost immediately
| saagarjha wrote:
| And here we see the difference between available and
| accessible :)
| [deleted]
| doodles33 wrote:
| Unstoppable domains solve this problem by making domain purchases
| permanent and stored on a decentralized blockchain - although
| that brings some problems of its own - and IPFS solves that by
| not requiring that a central server stays online to serve that
| content, although this does require that _someone_ be interested
| in serving the website at all.
| rchaud wrote:
| This article isn't about websites going offline. It's about
| websites dying of 'natural causes', meaning the author or site
| manager lost interest in updating and maintaining the site.
|
| There's much more to the soul of a website than the datacenter
| where it's hosted. I doubt if the handful of mirror.xyz
| 'decentralized' sites I see popping up will still exist in 5
| years' time.
| lazyjeff wrote:
| I've been working on this problem for a while. Website upkeep is
| hard to quantify, but basically every disk fails and every
| operating system eventually needs a serious upgrade. The
| timeframe that a system can run continuously is not that long
| compared to the timeframe that information is relevant. So the
| most lightweight way to keep something up and running is to make
| it trivial to port to many hosting configurations by simplifying
| the toolchain needed to rehost it. (Note that humans are part of
| that workflow, if it's a company)
|
| I've written a manifesto about making a commitment to keep
| websites online and maintained for 10-30 years, for people who
| are maintaining web content:
| https://jeffhuang.com/designed_to_last/
|
| And on the flipside (from a user's point of view), I've also been
| working on a background process that automatically captures full-
| resolution screenshots of every website you visit, creating your
| own searchable personal web archive: https://irchiver.com/
|
| I've personally been trying to make a commitment to keep my web
| projects and writing online for 30 years. My original internal
| goal when I started thinking about this, was to outlast all the
| content on Twitter, Google+, and facebook.com. One of those has
| already been met, kind of sadly.
| 10000truths wrote:
| It's not uncommon to hear of rack servers with several years of
| continuous uptime. I wouldn't be surprised if you could keep a
| website online for a decade without touching anything by using
| an LTS distro, enabling unattended upgrades, and running
| something like nginx.
| EddieDante wrote:
| > I've also been working on a background process that
| automatically captures full-resolution screenshots of every
| website you visit, creating your own searchable personal web
| archive:
|
| How are screenshots searchable? They aren't plain text. You
| can't grep them.
| anonymoushn wrote:
| The tool captures screenshots in addition to text.
| doodles33 wrote:
| That seems awfully similar to archive.org's Wayback Machine.
| I do like to see all these archival projects though, they
| are certainly worthwhile.
| lazyjeff wrote:
| irchiver captures text on the page, and separately OCRs
| the screenshots (specifically, the screenshot from your
| viewport). So you can search just what was shown on the
| page, or what was in the page. Both techniques have pros
| and cons.
|
| While archive.org is fantastic, it can only capture pages
| that are both 1) publicly accessible (i.e. no social
| media content) that it happens to crawl, and 2) static
| content (you're out of luck if the content you want is
| loaded dynamically, or changes depending on user input).
| Jaxan wrote:
| Why not, though? On macOS you can select text in bitmaps
| nowadays. So the tech is there to make a grep for pictures.
| SoftTalker wrote:
| I think it's less a technological problem and more just that
| everyone who used to care about that site or its content no
| longer does. Or the company behind it has gone out of business
| -- who is going to maintain a website for a defunct
| organization, and why would anyone want to?
|
| There's no obligation for a person to maintain anything longer
| than he wants to do it. Putting a blog online is not a lifetime
| commitment. Interests change, or you simply realize nobody
| much cares about your online musings, and you move on to other
| interests.
| gowld wrote:
| But what can you do when you _do_ care, to make your website
| as durable as a printed book?
| hypertele-Xii wrote:
| You could literally print it in a book. With links suffixed
| by a page number [915].
| jl6 wrote:
| Point archive.org at it (and make a donation).
| bombcar wrote:
| Depends on if you want it to survive _without_ you
| maintaining it. If so, something like a bog-standard
| WordPress blog _hosted by them_ might work until they
| decide they don't want to bother anymore.
|
| Otherwise, some setup using S3-as-a-website or GitHub Pages
| may work, but those also depend on the company maintaining
| that service.
|
| If your entire website can be dropped into a ZIP file and
| served anywhere, you have a greater chance of it surviving,
| especially if the internet archive got a copy at some
| point.
|
| But if you die and _nobody cares_ about the content,
| eventually it will disappear.
| ElectricalUnion wrote:
| > If your entire website can be dropped into a ZIP file
| and served anywhere, you have a greater chance of it
| surviving, especially if the internet archive got a copy
| at some point.
|
| Well, your entire website can be a zip:
| https://redbean.dev/
| SoftTalker wrote:
| Keep it as simple as possible. Static HTML that can be
| dropped on any web server.
|
| HTML/HTTP may some day be obsolete, but will likely be
| around longer than anything built on complicated JavaScript
| frameworks or tightly tied to current web browser
| technologies.
| dybber wrote:
| When a company goes out of business you will typically also
| try to sell off all its assets, and a good domain name might
| be such an asset.
| anonymoushn wrote:
| irchiver seems incredible. I hope there's a comparable product
| for other OSes one day.
| ryanfox wrote:
| I've been working on a very similar thing which runs on
| Windows, Mac, and Linux: https://apse.io
| jaytaylor wrote:
| > I've often thought about getting together with some friends to
| pay into a fund to house our websites after we die. I don't think
| setting that up would be too hard -- the math around insurance
| policies of this sort is quite simple -- I mostly haven't tried
| to set something like this up just since it's a pretty morbid
| ask. But, if you'd be interested, maybe reach out to me?
|
| > Our ghosts could live forever, if we help each other.
|
| I love this idea and would gladly assist in the effort, let's set
| it up :)
| fleddr wrote:
| I'd like to call out the somewhat related problem of website rot.
| Meaning, the website is online, it once worked perfectly, but
| becomes increasingly dysfunctional due to technical deprecations.
|
| The soft obligation to use HTTPS these days has deranked old
| HTTP-only websites in search, making them hard to find. These
| websites are also "defaced" with browser warnings, and some
| subresources may not load at all.
|
| Embedded maps no longer work, since Google regularly breaks their
| API.
|
| Facebook login or other FB plugins no longer work, since it needs
| a yearly checkup of your account and there's the new requirement
| of needing to have a privacy policy.
|
| Those are just some examples of websites partially breaking
| through no fault of their creators, if you'd agree that the web
| should be backwards-compatible.
| asddubs wrote:
| Also, even if those older http sites get a certificate, any
| embedded scripts that point at http URLs, even if those URLs
| are also available on https, will not load, and the page breaks.
| Especially hard to fix if you have user generated HTML on
| there.
|
| Same for http download links from https websites: they will not
| work anymore.
|
| Anything that used cookies on embedded resources will also be
| broken because of the missing SameSite attribute (same for third
| party cookies obviously).
|
| If Google really goes through with removing alert/prompt/etc
| from their browser like they said they were planning to a while
| back, so so many things will break.
| EamonnMR wrote:
| Notification prompt is 10x more annoying than alert ever was.
| [deleted]
| fleddr wrote:
| I'm uniquely bothered by this problem as I visit many such
| dysfunctional sites.
|
| I'm active in the (hobbyist) field of documenting species.
| There are thousands upon thousands of websites created by
| amateurs containing unique niche content. For example,
| somebody might have made it their lifelong hobby to document
| every species of bee in their territory.
|
| It's a fragmented mess of incompetently produced websites,
| but I find it incredibly charming and in the spirit of the
| original web. Above all, it is their content that has lasting
| value.
|
| The people behind it are good, generous. That's why it makes
| me so angry when their work is cast aside like this. Things
| not just technically breaking, in many cases simply
| disappearing altogether from search results.
| sshine wrote:
| My first domain "expired" in an unexpected way after 18 years. I
| got a .eu.org because I believed that eu.org would be more stable
| than a commercial provider. I used the same not-for-profit DNS
| provider until they were commercially acquired and the parent
| company shut down the old nameservers.
|
| Now I'm locked out: eu.org does not respond to inquiries, and my
| account predates the auth system. While my phone number is the
| same, auth reset does not work with phone.
|
| It would have been fun to retain the same domain forever, but
| stuff breaks, people die, and things crumble.
| spc476 wrote:
| It comes down to the person running the website _has_ to care.
| That's it. It doesn't matter how simple it is if the person
| doesn't care.
|
| In my own case, I've been running my own website for 24 years now
| [1]. The URLs I started out with have remained the same (although
| some have gone, and yes, I return 410 for those) and the
| technology hasn't changed much either (it was Apache 24 years
| ago, it's still Apache today; my blog engine [2] was a C-based
| CGI program, and it's still a C-based CGI program). The rest of
| the site is static, and there's no JavaScript (except for one
| page). I can see it lasting at least six more years, and probably
| more. But I care.
|
| [1] Started out on a physical server (an AMD 586) and a few years
| later on a virtual server.
|
| [2] https://github.com/spc476/mod_blog
| remus wrote:
| > It comes down to the person running the website has to care.
|
| Personally I think it is a little more nuanced, in particular I
| think the relationship between how much someone cares and how
| much effort is required to keep the website online is what
| matters.
|
| If your website is super simple you don't need to care about it
| very much (though you do need to care at least a little bit).
| On the other hand, if everyone who works at Google suddenly
| quit tomorrow because they didn't care, the house of cards would
| fall over very quickly, because it's a lot of work maintaining
| millions of servers.
___________________________________________________________________
(page generated 2022-06-29 23:00 UTC)