[HN Gopher] A million ways to die on the web
       ___________________________________________________________________
        
       A million ways to die on the web
        
       Author : hexage1814
       Score  : 212 points
       Date   : 2024-01-18 09:02 UTC (13 hours ago)
        
 (HTM) web link (wiki.archiveteam.org)
 (TXT) w3m dump (wiki.archiveteam.org)
        
       | hyperluz wrote:
       | My Yahoo e-mail account was Yahoo-ed: one day, I logged in to
       | find that Yahoo had deleted all my e-mails (some of them important
       | and others of sentimental value) due to account inactivity for a
       | new, arbitrary number of days that I was never informed about.
        
         | Erratic6576 wrote:
         | My GeoCities account was destroyed, putting an end to my career
         | as a webmaster. I developed a resentment towards that website.
        
           | AznHisoka wrote:
           | Incidentally my angel fire site is still up for some insane
           | reason
        
             | ssss11 wrote:
             | Omg I haven't heard the name Angel fire in YEARS
        
             | dandrew5 wrote:
             | Same! Coincidentally, I recently backed up my Angelfire
             | site from the late '90s. A lot of the original links were
             | missing but thankfully they provide a `sitemap.xml` and I
             | used HTTrack to make a local copy.
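
A minimal sketch of the sitemap step mentioned above, assuming the file follows the standard sitemaps.org 0.9 schema. The sample document and the `sitemap_urls` helper are hypothetical illustrations, not the commenter's actual workflow; the resulting URL list could then be handed to a mirroring tool.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org 0.9 schema.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text: str) -> list[str]:
    """Return every <loc> URL listed in a sitemap document, in order."""
    root = ET.fromstring(xml_text)
    locs = root.findall("sm:url/sm:loc", SITEMAP_NS)
    return [loc.text.strip() for loc in locs if loc.text]


# Hypothetical sample; a real file would be downloaded from the site.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/index.html</loc></url>
  <url><loc>https://example.com/links.html</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in sitemap_urls(SAMPLE):
        print(url)
```

Listing the URLs first makes it possible to spot pages that the original internal links no longer reach.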
        
         | tetris11 wrote:
         | I fear this for my gmail. I now use mbsync (lieer[0]) to have
         | my emails synced locally on my homeserver, and then browse it
         | with notmuch[1]. It's an incredibly freeing experience to have
         | all your email on your own machine.
         | 
         | 0: https://github.com/gauteh/lieer
         | 
         | 1: https://notmuch.readthedocs.io/en/latest/man1/notmuch.html
        
           | girishso wrote:
           | For the exact same reason I started using MacOS Mail app with
           | gmail, but I am realising now that it doesn't download "all"
           | the emails unfortunately. Lieer looks like a good option,
           | thank you.
        
             | tetris11 wrote:
             | It's a great tool, it's just agonizingly slow sometimes
             | because Google likes to throttle the connection when you
             | make big changes. The initial download especially is slow,
             | but the small changes thereafter are pretty fast.
        
       | throwaway167 wrote:
       | + Politically regulated out.
       | 
       | The political environment the site operates in turns hostile to
       | website content or method of publishing. The operators face costs
       | of compliance, loss of scope, or personal risk in continuing.
       | 
       | See: The UK Online Safety Bill [1] [2] and especially [3]
       | "Ofcom's >1,500 page consultation on the Online Safety Act 2023,
       | and why small companies don't have a chance"
       | 
       | [1] https://www.theverge.com/2023/10/26/23922397/uk-online-
       | safet...
       | 
       | [2] https://www.eff.org/deeplinks/2023/09/uk-government-knows-
       | ho...
       | 
       | [3] https://decoded.legal/blog/2023/11/ofcoms-%3E1500-page-
       | consu...
        
       | livrem wrote:
       | They forgot one: Natural disaster deleted all the data and there
       | was no backup.
       | 
       | I was a paying subscriber to a great little site called magweb
       | ~20 years ago. They scanned old and new (military) history and
       | (war) game magazines and posted HTML versions, with permission of
       | original publishers. It was really nice and explicitly allowed
       | users to print or save copies of articles. No DRM. Perfect
       | example of how a site for reading magazines online should work.
       | Everything just basic HTML that even worked great to read in
       | phone browsers 20 years ago.
       | 
       | Then Hurricane Katrina hit and the server that was apparently
       | running out of the owner's basement somewhere in that area was
       | flooded, with no working backups. I still have not found any
       | traces of most of the 40000+ articles, other than the few I had
       | saved, that used to be on that site. Since it was paywalled only
       | a tiny part of the site is available on the wayback machine.
       | 
       | https://web.archive.org/web/20050529083811/http://www.magweb...
       | 
       | (No, I can't imagine running a business and scanning magazines
       | for 9+ years and not making sure to have backups of everything.)
        
         | Hamuko wrote:
         | There was also the OVHcloud data centre fire in 2021, although
         | it's debatable if it'd fall under the same "natural disaster"
         | category.
         | 
         | https://www.datacenterdynamics.com/en/news/ovhcloud-ordered-...
        
           | achairapart wrote:
           | I know at least one startup that closed its doors because
           | they lost both production and backup data on that OVH data
           | center.
        
             | Snow_Falls wrote:
             | And that's why 3-2-1 specifies to have one _offsite_ backup
        
               | dspillett wrote:
               | Preferably with a completely different provider, in case
               | something takes out their entire network somehow (or they
               | go out of business).
               | 
               | But make sure you check where the other provider is
               | hosted: I read of one user (individual, not a startup or
               | other company) who had backups of stuff on a shared host
               | with a second shared host. It turned out that they were
               | both running on servers in that one OVH DC and they were
               | cheap hosts with no better backup/dr plan themselves...
        
               | pixl97 wrote:
               | https://xkcd.com/908/
        
               | NoZebra120vClip wrote:
               | I'm gonna Craigslist all your stuff!
               | https://xkcd.com/1150/
        
           | benxh wrote:
           | I personally was affected by this fire, although I've always
           | kept 3 month backups of production data, encrypted, on-site,
           | just in case of emergencies like this. Haven't touched their
           | services for anything production related ever since
        
         | gary_0 wrote:
         | Similarly: Terrorist attack. After 9/11 I remember that some
         | websites for WTC-based companies went offline, presumably
         | because they were hosted on computers in the building.
        
       | forgotmypw17 wrote:
       | There's also giving up in protest of a lawsuit, one of the oldest
       | causes:
       | 
       | >Altern.org is a free web hosting service created in 1992 by
       | Valentin Lacambre and disappeared in 2000. From its origins to
       | the closure, Valentin Lacambre, a pioneer of Free Internet in
       | France, had to permanently close the free hosting service in
       | early July 2000 following numerous lawsuits. This closure was due
       | to the laws of the time, which placed the delicate obligation on
       | hosts to act as judge, censor, and, by default, guilty, as it was
       | deemed difficult and contrary to his principles to control the
       | 21,893 sites that existed on Altern.org at the time of closure.
       | 
       | http://yavista.com/98/1f/981fa5fe.html
       | 
       | https://fr.wikipedia.org/wiki/Altern
       | 
       | (Valentin Lacambre went on to be a co-founder of Gandi.net.)
        
         | diggan wrote:
         | > (Valentin Lacambre went on to be a co-founder of Gandi.net.)
         | 
         | Gandi.net which recently (last year) was purchased by some
         | other corporate entity and subsequently raised all the prices +
         | made previously free features paid. I only mention this as I
         | discovered this week that my Gandi bill suddenly got a lot
         | bigger.
        
           | guitarlimeo wrote:
           | Yeah I'm in the process of migrating away because of it.
           | Sigh. Gandi had a good run.
        
             | diggan wrote:
             | Same here :/ Saw the writing on the wall when the previous
             | purchase/sale happened to some investment company or
             | whatever it was, but thought I could hold off my migration
             | for a while longer...
        
         | profsummergig wrote:
         | Tigerdirect comes to mind.
        
       | inopinatus wrote:
       | There is a tale - perhaps apocryphal - handed down between
       | generations of AWS staff, of a customer that was all-in on spot
       | instances, until one day the price and availability of their
       | preferred instances took an unfortunate turn, which is to say,
       | all their stuff went away, including most dramatically the
       | customer data that was on the instance storages, and including
       | the replicas that had been mistakenly presumed a backstop against
       | instance loss, and sadly - but not surprisingly - this was pretty
       | much terminal for their startup.
        
         | gonzo41 wrote:
         | That's like a gamma ray burst hitting the planet and just
         | vaporizing it. So unlucky. But obviously fate had other ideas.
        
           | sidibe wrote:
           | Not really. With spot instances it was just waiting to happen
           | on a day with more demand than usual. That is 100% expected.
        
           | sethammons wrote:
           | More like a balancing rock statue in an area that can have
           | earthquakes
        
         | HPsquared wrote:
         | Common mode failure.
        
         | BossingAround wrote:
         | How..? Was there no local code on any of the dev machines? No
         | git? I'm asking because for example if Github is vaporized
         | today, my product would lose roughly a day or two's worth of
         | work, since we have like 30 computers having a repository copy.
         | 
         | Of course redeploying every single thing would not be seamless
         | because of course, there might be some configuration stored in
         | services, or something similar, but I'd say that ~90% of our
         | automation is stored in Git.
        
           | Jasp3r wrote:
           | They lost customer data, not source code. You shouldn't have
           | a local copy of all user data on your machine.
        
             | appplication wrote:
             | But my customers enjoy the personal touch of me manually
             | editing their SSN in my local MS Access database.
        
               | taneq wrote:
               | I mean, after all, if you don't have your own copy of the
               | MS Access database then when your team scales beyond
               | about 5 people that database is going to get harder to
               | access. So really everyone should have a copy of all
               | important PII. :P
        
             | pixl97 wrote:
             | If something is on the cloud does 3-2-1 backup stop
             | applying?
        
               | hawski wrote:
               | I would also like to know the answer. Would it be a good
               | idea for the company to keep _encrypted_ backups on their
               | machines/HDDs? Not a laptop somewhere, but something just
               | a bit more involved.
        
               | jareklupinski wrote:
               | i think for company-critical databases, the best you can
               | do without invoking a terrible headache for your security
               | officer is going multi-cloud: one big tech cloud, and one
               | smaller firm that is completely disconnected from the
               | other one
               | 
               | maybe they could even use a relatively inexpensive
               | colo/baremetal provider to simply mirror the bigtech
               | deployment on a smaller scale (would need to be quite
               | flexible/vendor-agnostic to make that work...)
        
               | ianburrell wrote:
               | It would make sense to keep a backup on a hard drive stored
               | in a safe in the office. Doing it weekly would be reasonable,
               | but you would have to accept losing up to a week's worth of
               | data.
               | 
               | The main problem is that it would outgrow a single hard
               | drive, so you would need a NAS. Also, the transfer speed
               | could be an issue as the database gets bigger. Even if you
               | don't store all customer data, it does make sense to store
               | all the configuration, keys, and secrets.
        
               | rcxdude wrote:
               | Yes. Having a copy you can "touch" is important. At the
               | absolute minimum you should have it on another cloud
               | service.
        
               | tehlike wrote:
               | You can still do off-site backup to another cloud.
        
               | drewzero1 wrote:
               | Only if you're willing to stake your company's digital
               | existence on the reliability of another company's cloud
               | service.
               | 
               | If anything, it increases the need for 3-2-1 backups: the
               | original copy of all of your files is on somebody else's
               | computer that you have no control over. Hopefully they're
               | keeping it backed up, and hopefully they don't go belly
               | up and pull the plug all of a sudden. So you can use a
               | primary backup in another cloud service from another
               | company that hopefully won't kill their product at the
               | same time as the other one (again, you have very little
               | knowledge or control of the way they run their data
               | center). Ultimately, it's a good idea to have a copy of
               | your data that you have control over, maybe in a big
               | drive (or set of drives, tapes, etc) in the safe, rotated
               | daily/weekly/however long your company can cope with
               | losing in a major SHTF situation.
               | 
               | Excessive? Maybe. For what it's worth my shop is locally
               | hosted with both local and cloud backups. I have never
               | regretted having at least one backup of anything and it's
               | saved my bacon (or my coworkers', boss', etc.) a number
               | of times. I've been fortunate to never need to rely on a
               | secondary backup, but I sure wouldn't bet the company on
               | it.
        
             | BossingAround wrote:
             | Ah that makes more sense, I can't read. I thought that the
             | project stopped working altogether, hence the startup was
             | finished. I didn't realize it meant that they simply lost
             | enough customers to go under.
        
               | halostatue wrote:
               | A company's source code is mostly valueless. A company's
               | customer data is priceless.
               | 
               | As Fred Brooks said in Mythical Man-Month: Show me your
               | flowcharts and conceal your tables, and I shall continue
               | to be mystified. Show me your tables, and I won't usually
               | need your flowcharts; they'll be obvious.
        
         | pilotneko wrote:
         | Crazy, but even AWS resources are not unlimited. In 2023, I
         | experienced multiple days where g4dn instances were not
         | available in us-east-1 (in any AZ).
        
           | jethro_tell wrote:
           | Heh, they still live in the real world where they have to
           | order or build servers, put them in a rack, and configure them to
           | be added to the pool.
           | 
           | At their scale lots of their stuff is custom and needs to be
           | ordered at least 18 months in advance.
           | 
           | The fact that they can do capacity planning 2/3 years in
           | advance and have very limited misses in a way that people are
           | astonished that they have capacity misses is a testament to
           | how good they are at it.
        
             | glitchcrab wrote:
             | I think a lot of people who have never really used anything
             | other than a major cloud provider don't really understand
             | what goes on behind the scenes. I was a sysadmin at a
             | hosting provider and even though we were magnitudes smaller
             | than AWS, we still saw a lot of the same issues, just on a
             | smaller scale.
        
         | darkwater wrote:
         | But hey, think of the money they saved before that!
        
         | Thorrez wrote:
         | The third largest bitcoin exchange made a change to their RAM
         | settings in EC2. This shut down the machines, wiping out the
         | hard drive and RAM. Their wallet was stored there. They lost
         | everything.
         | 
         | https://siliconangle.com/2011/08/01/third-largest-bitcoin-ex...
        
           | lnrd wrote:
           | Third largest bitcoin exchange...in 2011.
        
             | Synaesthesia wrote:
             | Well the bitcoin supply is limited, so it's quite a big
             | loss.
        
               | stavros wrote:
               | For them! Due to the way the economy works, destroying
               | money is actually a donation to everyone else who has
               | money (deflation).
        
               | mjburgess wrote:
               | Still hard to imagine this wasn't just laughed out of
               | the room immediately
        
             | paulddraper wrote:
             | ...with 17,000 BTC.
             | 
             | (Valued at 220k USD then, or 700M USD today)
        
           | lifestyleguru wrote:
           | So funny reading comments from that era on the articles about
           | this incident. Some say they were lucky because they were
           | using Mt. Gox! Another observation: at that time bitcoin was
           | perceived by many as, and had a reputation of, in-game purchases.
        
         | profsummergig wrote:
         | Were they paying for said spot instances? If so, that's mostly
         | on AWS, not the client. (Unless AWS explicitly says in its TOU
         | that instances can be taken away instantly due to
         | "availability" issues. Which IMHO would be a suicidal policy
         | for AWS to have.)
        
           | dilyevsky wrote:
           | It is not suicidal and it is in fact what it says. You get
           | 120s notice and there's no capacity SLA even for on-demand or
           | so-called "reserved instances". You need to set up
           | https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-capa...
           | to get a guarantee and they will obviously bill you for it even
           | if you're not using it.
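
A rough sketch of how a workload might measure the time left once such a notice arrives. It assumes the notice is the small JSON document EC2 exposes through the instance metadata service, with an `action` and a UTC `time` field; the sample body, the `NOTICE_URL` constant, and the helper name are illustrative assumptions, and nothing is fetched over the network here.

```python
import json
from datetime import datetime, timezone

# Assumed location of the notice on the instance metadata service.
# It is only named for reference; this sketch never requests it.
NOTICE_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"


def seconds_until_interruption(notice_json: str, now: datetime) -> float:
    """Seconds left before the action named in an interruption notice."""
    notice = json.loads(notice_json)
    # The time field is assumed to be UTC, e.g. 2024-01-18T09:04:00Z.
    deadline = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    deadline = deadline.replace(tzinfo=timezone.utc)
    return (deadline - now).total_seconds()


# Illustrative notice body, not captured from a real instance.
SAMPLE_NOTICE = '{"action": "terminate", "time": "2024-01-18T09:04:00Z"}'

if __name__ == "__main__":
    now = datetime(2024, 1, 18, 9, 2, 0, tzinfo=timezone.utc)
    print(seconds_until_interruption(SAMPLE_NOTICE, now))
```

Two minutes is enough to drain connections or flush a checkpoint, but not to copy a large local dataset elsewhere, which is why data that lives only on instance storage is at risk.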
        
         | dilyevsky wrote:
         | Having all your customers' data on instances' local drives
         | (sounds like they didn't use EBS) with no backups sounds pretty
         | dumb, spot instances or not. Those weren't serious people.
        
       | omnibrain wrote:
       | I hosted my first personal website and my Star Treck (sic!)
       | fansite on Tripod. At some point in time they also had a data
       | loss and one of the two (along with a lot of other sites) was gone.
       | 
       | This is not mentioned here
       | https://wiki.archiveteam.org/index.php/Tripod because I think the
       | event predates Archive Team's formation.
        
         | 101008 wrote:
         | Tripod and Lycos were great for kids who were learning HTML and
         | wanted to have their own website. I remember the annoying
         | popups and then banner frames, and how I tried to copy and
         | paste a lot of JS to remove them (so unfair for them, they were
         | providing a free service!)
        
       | weinzierl wrote:
       | _" Associated data"_ is a point I would not have immediately
       | thought of. As data becomes more and more connected and services
       | consolidated this becomes more important to consider.
        
       | zoobab wrote:
       | Losing memory is an issue, the question is how to maintain the
       | archive?
        
         | k__ wrote:
         | There are a few solutions:
         | 
         | https://www.arweave.org/
         | 
         | https://www.lighthouse.storage/
         | 
         | To me, they seem like the most useful stuff coming out of the
         | blockchain industry.
        
         | hexage1814 wrote:
         | Tough truth? By preserving it yourself.
        
         | pabs3 wrote:
         | Have ArchiveTeam save it to archive.org and donate to keep the
         | service alive.
        
       | RamblingCTO wrote:
       | Wait giphy was bought by meta? What a shame! And they don't sell
       | it off, as they're supposed to? Why doesn't it surprise me? A lot
       | of the web feels like it's being destroyed by FANG. And Andreessen
       | goes on claiming that the market prevents monopolies ...
        
         | blatherard wrote:
         | Meta bought Giphy in 2020, then the UK CMA ruled that it was
         | anti-competitive, so Meta sold Giphy to Shutterstock last year.
        
           | RamblingCTO wrote:
           | You're right! I read (in the original article?) that it
           | wasn't completed yet, but it was:
           | 
           | > The acquisition was completed on June 23, 2023.
        
         | rvnx wrote:
         | VCs have successful companies (which generally IPO) on one
         | hand, and losers on the other.
         | 
         | They can use their influence on the people who have voting
         | rights to actually push the hot potato to public investors.
         | 
         | That's one big reason why you see public companies purchasing
         | somewhat useless companies, or acquihiring them at insane
         | valuations.
        
       | fauigerzigerk wrote:
       | It's a shame that none of the substitutes for DNS seem to have gained
       | any traction so far.
       | 
       | Renting names that serve as resource identifiers, locators and
       | trademarks all at the same time is just not a good idea.
        
       | dirkc wrote:
       | I'd add "maintenance burden". I have a few old properties I'm
       | responsible for that used to be web apps, one is archived as
       | static HTML, the other one still exists as an outdated web app.
       | Every other week I receive a request to delete user data. At some
       | point it might make sense to pull the plug on everything :(
        
         | pabs3 wrote:
         | Please do let ArchiveTeam know before you pull the plug so at
         | least the public data can be saved.
        
       | thih9 wrote:
       | Digged - via website redesign:
       | https://en.wikipedia.org/wiki/Digg#Redesign
        
         | rvba wrote:
         | Interesting that the article does not show digg, which killed
         | itself.
         | 
         | On a side note, the current form of digg is interesting, yet
         | somehow poorly done. Go there, sort by year: get only things
         | from 2024, since no way to get 2023 or a rolling year. Not to
         | mention being able to choose all or certain time periods
        
           | probably_wrong wrote:
           | The current Digg is doing its best to destroy the goodwill
           | they collected with the slightly-less-current Digg.
           | 
           | I started going there regularly about a year and a half ago,
           | mostly because they would feature articles that I wouldn't
           | find in my other typical websites. But in the last six months
           | they have been optimizing to death, cutting anything that's
           | not an instant success and publishing variant after variant
           | of anything that's mildly successful (yay, another article on
           | Twitter memes!). Clickbait titles are there too, along with a
           | new-ish comment section that's 90% spam.
           | 
           | It's been a sobering lesson on what happens when you put
           | growth above everything.
        
           | DamnInteresting wrote:
           | On another side note: I have been running an interesting link
           | recommendation feed for many years now, and evidently in
           | ~2018 someone curating links for Digg found my feed and
           | started relying on it heavily to populate Digg's front page.
           | It went on for months. Some days as much as half of the links
           | I posted would subsequently show up on Digg, including links
           | to unusual and old content (which was a strong signal that
           | the overlap was no mere coincidence).
           | 
           | In 2019 Digg posted a job listing for a links curator, and I
           | cheekily applied, noting that I'm already doing the job
           | anyway, so they might as well pay me for it. They didn't take
           | me up on it, but like magic, the poaching went away.
        
             | rvba wrote:
             | How do you even make such a list nowadays? In the past you
             | could take stuff from forums, but now forums are dead.
             | Facebook is trash, so apart from known sources (is RSS
             | dead?), organically finding new stuff sounds hard. Apart from
             | maybe copying from reddit / digg / wykop / some Spanish
             | reddit equivalent...
        
               | DamnInteresting wrote:
               | A few years ago I made a tool that fetches content from a
               | big list of primary sources (via RSS or HTML), and pushes
               | each link it finds through filters (keyword blacklist,
               | duplicate check, etc). I made a UI that lets me accept or
               | reject links Tinder style, and when I have a scrap of
               | time to fill, I assess a few links.
               | 
               | I also have a small group of well-read friends who make
               | an effort to send me stuff, that helps a lot too.
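
The filtering stage described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the blacklist words, the URL normalization rule, and the function names are all hypothetical, since the actual tool's details are not given.

```python
from urllib.parse import urlsplit

# Hypothetical blacklist; the real tool's keywords are not known.
BLACKLIST = {"sponsored", "giveaway"}


def normalize(url: str) -> str:
    """Reduce a URL to host + path so trivial variants compare equal."""
    parts = urlsplit(url.strip())
    return parts.netloc.lower() + parts.path.rstrip("/")


def filter_links(links, seen=None):
    """Yield (title, url) pairs that pass the keyword and duplicate checks."""
    seen = set() if seen is None else seen
    for title, url in links:
        if set(title.lower().split()) & BLACKLIST:
            continue  # keyword blacklist: drop titles with a banned word
        key = normalize(url)
        if key in seen:
            continue  # duplicate check: this page was already offered
        seen.add(key)
        yield title, url


# Made-up candidates standing in for items pulled from RSS or HTML sources.
CANDIDATES = [
    ("The lost magazine archive", "https://example.com/magweb"),
    ("Sponsored giveaway of the week", "https://example.com/ad"),
    ("The lost magazine archive (repost)", "https://EXAMPLE.com/magweb/"),
]

if __name__ == "__main__":
    for title, url in filter_links(CANDIDATES):
        print(title, url)
```

Passing a persistent `seen` set between runs would keep previously reviewed links from resurfacing; whatever survives the filters would then go to the manual accept/reject step.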
        
         | hamolton wrote:
         | Probably would categorize this under "teh futurez!1!" in the
         | original article
        
       | doublerabbit wrote:
       | Suppose not full kill, but moderation
       | 
       | Habbo Hotel: It was marked that sexual predators were using the
       | service to groom and instead of tackling the issue, they applied
       | a worldwide mute. You could walk but not talk.
       | 
       | Habbo still exists, but almost killed the whole fanbase.
       | https://en.wikipedia.org/wiki/Habbo#Moderation
       | 
       | And that, you required Shockwave.
        
       | ChrisMarshallNY wrote:
       | There's also "Sea Change," where a Web site that was one way,
       | changes to become another way.
       | 
       | An example is BBSpot.com. It's still up and going, but very
       | different from many years ago.
       | 
       | I miss SatireWire. I think it's dead now (ERROR ESTABLISHING A
       | DATABASE CONNECTION).
        
         | flyinghamster wrote:
         | That happened to mp3.com as well - it was once a Bandcamp-like
         | site, until it was acquired by CBS Interactive after a
         | disastrous music locker service got it nuked from orbit by the
         | RIAA. It's now, for all practical purposes, a parked domain.
         | 
         | I still have a CD-ROM from them (back then, everybody and his
         | brother published a CD-ROM).
        
       | daniel31x13 wrote:
       | This is one of the main reasons I created Linkwarden - an open-
       | source collaborative bookmark manager to collect, organize and
       | preserve webpages:
       | 
       | GitHub: https://github.com/linkwarden/linkwarden
       | 
       | Website: https://linkwarden.app
        
       | qingcharles wrote:
       | I'm frustrated by the fact there are zero archives out there of
       | TwitterX, Instagram or Facebook. Even big brands have shuttered
       | their accounts and now none of their content exists anywhere any
       | longer.
        
       | 1970-01-01 wrote:
       | Browsing into my 'Dead Bookmarks' folder, most of the links were
       | either websites without enough funding to keep the servers
       | online, acquired startups, or streaming services that were killed
       | off by the giants' lawyers. Bash.org is the latest to experience
       | the drag and drop of death.
        
         | archerx wrote:
         | Oh no! I just lost CG Society and now bash.org is dead? I feel
         | like all the places teenage me used to hang out are all dying
         | out slowly and it's sad because in those days we believed
         | everything on the internet is forever.
        
           | drewzero1 wrote:
           | The stuff you want to delete from the internet (embarrassing
           | photos, bad takes from a decade ago) are forever, but the
           | ones you want to keep (great hangouts, cool personal
           | webpages) are fleeting.
        
             | declaredapple wrote:
             | This is the difference between security vs archival
             | perspectives.
             | 
             | For security you should assume someone recorded it
             | indefinitely.
             | 
             | For archival you should assume nobody recorded it including
             | the original creator.
        
               | drewzero1 wrote:
               | I wish I could've put it so well myself!
        
       | NanoYohaneTSU wrote:
       | Look inside A Million Ways To Die. There isn't a Million Ways To
       | Die.
        
       | profsummergig wrote:
       | > teh futurez!1!
       | 
       | It's a sad list. Even Google, after it acquired YouTube, forced
       | me to change my YouTube login to a gmail account.
        
       | urbandw311er wrote:
       | This should be renamed "ten ways to die on the web"
        
       ___________________________________________________________________
       (page generated 2024-01-18 23:01 UTC)