[HN Gopher] Is stuff online worth saving?
___________________________________________________________________
Is stuff online worth saving?
Author : Brajeshwar
Score : 48 points
Date : 2024-12-17 14:31 UTC (4 days ago)
(HTM) web link (rubenerd.com)
(TXT) w3m dump (rubenerd.com)
| pabs3 wrote:
| If you're interested in that sort of thing, come hang out with
| ArchiveTeam:
|
| https://wiki.archiveteam.org/
| underseacables wrote:
| _I suppose it comes down to what the purpose of such archiving
| is._
|
| I think it's the preservation of information, but I also believe
| 90% is absolutely pointless. There is just so much of it, and
| data storage so cheap, that it makes sense to just save
| everything.
| sigio wrote:
| Well... storage is cheap, but not cheap enough to save
| everything, with just usenet being in the 400TB/day range these
| days. Sure, it's cheap enough to save every webpage you visit
| during your life, but probably not cheap enough to save every
| video you click on youtube or watch on a streaming-service, and
| all the music you listen to all day.
|
| Though just the music compressed in opus at 128kbit might work
| ok, 60 years of 24/7 128kbit is 30TB, so that would fit on 1
| large HDD currently.
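The 30TB estimate above can be checked with a little arithmetic. This is a rough sketch, assuming 1 kbit = 1000 bits, a 365.25-day year, and decimal terabytes (10^12 bytes); the function name is made up for illustration.

```python
# Back-of-the-envelope check of "60 years of 24/7 audio at 128 kbit/s".
# Assumptions: 1 kbit = 1000 bits, 1 year = 365.25 days, 1 TB = 10**12 bytes.
BITRATE_BITS_PER_SECOND = 128_000
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def audio_storage_terabytes(years, bitrate_bps=BITRATE_BITS_PER_SECOND):
    """Return the storage needed for continuous audio, in decimal terabytes."""
    total_bits = bitrate_bps * SECONDS_PER_YEAR * years
    total_bytes = total_bits / 8
    return total_bytes / 1e12


if __name__ == "__main__":
    print(f"{audio_storage_terabytes(60):.1f} TB")  # prints "30.3 TB"
```

Under these assumptions the figure comes out at roughly 30.3 TB, in line with the comment's estimate.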
| saulpw wrote:
| Music is actually an ideal candidate. I don't listen to music
| all day, and when I do listen to it, it's often something
| I've listened to before. My current collection is about 200GB
| and that includes a ton of stuff I've never listened to; it
| seems reasonable that a full life's worth of music could fit
| in 1TB, easily.
| dreamcompiler wrote:
| That data storage is also ephemeral. None of it will last as
| long as a paper note, unless some human goes to the trouble of
| copying it all onto new drives with new software every ten
| years or so.
| Atreiden wrote:
| With a proper NAS and RAID10 for mirrored redundancy, it's a bit
| like the Ship of Theseus. Just keep swapping out drives when they
| become unhealthy and you never have to rebuild or migrate.
| ninalanyon wrote:
| Eventually the controller will die and eventually
| compatible ones will no longer be produced or will at least
| be inconvenient to obtain or commission and hence
| expensive.
|
| Paper lasts for centuries without any attention beyond
| keeping it moderately dry and away from things that eat it.
| emptiestplace wrote:
| No sane person uses hardware RAID in 2024, if that's what
| you're referring to.
| zamadatix wrote:
| Whether you're using hardware RAID or not you still need
| a hardware storage controller of some type which accepts
| the new disks you can buy and works with the NAS. What
| they are saying is eventually that'll be more $ and time
| than just migrating off the system would be. From ENIAC
| to now could fit in one lifespan, would you still be
| maintaining a home floppy drive backup system in the
| 2040s or just save the time and effort with a migration?
| danielbln wrote:
| Data rots though, you can't just save it once and be done with
| it. You have to migrate it across storage mediums, formats etc.
| It's a recurrent effort/cost.
| bdhcuidbebe wrote:
| More planning for less effort.
|
| Do your research first. Use standards
|
| Eg: html, pdf, h264/h265/av1 in mp4 container, chd, zip and
| so on depending on what you are storing.
| HeatrayEnjoyer wrote:
| On what physical medium?
|
| I have 1 terabyte of data in 1860, how do I make sure the
| storage medium is still intact in 2024?
| JKCalhoun wrote:
| One hundred twenty-three years ago my great grandmother's first
| husband died in a hotel in Kansas City from asphyxiation from the
| gas having been left on over night (the hotel did not yet have
| electric lighting). A letter was hastily written on a piece of
| hotel stationery to be delivered to his wife in the neighboring
| farming community where she lived.
|
| It is fortunate to me that someone thought to hang on to that
| note since I have become interested in genealogy and this was a
| fairly significant event in family history (had he not died I
| don't suppose I would be around since it was her second marriage
| that gave me my grandfather).
|
| I long for scraps of _anything_ that my dead relatives wrote,
| created, etc. It connects me better to the past -- the lives they
| lived, how they lived them. It somehow grounds me a little better
| ... well, it's rather hard to explain the draw of genealogy.
|
| Sadly very little of the ephemera of everyday life was kept. I
| get it. It might have seemed like hanging on to junk mail -- like
| you were a hoarder or whatever, but in this digital era we should
| be able to hold terabytes of what may appear to be ephemera.
|
| I'm doing what I can - not for ego, I think, but for future
| generations that may find a connection to their past interesting.
| willis936 wrote:
| 30 years ago there was no digital world. Nearly all information
| was in physical artifacts. The things worth saving haven't
| really changed, but the amount of noise they are buried in has.
| Imagine if that letter was kept in a two ton pile of ad fliers.
| Sure, someone would find some of those fliers interesting, but
| you'd have been much less likely to even know about the letter.
| palmfacehn wrote:
| >...a two ton pile of ad fliers
|
| Alamy is selling scans of ad prints from the 1850s.
|
| https://www.alamy.com/stock-photo/1850s-advert.html
| chgs wrote:
| Because they are rare
| chefandy wrote:
| I don't think that's true? Tons of stuff from that era
| had been digitized, even before newer more relevant stuff
| and older rarer stuff, because the acid paper had a short
| shelf life and there were so many ads in printed stuff
| then. I might have a skewed perspective from working in
| the digitization world for quite some time. I think
| they're selling what they sell with all their other
| content-- discovery, curation, preparation, and easy
| delivery.
| chefandy wrote:
| Ads range from a (necessary, in a capitalist society)
| nuisance to a scourge, and people justly put up
| increasingly thick boundaries to shield themselves from
| their influence. When waning cultural relevance or whatever
| dilutes that influence, you can more easily see the ads for
| what they are-- often manipulative marketing tactics
| implemented through often genuinely beautiful art and
| design. Both aspects are fascinating to consider and the
| art can be quite enjoyable. Early modernist posters from
| Paris are _beautiful_. Watching collections of mid century
| television ads in the prelinger archives is fun, and tells
| us a lot about the ways we are influenced by modern ads
| speaking to current perspectives, fashions, and concerns.
| zamadatix wrote:
| A selection of 74 items over a 10 year period is a different
| proposal compared to e.g. keeping two tons of ad fliers
| from November 17th 1907 (and every other thing, every other
| day, all the time).
| qwertox wrote:
| What about robots reading each flier and checking if
| something is odd about that particular one? It could find the
| letter and report it to you. Even easier if it was all
| digital information.
| jonhohle wrote:
| An aside about ad spam from companies that I occasionally buy
| from:
|
| Often ad spam comes from the same mailbox as order receipts
| and includes words like "order" while messages with receipts
| never include the word "receipt". When inundated with daily
| or sometimes multiple times a day ad spam from the same
| company it becomes very difficult to filter out everything
| except the receipts when cleaning a neglected inbox.
|
| After I'm gone, I fully expect my family just to delete it
| all because the signal to noise is so low.
| sdenton4 wrote:
| Sorting through twenty years of spammy email is one of
| those things that seem like an llm would actually be good
| for.
| justsomehnguy wrote:
| I don't have anyone to do anything after I'm gone, so I
| just delete the emails myself. I do keep the notable ones,
| like registration information and _some_ payment receipts
| but otherwise everything goes to the trash.
|
| Bonus points:
|
| I don't need a 30/50/100GB mailbox (and the associated
| mailbox cost nowadays).
|
| Search is not only fast but if I didn't find something -
| then there is nothing of this something in the mailbox.
|
| It's mentally pleasurable to log in once in a while and
| throw a bunch of _unneeded stuff_ into the trash bin, quite
| similar to a real life room cleaning.
| eesmith wrote:
| A two-ton pile of ad fliers? Sounds like Ted Nelson's Junk
| Mail collection,
| https://archive.org/details/tednelsonjunkmail .
| bongodongobob wrote:
| If only we had search algorithms...
| kerkeslager wrote:
| Sure, there are a ton of reasons to archive. And if it's cheap
| to do (in terms of money, yes, but also in terms of time,
| effort, mental health, etc.) then I am of the mind that we
| should archive _everything_.
|
| But, it often _isn't_ cheap to do, and in that case, it makes
| sense to prioritize. The high priority items for me are the
| things that I might want to share, the ideas I want to amplify
| for my contemporaries and future generations that might examine
| my life. Stuff like [1] [2] and [3] which has influenced my
| thinking fundamentally, that I hope to build upon so that
| others can build upon what I have built.
|
| I'd argue that you do this intuitively: you're mentioning a
| letter from your family's past _because it is a high priority
| item_ --it's relevant because it was the last written words of
| your great-grandmother's first husband.
|
| But, there's a lot that _isn't_ worth keeping. My first form
| of archiving as a teenager was keeping ticket stubs for movies
| and concerts--a decade later I was going through my pile and
| found that I didn't even remember most of them. The better
| movies, I remembered--and I had them on DVD. The better
| concerts, I remembered--and I also had journal entries and CDs
| to remember the experience and the music. It's not important to
| me where/when I saw _Everything, Everywhere, All At Once_ in
| theaters, but I have it on DVD and I _can't wait_ to show it
| to my niece when she's older. And sure, I saw Amigo the Devil
| live, but frankly, he's not an artist you need to see in
| concert--the greatest impact of _Cocaine and Abel_ [4] on me
| was when I listened to it alone in my room. The ticket stubs
| simply don't matter to me.
|
| [1]
| https://www.viridiandesign.org/notes/451-500/the_last_viridi...
|
| [2]
| https://www.ted.com/talks/brene_brown_the_power_of_vulnerabi...
|
| [3] https://digital.wpi.edu/pdfviewer/wm117p10z
|
| [4] https://www.youtube.com/watch?v=ZzjtLm0G49E
|
| EDIT: All the things linked above, I have backed up in one form
| or another. Notably, the Schutt paper isn't at its original
| URL.
| asimpletune wrote:
| It reminds me of the cool links page I see now and then.
| mxuribe wrote:
| [delayed]
| smitelli wrote:
| > I got a picture of my great grandfather, thing took six hours
| to take your picture. [...] Every guy had one picture back then.
| And it's just him like, "[grimacing] I gotta get back, feed them
| hogs!" Now, in the future of course it'll be different. 50 years
| from now, people will be going like, "Hey! You wanna see a
| hundred thousand pictures of my great grandfather? I got 'em
| right here plus everything he did every day of his life." --Norm
| Macdonald[1]
|
| There is certainly a quantity of stuff online that is absolutely
| worth saving, but there's a considerably larger proportion that's
| just redundant to the point of being unremarkable and pointless.
| The trick is filtering, which can be capital-H Hard. That's why
| some may want to err on the side of over-collecting to reduce the
| possibility of missing something that will actually be important
| someday.
|
| [1]: https://www.youtube.com/watch?v=sY6SjMITHrQ
| nytesky wrote:
| Another funny take from MacFarlane
|
| Definitely no smiling:
|
| https://youtu.be/8SslNMLO0tw
| diggan wrote:
| Yeah, this is a good point. Isn't it better we save too much,
| as tooling for filtering stuff out will always get better,
| rather than saving too little? The latter has no workaround
| (today at least).
| nilamo wrote:
| Personally, I like that the internet is ephemeral. It matches
| real life in that way. I would rather see the internet as a means
| of connecting people over large distances (across space, Mars,
| etc), maintaining 20,000 copies of every irrelevant thing is just
| silly.
| qwertox wrote:
| > Personally, I like that the internet is ephemeral.
|
| It is not. It is only for us normal people. But the companies
| which log our lives in order to then capitalize on it, for them
| the internet is not ephemeral. They have copies of videos,
| pages, podcasts, whatever it is that can be found there.
|
| Why would you want those companies to know more about yourself
| than you do?
| zamadatix wrote:
| Archive.org or Google can cache more of the internet than I
| do while still having the majority of the content be
| ephemeral.
|
| I'd also hazard to guess most people in this camp would want
| these companies to also not store these things the same as
| they don't want people to.
| lxgr wrote:
| The problem is that not everything it has replaced was
| originally ephemeral.
|
| In a way, the Internet is both too ephemeral (self-hosted blogs
| disappear, Youtube videos get taken down) and too persistent at
| the same time; I don't think that most Twitter posts of non-
| public figures would need to remain public forever by default,
| for example, and I don't think I need to mention various data
| breaches.
|
| The Internet Archive somewhat mitigates the first issue, but it
| makes me pretty nervous that there's essentially just one
| organization doing what used to be much more distributed to
| various physical libraries.
|
| For the second one, I hope we'll see better solutions (both
| technical and social) as the technology and our interactions
| with it mature.
| paulpauper wrote:
| Digital storage is free; yes, save it all
| lxgr wrote:
| Please do share where I can reliably store my backups for free!
| fragmede wrote:
| > Backups are for wimps. Real men upload their data to an FTP
| site and have everyone else mirror it.
|
| -- Linus Torvalds
| LinuxBender wrote:
| This does still happen. Microsoft may nuke a git repo and
| someone has to figure out who has the latest version of the
| entire repo with all the latest commits of every branch.
| theandrewbailey wrote:
| The vast majority of people aren't privileged enough to
| have anyone mirror their data.
| lxgr wrote:
| But how do I get everyone to mirror my gigabytes of
| encrypted photo backups?
| paulpauper wrote:
| Just upload them to social media accounts. AFAIK Twitter,
| Facebook, and YouTube do not have storage limits. No
| deletion for inactivity either.
| paulpauper wrote:
| Dump it on Wikipedia. AFAIK wiki never removes anything. It
| just gets buried in an edit history. Or Wikimedia image
| files.
| Viktoire wrote:
| When I save things, I try to make sure that it'll be immediately
| useful to me once I find it again.
|
| I'll highlight, summarise and take notes of what I save. Or some
| combination of those. If I don't find anything new or directly
| applicable to my life, I'll let it pass by.
|
| This approach isn't good for archival purposes, but I hesitate to
| save a lot of things that I'll never read again.
| ghaff wrote:
| I'm going through my file cabinets right now. I'll keep a few
| things that catch my eye but I'll likely throw out most of it.
| The odd 25 year old computer magazine is probably interesting
| but not all of them collectively for the most part. And I'm
| certainly not going to index them in a way that they'd be
| useful to me.
| galleywest200 wrote:
| You can probably sell or donate those old magazines to a
| collector, or a kid interested in that stuff. At the very
| least drop them off at a thrift store instead of just dumping
| them.
| ghaff wrote:
| Thrift stores don't want a ton of old paper. There are a
| lot of things that someone somewhere would probably like
| but I'm not going to track them down or get them there.
| Mostly it's not magazines anyway. It's a bunch of articles
| I ripped out over the years.
|
| The one thing I have in my garage I know _someone_ would
| want is a big pile of laserdiscs. But, again, a thrift shop
| (or my library) wouldn't want them and I live pretty far
| out from a major city. Probably will try Craigslist post-
| winter though as I'm trying to declutter.
| stared wrote:
| I often find myself revisiting old posts and stories. As with any
| human artifacts, most things aren't worth revisiting or are only
| meaningful in the moment. If they're gone, few people miss them.
|
| I'm a link hoarder myself (over 13k links on Pinboard:
| https://pinboard.in/u:pmigdal/). While I don't revisit most of
| them, some have proven invaluable for re-reading and sharing. I'm
| not sure about the typical half-life of internet content, but a
| lot disappears--whether because people stop paying for domains,
| official websites get reorganized (or their content removed), or
| other reasons.
|
| This is where the Internet Archive steps in, doing the essential
| work of a digital librarian. I often share links from its Wayback
| Machine, which has been a link-saver more times than I can count.
| swayvil wrote:
| Curve smoothing. Chaikin's algorithm and Jarek's tweak etc. Very
| clever and nice way of making angular geometry curvy.
| Constructive geometry stuff.
|
| There were like a dozen algs. I kept links to nice papers with
| diagrams. Then they started disappearing. Now I'd be pressed to
| find 2.
|
| This is really useful info that is apparently disappearing. So
| yes, it happens, and maybe you should save that stuff.
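For readers who have not met it, Chaikin's corner-cutting scheme mentioned above is short enough to sketch. The version below is a minimal illustration of the classic 1/4 and 3/4 rule only; the function name and the `closed` flag are choices made for this example, and the "Jarek's tweak" variant is not covered.

```python
def chaikin(points, iterations=1, closed=False):
    """Smooth a 2D polyline with Chaikin's corner-cutting scheme.

    Each pass replaces every segment (P, Q) with two new points:
    one at 3/4 P + 1/4 Q and one at 1/4 P + 3/4 Q.
    For an open polyline the original endpoints are not kept,
    so the curve shrinks slightly inward with each pass.
    """
    pts = [(float(x), float(y)) for x, y in points]
    for _ in range(iterations):
        n = len(pts)
        if n < 2:
            break
        # A closed polygon also has a segment from the last point to the first.
        segment_count = n if closed else n - 1
        refined = []
        for i in range(segment_count):
            x0, y0 = pts[i]
            x1, y1 = pts[(i + 1) % n]
            refined.append((0.75 * x0 + 0.25 * x1, 0.75 * y0 + 0.25 * y1))
            refined.append((0.25 * x0 + 0.75 * x1, 0.25 * y0 + 0.75 * y1))
        pts = refined
    return pts


if __name__ == "__main__":
    corner = [(0, 0), (4, 0), (4, 4)]
    print(chaikin(corner))
```

Repeating the pass a few times (for example `iterations=3`) makes the sharp corner look progressively rounder, with the point count roughly doubling on each pass.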
| RajT88 wrote:
| Stuff online is absolutely worth saving. It is a window into the
| past - what people concerned themselves with, what they loved and
| hated.
|
| Scholars will write papers on this era, speculating what it was
| like and how it fit into what came after.
|
| The web documents the massive societal changes underway which do
| not relate to the internet directly. Things like changes in
| transportation technology, medicine, sexuality and gender, and
| how your average people felt about all of it. Scholars will data
| mine those opinions to understand who felt what ways and why,
| with the benefit of hindsight. New knowledge will come of it.
|
| So yeah! It is all worth saving.
| greatgib wrote:
| Sometimes you have strange obsessions or a strange mindset
| related to your technological habits. And you might easily think
| that it is only you that is weird, not thinking straight. If you
| are the only one doing something, you are probably wrong.
|
| And then, hopefully, there are nice personal blog posts like this
| one, showing you that you are not alone having some peculiar
| habits and so that it might make sense even if most people don't
| even think about it.
|
| I have the exact same feeling when I discover through hn, blog
| posts and events that I'm not the only one having my web browsers
| full of tabs. Literally having thousands of tabs.
| thefaux wrote:
| There are many things in life that have immense personal value
| and zero value to nearly everyone else. This creates a lot of
| misunderstanding and incentive misalignment.
| impure wrote:
| The rise of LLMs has really devalued saving stuff online. What
| is the point of saving an article if I could just ask ChatGPT to
| create it and it would probably do a pretty good job? It's still
| worth keeping notes and stuff that may be hard to find but the
| majority of things online can easily be reproduced and are not
| worth saving.
___________________________________________________________________
(page generated 2024-12-21 18:00 UTC)