[HN Gopher] ArchiveBox: Open-source self-hosted web archiving
___________________________________________________________________
ArchiveBox: Open-source self-hosted web archiving
Author : mieubrisse
Score : 58 points
Date : 2024-01-11 16:14 UTC (6 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| loceng wrote:
| Are there any figures available anywhere as to how many people
| actively-passively maintain a personal-private archive?
| valsk wrote:
| This was created 5 years ago..
| codsane wrote:
| And it's still being maintained :)
| dundarious wrote:
| I researched various archiving alternatives for something I
| needed recently. I subscribe to a paid Substack for an
| educational course that will end mid-year, and I want to archive
| the course posts before it ends (the course provider has even
| recommended people end their Substack subscription after it
| ends).
|
| For this purpose, I found the SingleFile browser extension to be
| the best fit. It's a browser extension, so paywall cookies are
| already present, and I just manually archive the previous week's
| content, _after_ the discussion phase has concluded. It creates a
| single self-contained file with all images and comments, etc.,
| but all non-page-local links still resolve externally (which is
| as-desired, for my use case). It can be configured to auto-
| generate a convenient filename, and to use self-extracting
| compression.
|
| I preferred this to an automated process based on, e.g., RSS,
| because I can ensure the archive occurs _after_ all the useful
| back-and-forth in the course comments has concluded, and it's
| trivial to set up and use.
| kornhole wrote:
| That is a great solution for local copies. ArchiveBox runs on a
| web server, so its archives can be made available to anyone on
| the internet.
| kornhole wrote:
| I spun up my own ArchiveBox after archive.org wouldn't let me
| archive some news stories and I heard about them removing other
| content. Instead of calling the Internet Archive the wayback
| machine, I now call it the maybe back machine. IA is a
| centralized service and subject to the government and other
| powerful pressures any popular centralized service faces. If you
| want to archive something that people in power might want
| erased, now or in the future, you should decentralize it to
| somewhere like an ArchiveBox instance. This is especially useful
| if you are writing a book with many citations.
___________________________________________________________________
(page generated 2024-01-11 23:00 UTC)