[HN Gopher] ArchiveBox: Open-source self-hosted web archiving
       ___________________________________________________________________
        
       ArchiveBox: Open-source self-hosted web archiving
        
       Author : mieubrisse
       Score  : 58 points
       Date   : 2024-01-11 16:14 UTC (6 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | loceng wrote:
       | Are there any figures available anywhere on how many people
       | actively or passively maintain a personal, private archive?
        
       | valsk wrote:
       | This was created 5 years ago...
        
         | codsane wrote:
         | And it's still being maintained :)
        
       | dundarious wrote:
       | I researched various archiving alternatives for something I
       | needed recently. I subscribe to a paid Substack for an
       | educational course that will end mid-year, and I want to archive
       | the course posts before it ends (the course provider has even
       | recommended people end their Substack subscription after it
       | ends).
       | 
       | For this purpose, I found the SingleFile browser extension to be
       | the best fit. It's a browser extension, so paywall cookies are
       | already present, and I just manually archive the previous week's
       | content, _after_ the discussion phase has concluded. It creates a
       | single self-contained file with all images, comments, etc., but
       | links to other pages still resolve externally (which is what I
       | want for my use case). It can be configured to auto-generate a
       | convenient filename and to use self-extracting compression.
       | 
       | I preferred this to an automated process based on, e.g., RSS,
       | because I can ensure the archive occurs _after_ all the useful
       | back-and-forth in the course comments has concluded, and it's
       | trivial to set up and use.
        
         | kornhole wrote:
         | That is a great solution for local copies. ArchiveBox runs on
         | a web server, so it can make the archives available to anyone
         | on the internet.
        
       | kornhole wrote:
       | I spun up my own ArchiveBox after archive.org wouldn't let me
       | archive some news stories and I heard about them removing other
       | content. Instead of calling the Internet Archive the Wayback
       | Machine, I now call it the maybe back machine. IA is a
       | centralized service, subject to the same government and other
       | powerful pressures that any popular centralized service faces.
       | If you want to archive something that people in power might
       | want erased, now or in the future, you should decentralize it
       | to somewhere like an ArchiveBox instance. This is especially
       | useful if you are writing a book with many citations.
        
       ___________________________________________________________________
       (page generated 2024-01-11 23:00 UTC)