[HN Gopher] Kopia - Fast and Secure Open-Source Backup
       ___________________________________________________________________
        
       Kopia - Fast and Secure Open-Source Backup
        
       Author : chetangoti
       Score  : 132 points
       Date   : 2021-06-11 11:40 UTC (11 hours ago)
        
 (HTM) web link (kopia.io)
 (TXT) w3m dump (kopia.io)
        
       | encryptluks2 wrote:
       | Can I get some recent performance benchmarks vs restic? I only
       | recently switched to restic and for the most part it is great
       | (definitely an improvement from CrashPlan IMO), but it seems like
       | this might actually have a few improvements to make it worth
       | considering switching from restic.
        
         | ntolia wrote:
         | I ran some performance benchmarks not too long ago.
         | https://blog.kasten.io/benchmarking-kopia-architecture-scale...
        
           | brimstedt wrote:
           | Thanks.
           | 
           | It would have been interesting with restore times as well.
           | 
           | Quite important if it will take you an hour or a day to
           | restore..
        
       | rkalla wrote:
       | Any discussion about backup should always include a shout out to
       | one of the most beautiful versions of this: bvckup2
       | 
       | About 10 years ago the founder just decided to start writing the
       | most streamlined, beautiful pieces of software he could -
       | obsession around NTFS nuances to improve performance and reduce
       | overhead to an unbelievable degree.
       | 
       | 10 years later, it's still him (and maybe a few other folks) and
       | the software is unbelievably polished.
        
         | dsego wrote:
         | On Mac the most polished is Bombich's Carbon Copy Cloner, it's
         | a thing of beauty. I might go back to Mac just for that piece
         | of software alone.
        
         | Jiejeing wrote:
         | I believe you about polish and performance, but that solution
         | is neither open-source nor cross-platform (unless I failed to
         | find it on the website).
        
           | lucb1e wrote:
           | Nor free. The download button says a 2-week trial is
           | included.
           | 
           | Which is okay, but licensing is enough of a hassle when
           | restic etc. exist that I'm not going to bother with that for
           | my systems. A design goal of restic (sorry I'm just familiar
           | with that one, not affiliated with it) is also
           | recoverability: if your repository gets horribly corrupted or
           | you can't run the software easily anymore then the author
           | wanted to be able to recover things still. One of the early
           | talks (at a local CCC) explains how to decrypt things
           | manually in a few minutes -- obviously he knows what he's
           | doing and will be faster than me, but still. Having closed
           | source software as an alternative to that... I dunno.
        
             | Jiejeing wrote:
             | Yeah, I'm not complaining and it's entirely fair to sell
             | your work, but I'm always wary of relying on a closed-
             | source (+ licensed) tool for things as critical as backups.
             | I may be partial though, having interacted with the Borg
             | author in-person a few times at Congress, which convinced
             | me I could trust it to not shred my data.
        
         | poronski wrote:
         | Clickable - https://www.bvckup2.com
        
       | MikusR wrote:
       | It has a gui and works on Linux/Windows/MacOS. Also has both
       | deduplication and compression.
        
         | _def wrote:
         | Finally! I've been waiting patiently for an open source
         | x-platform solution that ticks those boxes. edit: ah,
         | never mind. I just tried it and I'm sure it's great for online
         | backups. But not so well suited for backups on a plain usb hdd.
        
           | nsriv wrote:
           | Pretty sure it allows and works with local USB storage as
           | well, covered in documentation here:
           | https://kopia.io/docs/repositories/#local-storage.
        
       | forgotpwd16 wrote:
       | As a bit of trivia, kopia means copy in Polish, which is the
       | founder's mother tongue.
        
       | kemonocode wrote:
       | Interesting, I'd be willing to give it a try after I had to
       | abandon Duplicati as it just seemed like a lost cause. Right now
       | my backup setup consists of a UrBackup server my machines connect
       | to, and a borg repository that is synced against the latest
       | backup and then sent remotely to S3. It works, but could
       | definitely use some streamlining...
        
       | tut-urut-utut wrote:
       | Looks interesting. I always liked the borg concept, but never
       | actually used it for backups because it lacks windows support and
       | a nice gui.
       | 
       | Will give it a try.
        
         | PopeRigby wrote:
         | Borg does have a GUI: https://vorta.borgbase.com/
         | 
         | Although it's subjective if you find it nice or not.
        
       | AaronFriel wrote:
       | How does this compare to Duplicacy in terms of throughput?
        
       | przemub wrote:
       | I love the new tendency to use Polish (and other languages')
       | names for programs!
        
         | pivic wrote:
         | Indeed! I'm from Sweden, where 'kopia' is Swedish for the
         | English word 'copy'. Same meaning in Polish?
        
           | DominikD wrote:
           | It means both "copy" and "lance" (as in: weapon used for
           | jousting), hence the logo.
        
         | Tade0 wrote:
         | The other day I heard someone use the word "zuk" referring to a
         | bug, and I think it fits amazingly well.
        
           | coolspot wrote:
           | Is it pronounced like Russian "Zhuk"?
        
             | Tade0 wrote:
             | Yes, exactly.
        
         | scns wrote:
         | https://github.com/qarmin/czkawka
         | 
         | Means hiccup
        
       | lucb1e wrote:
       | Another one? Ten years ago I had trouble finding anything, then
       | over time I learned of duplicity (2002?! [1]), bup (2010), restic
       | (2015), borg (2010)... all basically solving the same problem of
       | encrypted incremental backups.
       | 
       | The landing page doesn't mention why they made yet another
       | solution. In the comments someone also mentions bvckup2. Is there
       | an overview somewhere of all the different solutions? Any selling
       | points here?
       | 
       | [1] That's way older than I expected, but it's also the first one I
       | found and I stopped using it because it took many gigabytes on my
       | then-250GB SSD for a local cache just to be able to do
       | incremental backups. Maybe that's why I had the feeling I
       | couldn't find anything good at the time.
        
         | MikusR wrote:
         | It has a gui and works on Linux/Windows/MacOS. Also has both
         | deduplication and compression.
        
           | lucb1e wrote:
           | Thanks!
           | 
           | Still a bit strange to me that they started a whole new
           | project rather than contribute patches or fork something
           | else.
           | 
           | The only one I'm fairly familiar with is restic. It also
           | compiles for the major OSes and has deduplication. For
           | compression, I think there are patches available, but
           | alternatively they could just have contributed one. That
           | leaves tying the command-line interface to a few buttons in a
           | GUI.
           | 
           | Edit: Kopia turns out to be from 2016. I guess the author
           | didn't know of the others yet, or the others weren't as
           | mature. This makes a lot more sense; somehow the .io domain
           | and my first hearing of it only now made me expect this was
           | written recently.
        
             | MikusR wrote:
             | Restic issues for adding GUI and compression are 7 years
             | old. If there was interest/"easy to do" for implementing
             | them that would have been done.
        
               | lucb1e wrote:
               | What I meant is that the author of Kopia could have done
               | that and made their life a lot easier compared to
               | starting all the way from scratch. But I posted that
               | before realizing that Kopia is barely a year younger than
               | Restic.
               | 
               | But of course it could also just be what u/poronski said
               | in a sibling comment. His Noodly holiness knows I make a
               | lot of software that already exists just because I enjoy
               | the making and having it customized. In fact I think I
               | also started... let me `stat` that directory... yup, in
               | 2016 I started working on (and abandoned) my own
               | implementation of encrypted backups. Also because online
               | storage prices were through the roof (~40x the hardware
               | cost price with servers and bandwidth included) and I
               | thought I could do that cheaper.
        
             | poronski wrote:
             | To each their own.
             | 
             | For a lot of people it's way more fun to make a new thing
             | than to fork and patch someone else's.
        
         | rglover wrote:
         | Easy on the gas, bud.
        
       | klodolph wrote:
       | I have been using Kopia for some time now after switching from
       | Duplicity. Very happy with Kopia. You can just point Kopia at a
       | GCS or S3 bucket and shove files there. Easy to restore files.
       | You can list snapshots and files and do partial restores fairly
       | easily. Old data gets expired on a timeline that you dictate.
       | 
       | Duplicity was a pain by comparison. I think Duplicity has a
       | number of design flaws that become evident once you use it for a
       | while.
        
       | foolinaround wrote:
       | how would this type of software compare against Dropbox /
       | NextCloud?
        
         | jbnorth wrote:
         | They're fundamentally different in the problems they solve.
         | Dropbox is a cloud file storage and syncing service. NextCloud
         | is an open source alternative to something like Dropbox in that
         | it offers file sharing and syncing but also much more on top of
         | that. It's really closer to something like the Google suite of
         | personal cloud services with Google Drive, Photos, Contacts,
         | and Calendar. Kopia is a backup solution for the files on your
         | computer. You can use cloud file storage providers as the
         | destination for these backups but it doesn't handle the storage
         | of the backups itself. You have to provide that storage to it.
        
       | z77dj3kl wrote:
       | The only thing I need (and is sorely missing from Restic) is that
       | the metadata be kept separate from the actual data. That way I
       | can store the data in AWS S3 Deep Glacier at a cost of next to
       | nothing per year, and still do incremental backups. Currently the
       | architecture of Restic for instance requires all data to be
       | quickly and cheaply accessible, which makes this impossible.
       | 
       | I have terabytes of data that I'd be happy to dump encrypted and
       | compressed in Deep Glacier and happy to pay $500 to retrieve if I
       | were to mess up my hard drives, but otherwise don't want to pay
       | for the costs of normal S3.
       | 
       | Does Kopia separate metadata from the actual encrypted/compressed
       | blobs?
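
The separation asked about above can be illustrated with a small sketch. It is hypothetical and does not describe the repository layout of Restic or Kopia: ColdStore, incremental_backup, and the whole-file hashing are assumptions made for illustration. The point is only that, if a small index of content hashes stays on cheap-to-read storage, an incremental backup can decide what to upload without ever reading from the archival tier.

```python
import hashlib


class ColdStore:
    """Write-only blob storage standing in for an archival tier (hypothetical)."""

    def __init__(self):
        self._blobs = {}
        self.writes = 0

    def put(self, key: str, blob: bytes) -> None:
        self._blobs[key] = blob
        self.writes += 1


def incremental_backup(files: dict, index: set, cold: ColdStore) -> dict:
    """Upload only blobs whose hash is missing from the index; return a snapshot."""
    snapshot = {}
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest not in index:
            # The decision uses the small index only; cold storage is never read.
            cold.put(digest, content)
            index.add(digest)
        snapshot[name] = digest
    return snapshot
```

Here the index and the returned snapshot are the metadata that would live on normal storage, while ColdStore receives each distinct blob exactly once.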
        
       | gingerlime wrote:
       | I wonder how it compares to restic or borg. Besides the gui
       | anyway...
        
         | benrockwood wrote:
         | Looks like a polished restic to me.
        
           | isbvhodnvemrwvn wrote:
           | Polished*
        
             | StavrosK wrote:
             | A hard joke to get, but I liked it.
        
         | ntolia wrote:
         | You can see some of the performance differences here -
         | https://blog.kasten.io/benchmarking-kopia-architecture-scale...
        
           | jeremyw wrote:
           | Note this compares an older restic version that doesn't
           | include the order of magnitude improvements in cloud
           | communication.
        
         | Scaevolus wrote:
         | Restic doesn't support compressing backups, and kopia does.
         | Otherwise, the architectures appear to be very similar.
        
           | lucb1e wrote:
           | How much space do you typically save by compressing these
           | days? Given that even smallish things like documents are
           | already compressed archives, pictures/audio/movies of course
           | already have heavy purpose-specific compression.
           | 
           | The main things I can still think of that are sparse on
           | purpose are database files and disk images (not very
           | mainstream, but also not uncommon). So like, a few gigabytes
           | per terabyte (a few per mille) unless you're really heavy on
           | either databases or virtual machines?
           | 
           | I can see why one would like to enable it, but deduplication
           | (which breaks if you naively implement compression, iirc
           | that's why restic hasn't yet implemented it) is much more
           | worth it because it enables incremental backups and you
           | don't, for example, have to worry about making a copy of
           | another system that has many of the same files (think game
           | files or system files).
        
             | wmf wrote:
             | I assume compression works well for source code and other
             | developer artifacts. Obviously you have to do dedupe, then
             | compression, then encryption.
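
The ordering described above (dedupe, then compression, then encryption) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how restic or Kopia implement it: the fixed 4 KiB chunk size, the in-memory dict used as a store, and the names backup and restore are all made up for the example. Real tools typically use content-defined chunking. Encryption is only marked with a comment because it would need a third-party library in Python.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # fixed-size chunks, an assumption made for brevity


def backup(data: bytes, store: dict) -> list:
    """Split data into chunks, store each unseen chunk compressed, return chunk IDs."""
    manifest = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        # Dedupe first: the ID is the hash of the uncompressed chunk.
        chunk_id = hashlib.sha256(chunk).hexdigest()
        if chunk_id not in store:
            # Compress second. Encryption would be applied to this result, last.
            store[chunk_id] = zlib.compress(chunk)
        manifest.append(chunk_id)
    return manifest


def restore(manifest: list, store: dict) -> bytes:
    """Rebuild the original data by decompressing chunks in manifest order."""
    return b"".join(zlib.decompress(store[chunk_id]) for chunk_id in manifest)
```

Backing up two inputs that share a chunk stores the shared chunk only once, because its ID is computed before any compression or encryption changes the bytes.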
        
               | lucb1e wrote:
               | Source code compresses very well indeed, but my hunch is
               | that it's peanuts. Let's see, I've got a projects
               | directory with various projects from the past decade (all
               | custom, there's a separate dir for downloaded
               | repositories). I've mostly written things in Python and
               | PHP (the JS/CSS/HTML stuff is on a server mixed with
               | things like owncloud or SMF or so; harder to isolate).
               | 
               | PHP: 591 KiB, 196 files, 13'813 LOC, 1'236 comment lines.
               | 
               | Python: 345 KiB, 136 files, 7'760 LOC, 930 comment lines.
               | 
               | If someone spends 5 minutes of developer time trying to
               | compress that to save disk space, that's already not
               | worth it. Also in huge projects, the actual code is not
               | going to be taking gigabytes of space. And if you mean in
               | git history: that is, again, already compressed.
               | 
               | Other developer artifacts: I've got 26 GiB of project
               | directories, this time including downloaded software and
               | it will also include binaries (hashcat and jtr are in
               | there, I wouldn't be surprised if there's also a medium-
               | sized dictionary or two). Doing tar c . does not seem to
               | add much overhead (26.5 GiB). Compressing that stream
               | with pigz -1 (multithreaded gzip) brings it down to 17
               | GiB.
               | 
               | 35% off is better than I thought! I wonder which files
               | compress so well, hmm let me `find -type f | shuf | head
               | -9001 | while read line; do echo "$(($(wc -c
               | <"$line")-$(<"$line" pigz -1 | wc -c))) $line"; done |
               | sort -n`... The largest difference is a huge 121 MiB
               | binary that compresses down to 36 MiB. I didn't know
               | these files were so sparse (not a C(++) dev),
               | interesting!
               | 
               | While I'm looking into this, let's also look at my
               | "documents" directory. It's 43 GiB and compresses down to
               | 33 GiB. Not as good, but still worth it, more than I
               | thought! (And pigz -1 isn't strong compression, but a
               | stronger setting probably gains no more than 10% before
               | it gets impractically slow.) It might not quite get a total
               | backup size down by a disk size (e.g. not 1T down to
               | 500G), but it definitely allows keeping more history
               | before having to worry about what you want to keep and
               | what you want to toss.
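
For readers who would rather not parse the shell pipeline in the comment above, a rough Python equivalent might look like the following. It is a sketch, not the commenter's script: the function names are invented, gzip level 1 stands in for pigz -1, and each file is read fully into memory, which is an assumption that only holds for files that fit in RAM.

```python
import gzip
import random
from pathlib import Path


def bytes_saved(path: Path, level: int = 1) -> int:
    """Return original size minus gzip-compressed size for one file."""
    raw = path.read_bytes()
    return len(raw) - len(gzip.compress(raw, compresslevel=level))


def rank_by_savings(root: str, sample_size: int = 9001) -> list:
    """Sample regular files under root and sort them by bytes saved, ascending."""
    files = [p for p in Path(root).rglob("*") if p.is_file()]
    sample = random.sample(files, min(sample_size, len(files)))
    return sorted((bytes_saved(p), str(p)) for p in sample)
```

Calling rank_by_savings(".") and looking at the last few entries shows which files benefit most. Files that are already compressed tend to show zero or slightly negative savings.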
        
           | aidenn0 wrote:
           | For local backups, compression is less of an issue; for
           | things that are compressible, transparent file system
           | compression seems to get about 110% of the size of what I
           | would get by any non-CPU-bound level of compression using a
           | tgz. Since (as others in this thread have noted) compressible
           | files tend to also be smaller files (the only exception I can
           | think of would be if your log rotation doesn't compress old
           | logs), the fact that only a fraction of what I back up is 10%
           | larger is kind of "okay." When you're sending across the
           | network though it can be a big deal.
        
             | MikusR wrote:
             | On disk the backups are encrypted, which means no
             | transparent compression.
        
               | isbvhodnvemrwvn wrote:
               | For people who haven't dealt with this - a good
               | encryption scheme produces output which you can't tell
               | apart from a purely random stream of bits - it has very
               | high entropy, and is therefore not compressible.
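
The claim above is easy to check empirically. In the sketch below, bytes from os.urandom stand in for ciphertext, which is an assumption: no real encryption is performed, the random bytes only mimic what good ciphertext looks like to a compressor. The helper name compression_ratio is made up for the example.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more compressible)."""
    return len(zlib.compress(data, 9)) / len(data)


text = b"the quick brown fox jumps over the lazy dog\n" * 2000
random_bytes = os.urandom(len(text))  # stand-in for encrypted data

print(f"repetitive text: {compression_ratio(text):.3f}")
print(f"random bytes:    {compression_ratio(random_bytes):.3f}")
```

The repetitive text shrinks to a small fraction of its size, while the random bytes do not shrink at all, which is why compression has to happen before encryption rather than after it.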
        
         | poronski wrote:
         | Gotta say kopia does have a very strong restic vibe to it.
         | 
         | Not a bad thing, just means that restic managed to get lots of
         | things right.
        
       ___________________________________________________________________
       (page generated 2021-06-11 23:01 UTC)