[HN Gopher] Borg - Deduplicating archiver with compression and e...
___________________________________________________________________
Borg - Deduplicating archiver with compression and encryption
Author : rubyn00bie
Score : 112 points
Date : 2025-07-20 02:36 UTC (20 hours ago)
(HTM) web link (www.borgbackup.org)
(TXT) w3m dump (www.borgbackup.org)
| creamyhorror wrote:
| I remember using Borg Backup before eventually switching to
| Duplicati. It's been a while.
| Snild wrote:
| I currently use borg, and have never heard of Duplicati. What
| made you switch?
| racked wrote:
| I've had an awful experience with Duplicati. Unstable,
| incomplete, hell to install natively on Linux. This was 5 years
| ago and development in Duplicati seemed slow back then. Not
| sure how the situation is now.
| creamyhorror wrote:
| Interesting to hear. I use Duplicati on Windows and it's been
| fine, though I haven't extensively used its features.
| jszymborski wrote:
| Likewise. The ETA for the restore of my 500 GB HDD was like
| 100+ years or something. It's what caused me to ditch it for
| borg.
| toenail wrote:
| Last time I checked, the deduplication only works per host when
| backups are encrypted, which makes sense. Anyway, borg is one of
| the three backup systems I use; it's alright.
| arendtio wrote:
| Which are the others?
| guerby wrote:
| https://kopia.io/
| toenail wrote:
| BackupPC and a shell script using rsync, for backups to USB
| sticks
| ElectronBadger wrote:
| I'm using it via the Vorta (https://vorta.borgbase.com) frontend.
| My favorite backup solution so far.
| Kudos wrote:
| Pika Backup (https://apps.gnome.org/PikaBackup/) pointed at
| https://borgbase.com is my choice.
| blablabla123 wrote:
| I once met the Borg author at a conference; pretty chill guy. He
| said that when people file bugs because of data corruption, it's
| because his tool found the underlying disk to be broken. Sounds
| quite reliable, although I'm mostly fine with tar...
| vrighter wrote:
| I used to work on backup software. I lost count of the number
| of times this happened to us with our clients too.
| ValentineC wrote:
| I used CrashPlan in 2014. Back then, their implementation of
| Windows's Volume Shadow Copy Service (VSS) was buggy, and I
| lost data because of that. I doubt my underlying disk was
| broken.
| im3w1l wrote:
| While saying "hardware issue, not my fault, not my problem" is a
| valid stance, I'm thinking that if you hear it again and again
| from your users, maybe you should consider whether you can do
| more. Verifying the file was written correctly is low-hanging
| fruit. Other possibilities are running a S.M.A.R.T. check and
| showing a warning, or adding redundancy to recover from partial
| failures.
| ddtaylor wrote:
| I think the failure mode that is happening for users/devs
| here is bit rot. It's not that the device won't report back
| the same bytes even if you disable whatever caching is
| happening; it's that after some amount of time T it will
| report the wrong bytes. Some file systems have "scrubs" and
| similar mechanisms to automatically find these and sometimes
| attempt to repair them (ZFS can do this).
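|
| For reference, on ZFS such a scrub is a one-liner; a minimal
| sketch (the pool name is illustrative):
|
|   # walk all data, verify checksums, repair from redundancy
|   zpool scrub tank
|   # report any repaired or unrecoverable files afterwards
|   zpool status -v tank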
| thangngoc89 wrote:
| I switched to restic (https://restic.net/) and the backrest webui
| (https://github.com/garethgeorge/backrest) for Windows support.
| Files are deduplicated across machines with good compression
| support.
| sureglymop wrote:
| I also use restic and do backups to append-only rest-servers in
| multiple locations.
|
| I also back up multiple hosts to the same repository, which
| actually results in insane storage space savings. One thing I'm
| missing though is being able to specify multiple repositories
| for one snapshot such that I have consistency across the
| multiple backup locations. For now the snapshots just have
| different IDs.
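|
| A minimal sketch of that setup (host names, ports, and paths
| are made up):
|
|   # on each backup location: serve the repo, rejecting
|   # deletion and modification of existing data
|   rest-server --path /srv/restic --append-only --listen :8000
|
|   # on the client: back up to each location in turn
|   restic -r rest:http://backup1:8000/repo backup /srv/data
|   restic -r rest:http://backup2:8000/repo backup /srv/data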
| linsomniac wrote:
| >back up multiple hosts to the same repository
|
| I haven't tried that recently (~3 years); does that work with
| concurrency, or do you need to ensure only one backup is
| running at a time? Back when I tried it I got the sense that
| it wasn't really meant to have many machines accessing the
| repo at once, and decided it was probably worth wasting some
| space in exchange for potentially more robust backups,
| especially for my home use case where I only have a couple of
| machines to back up. But it'd be pretty cool if I could
| replace my main backup servers (using rsync --inplace and zfs
| snapshots) with restic and get deduplication.
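|
| For context, the rsync+ZFS approach amounts to something like
| this (pool and path names are made up):
|
|   # rewrite changed files in place so the following ZFS
|   # snapshot only stores the changed blocks
|   rsync -a --inplace --delete root@client:/home/ /tank/backup/client/
|   zfs snapshot tank/backup@$(date +%Y-%m-%d)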
| l33tman wrote:
| The issue with this is that if someone hacks one of the
| hosts, they now have access to the backups of all your other
| hosts. That's the case with Borg at least, in the standard
| setup; would be cool if I was wrong though.
| sureglymop wrote:
| At least with restic that is not an issue. See my other
| comment here:
| https://news.ycombinator.com/item?id=44626515
|
| Backups are append-only and each host gets its own key; the
| keys can be individually revoked.
|
| Edit: I have to correct myself. After further research,
| it seems that append-only != write-only. Thus you are
| correct in that a single host could possibly access/read
| data backed up by another host. I suppose it depends on
| use case whether that is a problem.
| sureglymop wrote:
| It works. In general, multiple clients can back up
| to/restore from the same repository at the same time and do
| writes/reads in parallel. However, restic does have a
| concept of exclusive and non-exclusive locks and I would
| recommend reading the manual/reference section on locks. It
| has some smart logic to detect and clean up stale locks by
| itself.
|
| Locks are created e.g. when you want to forget/prune data
| or when doing a check. The way I handle this is that I use
| systemd timers for my backup jobs. Before I do e.g. a check
| command I use an ansible ad-hoc command to pause the
| systemd units on all hosts and then wait until their
| operations are done. After doing my modifications to the
| repos I enable the units again.
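|
| Roughly, that maintenance window looks like this (unit names,
| paths, and retention are placeholders; note that with
| append-only rest-servers, prune-type operations would need
| direct access to the repo, e.g. locally on the storage host):
|
|   # stop the backup timers on all hosts
|   ansible all -m systemd -a "name=restic-backup.timer state=stopped"
|
|   # with no writers left, exclusive operations are safe
|   restic -r /srv/restic/repo check
|   restic -r /srv/restic/repo forget --prune --keep-daily 7
|
|   # start the timers again
|   ansible all -m systemd -a "name=restic-backup.timer state=started"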
|
| Another tip is that you can create individual keys for your
| hosts for the same repository. Each host gets its own key
| so that host compromise only leads to that key being
| compromised which can then be revoked after the breach. And
| as I said I use rest-servers in append-only mode so a
| hacker can only "waste storage" in case of a breach. And I
| also back up to multiple different locations (sequentially)
| so if a backup location is compromised I could recover from
| that.
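|
| For the key handling, the commands are roughly (the key ID is
| made up):
|
|   # on each host: add its own key to the shared repository
|   restic -r rest:http://backup1:8000/repo key add
|
|   # after a compromise: list keys and revoke the affected one
|   restic -r rest:http://backup1:8000/repo key list
|   restic -r rest:http://backup1:8000/repo key remove b48dd873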
|
| I don't back up the full hosts, mainly application data. I
| use tags to tag by application, backup type, etc. One pain
| point is, as I mentioned, that the snapshot IDs in the
| different repositories/locations are different. Also,
| because I back up sequentially, data may have already
| changed between writing to the different locations. But
| this is still better than syncing them with another tool as
| that would be bad in case one of the backup locations was
| compromised. The tag combinations help me deal with this
| issue.
|
| Restic really is an insanely powerful tool and can do
| almost everything other backup tools can!
|
| The only major downside to me is that it is not available
| in library form to be used in a Go program. But that may
| change in the future.
|
| Also, what would be even cooler for the multiple backup
| locations is if the encrypted data could be distributed using
| something like Shamir secret sharing, where you'd need access
| to k of n backup locations to recreate the secret data. That
| would also mean that you wouldn't have to trust whatever
| provider you back up to (e.g. Amazon S3).
| jeltz wrote:
| One big advantage of using restic is that its append-only
| storage actually works, unlike Borg's, where it is just a hack.
| rollcat wrote:
| I've been using it for ~10 years at work and at home. Fantastic
| software.
| kachapopopow wrote:
| Restic is far better in terms of both usability and packaging
| (borgmatic is pretty much a requirement for usability). I have
| used both extensively; you can argue that borg can just be
| scripted instead and is a lot more versatile, but I had a much
| better experience with restic in terms of set-it-and-forget-it.
| I am not scared that restic will break; with borg I was.
|
| Also not sure why this was posted; did a new version get
| released or something?
| mekster wrote:
| How is the performance for both?
|
| Last time I used restic, a few years ago, it choked on a
| not-so-large data set with high memory usage. I read Borg
| doesn't choke like that.
| homebrewer wrote:
| Depends on what you consider large; I looked at one of the
| machines (at random), and it backs up about two terabytes of
| data spread across about a million files. Most of them aren't
| changing day to day. I ran another backup, and restic
| rescanned them & created a snapshot in exactly 35 seconds,
| using ~800 MiB of RAM at peak and about 600 MiB on average.
|
| The files are on an HDD and the machine doesn't have a lot of
| RAM; given the high I/O wait times and low CPU load overall,
| I'm pretty sure the bottleneck is loading filesystem metadata
| off disk.
|
| I wouldn't back up billions of files or petabytes of data with
| either restic or borg; stick to ZFS at that scale.
|
| I don't remember what the initial scan time was (it was many
| years ago), but it wasn't unreasonable -- pretty sure the
| bottleneck also was in disk I/O.
| kmarc wrote:
| > you can argue that borg can just be scripted
|
| And that's what I did myself. Organically it grew to ~200
| lines, but it sits in the background (I created a systemd unit
| for it, too) and does its job. I also use rclone to store the
| encrypted backups in an AWS S3 bucket.
|
| I forget about it so much that sometimes I have to remind
| myself to test whether it still works (it does).
|
|                  Original size  Compressed size  Deduplicated size
|  All archives:         2.20 TB          1.49 TB           52.97 GB
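|
| The core of such a wrapper is only a few commands; a
| stripped-down sketch (repo path, source dirs, bucket, and
| retention are illustrative, and "s3:" assumes an rclone remote
| of that name):
|
|   #!/bin/sh
|   set -e
|   export BORG_REPO=/srv/borg/repo
|   # deduplicated, compressed archive named after host and time
|   borg create --compression zstd ::'{hostname}-{now}' /home /etc
|   # thin out old archives
|   borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6
|   # mirror the (already encrypted) repo to S3
|   rclone sync /srv/borg/repo s3:my-backups/borg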
| bjoli wrote:
| Pika backup is pretty darn simple.
| jszymborski wrote:
| I use Vorta, which makes Borg use very easy.
|
| https://vorta.borgbase.com/
| johng wrote:
| Emborg is also really cool:
| https://emborg.readthedocs.io/en/stable/
| sunaookami wrote:
| Love borg, use it to back up all my servers and laptop to a
| Hetzner Storage Box. Always impressed with the deduplication
| stats!
| stevekemp wrote:
| Same story here, using Borg with a Hetzner storage box to give
| me offsite backups.
|
| Cheap, reliable, and almost trouble-free.
| AnonC wrote:
| I've been looking at this project occasionally for more than
| four years. The development of version 2.0 started sometime in
| April 2022 (IIRC) and there's still no release candidate. I'm
| guessing it'll be finished a year from now.
|
| What are the current recommendations here for doing periodic
| backups of a NAS with lower (not lowest) costs for about 1 TB
| of data (mostly personal photos and videos), ease of use, and
| robustness that one can depend on (I know this sounds like a
| "pick two" situation)? I also want the backup to be completely
| private.
| homebrewer wrote:
| You definitely should have checksumming in some form, even if
| compression and deduplication are worthless in this particular
| use case, so either use ZFS on both the sending and the
| receiving side (most efficient, but probably will force you to
| redo the NAS), or stick to restic.
|
| I've been mostly using restic over the past five years to back
| up two dozen servers + several desktops (one of them Windows),
| no problems so far, and it's been very stable in both senses
| of the word (absence of bugs & unchanging API, both
| "technical" and "user-facing").
|
| https://github.com/restic/restic
|
| The important thing is to run periodic scrubs with full data
| read to check that your data can actually be restored (I do it
| once a week; once a month is probably the upper limit).
| restic check --read-data ...
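|
| For example, as a weekly cron job (repo path and password file
| are placeholders):
|
|   0 3 * * 0  restic -r /srv/restic-repo --password-file /root/.restic-pw check --read-data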
|
| Some suggestions for the receiver unless you want to go for
| your own hardware:
|
| https://www.rsync.net/signup/order.html?code=experts
|
| https://www.borgbase.com
|
| (the code is NOT a referral, it's their own internal thingy
| that cuts the price in half)
| rjh29 wrote:
| People like to recommend restic, but I stay with Borg because
| it is old, popular, and battle-tested. Very important when it
| comes to backing up data!
| muppetman wrote:
| Restic is hardly new and untested? I don't think they're
| dissimilar in age. Restic is certainly battle tested. Are you
| thinking of rustic?
| rjh29 wrote:
| Borg is at least 5 years older; restic isn't 1.0 yet, and it
| seems to be still under heavy development. For example,
| compression was only added in 2022, and people reported severe
| performance issues / high RAM usage with larger backups only a
| few years ago.
|
| Fair point though, both have enough of a user base that they
| could be considered safe at this point.
| TacticalCoder wrote:
| I'll die on this hill... If my files that are named like this:
| DSC009847.JPG
|
| were actually named like this:
| DSC009847-b3-73ea2364d158.JPG
|
| where "-b3-" means "what's coming before the extension are the
| first x bits (choose as many hexdigits as you want) of the Blake3
| cryptographic hash of the file...
|
| We'd be living in a better world.
|
| I do that for _many_ of my files. Notably family pictures and
| family movies, but also _.iso_ files, tar/gzip'ed files, etc.
|
| This makes detecting bitflips trivial.
|
| I've created little shell scripts for verification, backups,
| etc. that work with files having such a naming scheme.
|
| It's bliss.
|
| My world is a better place now. I moved to such a scheme after
| a series of 20 pictures from a vacation with old friends got
| corrupted (thankfully I had backups, but determining
| programmatically which copy is the correct file is not _that_
| easy).
|
| And, yes, it has detected one bitflip since I started using it.
|
| I don't always verify all the checksums, but I've got a script
| that does random sampling: it picks x% of the files with such
| a naming scheme and verifies their checksums.
|
| It's not incompatible with ZFS: I still run ZFS on my Proxmox
| server. It's not incompatible with restic/borg/etc. either.
|
| This solves so many issues, including _"How do you know your
| data is correct?"_ (answer: _"Because I've already watched
| that family movie after the cryptographic hash was added to
| its name"_).
|
| Not a panacea but doesn't hurt and it's _really_ not much work.
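|
| A sketch of what the stamping and verification can look like
| with the official b3sum tool (file names and the 12-hexdigit
| length are just the example above):
|
|   # stamp: rename the file with its BLAKE3 prefix
|   f=DSC009847.JPG
|   h=$(b3sum --no-names "$f" | cut -c1-12)
|   mv "$f" "${f%.JPG}-b3-$h.JPG"
|
|   # verify: recompute and compare against the name
|   g=DSC009847-b3-73ea2364d158.JPG
|   tag=${g##*-b3-}; tag=${tag%.JPG}
|   test "$(b3sum --no-names "$g" | cut -c1-12)" = "$tag" \
|       && echo OK || echo "BITFLIP: $g"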
| homebrewer wrote:
| It's an old idea and is also how some anime fansub groups
| prepare their releases: the filename of each episode contains
| the CRC32 of the file inside [square brackets].
|
| It doesn't really make much sense for BitTorrent uploads
| (BitTorrent provides its own, much stronger hashes); it's a
| holdover from the era of IRC bots.
| networked wrote:
| I prefer DSC009847.JPG.b3sum sidecar files [1] or per-directory
| checksum files like B3SUMS, because they can be verified with
| standard tools. This scheme also allows you to checksum files
| whose names you can't or don't want to change. (Though in that
| situation you have the alternative of using a symlink for
| either the original name or the name with the checksum.) I
| have used the scheme less since I adopted ZFS.
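|
| The round trip with per-directory checksum files is just (a
| sketch):
|
|   b3sum *.JPG > B3SUMS      # create
|   b3sum --check B3SUMS      # verify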
|
| I do use a very similar _example.com/foo/bar/b3-abcd0123.html_
| for https://example.com/foo/bar in the archival tool for
| outgoing links on my website. It avoids the need for a date
| prefix like in the Wayback Machine while preventing
| duplication.
|
| Speaking of _.iso_ files: a recent PR [2] to my favorite Linux
| USB-disk-image burning tool, Caligula, added support for
| detecting and verifying sidecar files like _foo.iso.sha256_
| (albeit not BLAKE).
|
| [1] https://en.wikipedia.org/wiki/Sidecar_file
|
| [2] https://github.com/ifd3f/caligula/pull/186
| bjoli wrote:
| They are also a prominent user of AES-OCB, IIRC.
| dxs wrote:
| Also: Baqpaq
|
| "Baqpaq takes snapshots of files and folders on your system, and
| syncs them to another machine, or uploads it to your Google Drive
| or Dropbox account. Set up any schedule you prefer and Baqpaq
| will create, prune, sync, and upload snapshots at the scheduled
| time.
|
| "Baqpaq is a tool for personal data backups on Linux systems.
| Powered by BorgBackup, RSync, and RClone it is designed to run on
| Linux distributions based on Debian, Ubuntu, Fedora, and Arch
| Linux."
|
| At: https://store.teejeetech.com/product/baqpaq/
|
| Though personally I use Borg, Rsync, and some scripts I wrote
| based on Tar.
| evulhotdog wrote:
| Kopia is an awesome tool that checks the same boxes, and has a
| wonderful GUI if you need that.
|
| Not affiliated, just a happy user.
| jszymborski wrote:
| I've been using the Vorta GUI [0] and Hetzner's Storage Box
| service for ages and it works great. Has saved me from some
| headaches.
|
| I switched over from Duplicati a long while back when my laptop's
| sole HDD failed and Duplicati was giving me 143 year estimates
| for the restore to complete. This was true whether I aimed to
| restore the whole drive or just a single file.
|
| https://vorta.borgbase.com/
| johng wrote:
| Plakar is a new project out there that is interesting... lots
| of cool stuff happening.
|
| https://plakar.io/
___________________________________________________________________
(page generated 2025-07-20 23:02 UTC)