[HN Gopher] BorgBackup: Deduplicating archiver with compression ...
___________________________________________________________________
BorgBackup: Deduplicating archiver with compression and encryption
Author : phil294
Score : 111 points
Date : 2022-12-27 19:01 UTC (3 hours ago)
(HTM) web link (www.borgbackup.org)
(TXT) w3m dump (www.borgbackup.org)
| samuell wrote:
| I've had a look at both Borg and Restic, but even Restic, which
| is supposed to be faster IIRC, was extremely slow on my computer.
|
| I've been much happier with https://kopia.io, which also includes
| an optional cross-platform GUI in addition to the CLI.
| beci wrote:
| I tried all of these in my environment about 2 years ago, and
| Kopia wins for me too. Is there any advantage of Borg over
| Kopia since then?
| aborsy wrote:
| Reliability. Borg has been around for a long time, and is far
| more mature.
|
| I wouldn't trust my backups to Kopia (except for
| experimentation).
| nine_k wrote:
| Wow, Kopia looks pretty interesting. I first thought that it's
| a fork of Restic, but it appears to be independent. It has all
| the features that are key to me: encrypted, deduplicated, works
| on object storage, can mount the backup as a filesystem.
|
| On one hand, one may think that three programs with a very
| similar approach and features are a waste of resources. On the
| other hand, this is what refinement of the idea looks like: each
| project improves over previous attempts.
| vbezhenar wrote:
| Can someone suggest an approach to backing up a container
| environment? E.g. one running inside Kubernetes.
|
| As I see it: I write some kind of configuration.
|
| someproject-db is a deployment which runs a postgres db. Tool
| should connect to this DB, issue some kind of pg_backup command,
| capture output, retrieve some metadata about previous backup from
| S3, compute difference with previous run, compress that
| difference and store it to S3.
|
| anotherproject is a deployment which runs an sqlite db. Tool
| should do the same but with sqlite-specific commands.
|
| yetanotherproject-data is a pvc which has attached pv. Tool
| should find the pod that mounted this volume, exec into that pod
| and retrieve the pv data, again compute the difference and store
| it to S3.
|
| Of course things should be configurable. Like store difference
| every 15 minutes, store complete backup every week and so on.
|
| I'm fine with manual recovery and manual configuration (I just
| don't want to write and test all the scripts myself).
|
| What I don't want is some kind of magic tool which will backup
| the entire cluster, etcd and my grandparents automatically in
| some magic way only for $50k/cpu core.
| m3nu wrote:
| I recently wrote up my strategy for backing up local containers
| with Borg & Borgmatic here:
| https://docs.borgbase.com/setup/borg/containers/
|
| Borgmatic will beautifully deal with DB dumps and there is a
| popular container image to run it. As for the cache ("retrieve
| some metadata about previous backup from S3"), you don't need
| to keep it locally. It can be restored from the backup
| repository.
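|
| Rough sketch of what a single run then looks like, assuming a
| config.yaml that lists the repo plus a postgresql_databases hook
| for the dump (the names are placeholders, not from the linked
| guide):
|
|     # dumps the DB via the hook, creates the archive, applies retention
|     borgmatic --stats --verbosity 1
|
| Put that on a schedule and you're most of the way there.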
|
| Hope some of this applies to your K8s setup.
| aborsy wrote:
| Database and VM snapshots and backups can be tricky.
|
| My suggestion is using ZFS.
| mekster wrote:
| Doesn't ZFS kind of solve all the backup problems alone?
| Technically, no other backup tool can beat it: being the
| filesystem itself, it knows more than any external tool can,
| like instantly knowing which files changed over time without
| scanning the entire tree.
|
| I use Borg as a backup of my backup (ZFS snapshots), so I have
| multiple backup implementations (and both are in different
| remote locations), just to be on the safe side.
|
| I don't use any other fancier ones as I don't like risking
| data on less reliable tools.
| m3nu wrote:
| Remote ZFS replication kinda does. But the offsite backup
| wouldn't be encrypted and not everyone is using ZFS. So
| it's not for all situations.
| mekster wrote:
| You can send a ZFS encrypted volume as encrypted.
|
| How does it matter whether anyone else is using ZFS? You
| either use a service that supports a ZFS target or run your
| own Linux instance, which on Ubuntu is just a matter of
| installing a single package.
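|
| For reference, a raw (-w) send keeps the data encrypted on the
| wire and on the target pool; dataset and host names below are
| placeholders:
|
|     zfs snapshot tank/data@2022-12-27
|     zfs send -w tank/data@2022-12-27 | ssh backup-host zfs recv backup/data
|     # later runs only send the delta between two snapshots
|     zfs send -w -i @2022-12-27 tank/data@2022-12-28 | ssh backup-host zfs recv backup/data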
| m3nu wrote:
| What I meant was that there are people who don't run ZFS,
| but still need backups. So it won't work for everyone.
|
| Even for my own use cases, not every server and system I
| maintain could use ZFS right away.
|
| Still good to know about the encrypted volume feature.
| Will be sure to test this next year.
| dpedu wrote:
| I'm using Velero to do this in my toy kubernetes clusters. It
| uses Restic under the hood and can store things into S3. By
| default it will take a filesystem-level copy of whatever is on
| a pv. It looks like it supports hooks, e.g. to run pg_backup
| like you mentioned, but I haven't used them.
|
| https://github.com/vmware-tanzu/velero
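|
| A minimal sketch of the workflow (namespace and backup names are
| placeholders):
|
|     velero backup create someproject-nightly --include-namespaces someproject
|     velero restore create --from-backup someproject-nightly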
| seymon wrote:
| Is there something to back up Helm releases, including all k8s
| manifests, configmaps, secrets and also persistent volumes?
| Preferably FOSS?
| m3drano wrote:
| I recommend using borgmatic to ease the management of Borg
| backups.
| mtmail wrote:
| rsync.net has a special discount when you use borg and "you're an
| expert" https://www.rsync.net/products/borg.html
|
| We're looking to replace our self-written borg backup scripts
| with https://torsion.org/borgmatic/ which is a wrapper around
| borg.
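|
| For reference, pointing borg at an SSH target like that is about
| this much (user, host and repo names are placeholders):
|
|     borg init --encryption=repokey ssh://user@host.rsync.net/./main.borg
|     borg create ssh://user@host.rsync.net/./main.borg::'{hostname}-{now}' /etc /home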
| nine_k wrote:
| If you prefer a similar approach, but as a single compiled
| binary, there's Restic: https://github.com/restic/restic
|
| Update: yet _another_ take on basically the same approach, also
| as a self-contained binary: https://github.com/kopia/kopia
| kova12 wrote:
| I recall trying to use restic instead of borg a couple of years
| ago, and some major feature was unavailable. I don't recall
| what it was; I think it was compression, which made archives
| quite large and required larger instances for backup.
| [deleted]
| anotherevan wrote:
| It probably was compression. The good news is compression is
| now available with Restic!
| RockRobotRock wrote:
| Does anyone have a take on Kopia vs Restic?
| btschaegg wrote:
| A major factor I wouldn't want to use Kopia (I looked into
| it) ist that it is opinionated with regards to how your
| system is set up (old-school unix FS layouts in a "pets, not
| cattle" way). It assumes the location of config files and
| does not allow you to change the backup path that's stored in
| a snapshot's metadata.
|
| That's bad if you want to use it
|
| 1) on NixOS (I don't want backup configs laying around in
| `~/.config`). As Indy famously said: "That belongs in a Nix
| expression!"
|
| 2) with ZFS snapshots (yes, I'm backing up
| `/path/to/dataset/.zfs/snapshot/<timestamp>/foo/bar`, but
| that should not be its path in the metadata!)
|
| OTOH, it seems to have the upside that you can apparently
| alter snapshots after the fact more easily (e.g. if you find
| out you shouldn't have backed up that gigantic VM image you
| just moved somewhere temporarily). I leave the decision on
| whether this is a footgun or not to you.
|
| And to be clear: the ZFS snapshot thing is a pain with Restic,
| too. You can hack around it somewhat better with something like
| systemd-nspawn, but it _really_ shouldn't be that hard.
| somishere wrote:
| Big thumbs up for Kopia and its very simple GUI / strategy.
| Have been using it for a couple of years now to remote backup
| hard drives and working folders on a bunch of family macs to
| B2. Restored twice now - logic board & corrupt hd. Chose it
| after trialing both Borg and Restic for ease of use and storage
| cost. My monthly backblaze bill still hovers around $1.40.
| mekster wrote:
| In my book, backup tools don't count unless they have been used
| widely for a while without major issues being repeatedly
| reported.
|
| Kopia is too new to have reached that state.
| Silhouette wrote:
| That's a reasonable point but the solution - as with all
| things backup - is diversification. Otherwise if everyone
| followed your logic then no new backup software or storage
| service could ever become established no matter how good it
| actually was.
|
| Given that all of the options being discussed here look
| technically better, and in some cases more than an order of
| magnitude cheaper, than other popular backup services and
| software discussed on HN in the past, you could afford to run
| fully redundant backups with multiple combinations of software
| and backing storage and still have more options for a much
| lower price than a few years ago.
| binaryanomaly wrote:
| restic is great and simple to use. I use it for archiving my
| backups to Google Cloud Storage.
| auxym wrote:
| restic also seems to have better Windows support.
|
| Borg can run in WSL, but has seen limited testing there, per
| their own docs.
| wyatt_dolores wrote:
| I had to setup a quick backup to s3 storage to replace an aging
| rsnapshot setup. I looked at Borg, but Duplicity
| (https://duplicity.us/) was easier to configure and connect to
| S3.
|
| For syncing S3 storage across providers, I went with rclone
| (https://rclone.org/). Note that using rclone to sync across
| providers (e.g. from Amazon to Wasabi) does require the files to
| be downloaded to the client machine and then uploaded again. Not
| ideal, but if you have extra bandwidth it is a convenient setup.
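|
| Once both remotes are configured, the cross-provider sync itself
| is a one-liner (remote names here are made up):
|
|     # data flows source provider -> this machine -> destination provider
|     rclone sync aws-s3:my-backups wasabi:my-backups --progress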
| ThomasWaldmann wrote:
| I see quite some full backups in your near future. ;-)
|
| And that is one of the main reasons why chunk-deduplicating
| backup tools (like borg, restic, ...) are better than
| full/incremental style ones.
| bluedino wrote:
| Borg works great. I used it at a shop that had lots of servers
| but didn't have any real backups, other than when someone would
| remember to go swap the external USB drives and hope the backups
| actually ran.
|
| Set it up on a bunch of servers with a simple cron job, the
| initial backups went quickly, and the incrementals were really
| fast. Made great use of an old Dell server that wasn't doing
| anything else and had lots of slow disks in it.
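|
| The per-server setup really is about this small (host, repo and
| passphrase handling below are placeholders, not the actual
| setup):
|
|     # /etc/cron.d/borg -- run nightly as root
|     30 2 * * * root /usr/local/sbin/borg-backup.sh
|
|     # borg-backup.sh
|     #!/bin/sh
|     export BORG_REPO="ssh://backup@old-dell-server/./$(hostname).borg"
|     export BORG_PASSPHRASE="$(cat /root/.borg-passphrase)"
|     borg create --stats ::'{now}' /etc /home /var/www
|     borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6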
| m3nu wrote:
| Borg is surprisingly fast and memory-efficient, even when
| compared to Restic, which is written in Go. Recently did a
| benchmark to test the upcoming Borg v2 and this surprised me
| the most:
|
| https://github.com/borgbase/benchmarks
| number6 wrote:
| I am always torn between the two: restic or Borg... how would
| I decide?
| water554 wrote:
| I used them both; I ended up at Borg.
| ThePhysicist wrote:
| Have been using Borg for many years now, it saved me several
| times already when I accidentally deleted stuff I realized I
| still needed later on. What's great is that you can just mount
| your backup repository as a FUSE filesystem, Borg then gives you
| a directory structure containing all your backups over time.
| Personally I use dates to name my backups, e.g. 2022-10-11, so
| when I need to restore something from a specific date I just go
| to the appropriate folder and extract it.
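|
| Concretely, something like this (repo path is a placeholder):
|
|     borg create /backup/repo::2022-10-11 ~/
|     borg mount /backup/repo /mnt/borg
|     ls /mnt/borg              # one directory per archive, e.g. 2022-10-11/
|     cp /mnt/borg/2022-10-11/home/me/some-file ~/restored/
|     borg umount /mnt/borg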
| dang wrote:
| Related:
|
| _Deduplicating Archiver with Compression and Encryption_ -
| https://news.ycombinator.com/item?id=27939412 - July 2021 (71
| comments)
|
| _BorgBackup: Deduplicating Archiver_ -
| https://news.ycombinator.com/item?id=21642364 - Nov 2019 (103
| comments)
| nov21b wrote:
| Just started using the append-only feature to prevent a potential
| hacker from wiping out backups that live on a remote ssh server.
| Combined with restricted ssh access this can be made quite
| secure. I also tested writing backups to my Android phone (as a
| backup target) using Termux and Wireguard; it worked flawlessly
| with a bit of tuning (keeping the VPN alive).
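|
| The server-side restriction is a one-liner in authorized_keys
| (path and key are placeholders):
|
|     command="borg serve --append-only --restrict-to-path /srv/borg/laptop",restrict ssh-ed25519 AAAA... laptop-backup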
| pkulak wrote:
| Append only modes are brilliant. Is there an easy way to hook
| into something like Glacier Deep Archive? That would be super
| cost effective.
| trulyrandom wrote:
| I've been using Borg for years. It's great! The deduplication
| feature allows me to take a "full" backup of my workstation
| _hourly_. Taking frequent backups like this has already saved my
| bacon a number of times in cases where I accidentally mangled or
| deleted a file I didn't mean to touch.
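|
| Dedup makes the hourly runs cheap, and retention is a single
| prune call (the numbers here are just an example policy):
|
|     borg prune --keep-hourly 24 --keep-daily 7 --keep-weekly 4 /backup/repo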
|
| I recently stumbled upon the release notes for the (WIP) v2:
| https://www.borgbackup.org/releases/borg-2.0.html. Seems to
| address quite a few of the pain points of v1.
| mekster wrote:
| Too bad they didn't add S3 endpoints, or anything other than
| SSH, as a remote target in this breaking change; otherwise it
| would've been the best of the bunch.
| notpushkin wrote:
| I'm wondering if there's a less painful way to use Borg with
| https://rclone.org/ than just maintaining a local Borg repo
| and then syncing that.
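|
| The plain two-step version, for reference (paths and remote name
| are placeholders):
|
|     borg create /srv/borg-repo::'{now}' ~/data
|     rclone sync /srv/borg-repo remote:borg-repo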
| m3nu wrote:
| Yepp. Version 2 got rid of lots of legacy code and cleaned up
| CLI args a bit. Will be around 5 to 20% faster than the v1.2
| branch.
|
| https://github.com/borgbase/benchmarks
| mustache_kimono wrote:
| Yeah, last time I tried, it was impressive, but kinda slow
| and limited to execution on a single core.
|
| Eager to take another look at Borg and Kopia, etc.
| beci wrote:
| Why is Kopia missing from your benchmark?
| trulyrandom wrote:
| Did you mean to reply to
| https://news.ycombinator.com/item?id=34153119?
| m3nu wrote:
| Probably. My benchmark was mostly to compare Borg v1.2 and
| v2 and some network optimizations. Restic was a stretch
| goal really.
|
| For Kopia, I do try it once a year, but I still find the
| docs and CLI args confusing. Running the server part behind
| a reverse proxy needs 2x HTTPS and searching the forum to
| get it somewhat working. For a webdav target, the progress
| display doesn't really work and it's not possible to cancel
| a backup run. So for now I'm observing and will retry next
| year.
| aborsy wrote:
| Borg is very good. The V2 repository format will bring in a lot
| of improvements, particularly in cryptography.
|
| Does anyone know when 2.0 will be out of beta and stable?
| m3nu wrote:
| Likely next year after 1-2 RCs. It's at beta4 currently.
| haunter wrote:
| What's good for Windows 10 (NTFS) drives? I've been using the
| Veeam Agent free version [0] for years with no problems
| whatsoever, but I'm curious what other good options there are.
|
| [0] https://www.veeam.com/agent-for-windows-community-
| edition.ht...
| xupybd wrote:
| I use restic on Windows servers but for workstations I use
| backblaze. They have a backup client. It's just too easy. I
| don't have to think about it.
| k8sToGo wrote:
| For Windows image backups I use Macrium.
| sleepytimetea wrote:
| Python source code? No cloud-native API integration? UI?
| non-nil wrote:
| There's Vorta: https://github.com/borgbase/vorta which I quite
| like.
| eointierney wrote:
| Deffo recommend Vorta; good UI, very reliable.
| eternityforest wrote:
| Vorta looks really awesome, maybe awesome enough that I might
| switch from Back in Time.
| SoftTalker wrote:
| Maybe a bit off topic, but what is a good utility for "imaging" a
| Linux system? I have a task to reprovision a system, but we want
| to keep a complete backup of the current system so that it's
| possible to restore it completely, as if it were never touched.
|
| This is more than just data backup, as we would need to
| recover disk partitions/LVM metadata, boot records, etc., as well
| as all the data itself.
| av8avenger wrote:
| Take a look at Clonezilla. Used it many times for the exact
| same purpose. You could run it either on a running system or
| use the live iso they provide.
|
| https://clonezilla.org/
| eointierney wrote:
| Clonezilla is awesome, fast, stable, flexible, and reliable,
| from the Taiwan Supercomputing Centre. I used it lots over a
| decade ago to manage Mac, Windows, and Linux workstations and
| servers.
| akerl_ wrote:
| Do you need to do this once, or 10 times, or 1000 times? How
| big are the disks?
|
| The most boring answer is "connect the disks to something else
| and use `dd` to copy the full blocks from start to finish into
| a file".
| SoftTalker wrote:
| For this specific need, just once. Disk is 1TB but only about
| 350GB used.
|
| 'dd' would have been my thought as well, I've heard of
| Clonezilla also but never used it and not sure it's really
| doing anything appreciably different.
|
| I like the idea of 'dd' because I have a very clear mental
| picture of what it does. Just wasn't sure there was something
| else I might want to look at.
| dividuum wrote:
| If you need multiple versions of such a disk image, a tool
| like restic (or I guess borg too, not sure?) can also
| compress what's provided to it via stdin. So you'd dd
| directly into restic and it will delta-compress against earlier
| backups.
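|
| Roughly like this (repo and device are placeholders); borg
| accepts stdin the same way with "-" as the path:
|
|     dd if=/dev/sdX bs=4M | restic -r /srv/restic-repo backup --stdin --stdin-filename sdX.img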
| akerl_ wrote:
| Yea; this was the heart of my 1/10/1000 question. Once?
| I'd probably just use dd and call it a day. 10 times?
| Probably download clonezilla. 1000 times? Probably
| automate something w/ restic and some kind of object
| storage layer so I don't just have a directory full of
| giant images/deltas somewhere.
| vbezhenar wrote:
| Dumbest approach is dd + compress.
|
| Slightly smarter approach is dd, then zero unused sectors and
| compress.
|
| Both will produce an image which can be restored with dd (or
| mounted offline). The second will be smaller.
|
| They should be run with unmounted partitions.
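|
| A sketch of the second variant (device and mount point are
| placeholders):
|
|     # fill free space with zeros while mounted, then unmount
|     mount /dev/sdX1 /mnt/tmp
|     dd if=/dev/zero of=/mnt/tmp/zerofill bs=1M; rm /mnt/tmp/zerofill
|     umount /mnt/tmp
|     # image the unmounted partition; zeroed free space compresses away
|     dd if=/dev/sdX1 bs=4M status=progress | zstd > sdX1.img.zst
|     # restore: zstdcat sdX1.img.zst | dd of=/dev/sdX1 bs=4M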
| andrewchambers wrote:
| I am the author of bupstash -
| https://github.com/andrewchambers/bupstash which has many
| advantages over borg in my biased opinion (like air gapped
| decryption keys and better performance). Feel free to check it
| out.
___________________________________________________________________
(page generated 2022-12-27 23:00 UTC)