tbackup-someone.txt - monochromatic - monochromatic blog: http://blog.z3bra.org
# Backup, someone ?

24 September, 2014

**FRIENDLY REMINDER: Have you backed up your data today ?**

If you've never seen this sentence, then write it down, and put it somewhere
visible.

<q>Why ?</q> you ask ? Because. Having multiple copies of your data is important
if you plan on keeping it for the long term.
You know, a hard drive will not tell you: <q>Hey ! I'm gonna die in two days
around 2 am, please copy me somewhere else.</q>. There are so many ways to lose
data... And you'll experience some of them, trust me !

Anyway, back to the topic ! In this post, I'm gonna tell you a *simple* way to
back up your data. All you need is the following:

* An external storage medium (USB key, hard drive, tapes, ...)
* An archiver (cpio, tar, ar, ...)
* A compressor (gzip, bzip2, xz, ...)
* Some shell glue

## Preparation

First, you need to figure out what you want to back up: configs ? multimedia ?
code ? For the purpose of this article, let's say I want to back up all my
images, located in `/data/img`. Let's figure out the size of this directory:

    ── du -sh /data/img
    5.5G /data/img/

This could fit on my USB key. Let's mount and prepare it. In the meantime, we
will create a user dedicated to the backup process:

    # useradd -M -g users backup
    # mount /dev/sdd1 /mnt
    # mkdir /mnt/backup
    # chown backup:users /mnt/backup

Now the drive is ready to accept backups. Let's see how to create them.
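
The "does it fit ?" check can also be scripted, which is handy once the backup runs unattended. This is only a sketch: `fits` is a made-up helper name, and it assumes `du -sk` and `df -Pk` both report kilobytes, as POSIX describes. It compares the uncompressed size, so it errs on the safe side.

```shell
#!/bin/sh
# fits SRC DEST (hypothetical helper): succeed when the disk usage of SRC
# is smaller than the free space on the filesystem holding DEST

fits() {
    # size of SRC in kilobytes (first tab-separated field of du's output)
    need=$(du -sk "$1" | cut -f1)

    # 4th column on the second line of `df -P` is the available space
    free=$(df -Pk "$2" | awk 'NR == 2 { print $4 }')

    test "$need" -lt "$free"
}
```

With the paths used in this post, that would be something like `fits /data/img /mnt/backup || echo "not enough room on the key"`.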

## Backing up

What's a backup, again ?

> In information technology, a backup, or the process of backing up, refers to
> the copying and archiving of computer data so it may be used to restore the
> original after a data loss event. The verb form is to back up in two words,
> whereas the noun is backup.

**RECOVER**, that's the only word that matters. A backup is useless if you can't
recover data from it. PERIOD.

In my case, I chose `cpio`, because I find it simple to recover data from a cpio
archive. We'll see later how to do so. If you find it [easier to do with
tar](http://xkcd.com/1168/), feel free to adapt the following to your liking.

So what's the plan ? First, we'll create an archive containing all the files we
want. Then, compress said archive to save some space, and finally, manage
those backups to keep multiple copies.

### Archiving

For this task, I chose `cpio`, which takes filenames on stdin, and writes an
archive to stdout. The fact that it outputs to stdout gives us the ability to
compress the archive while it's being created. A good thing about this is that
the archive never has to sit in RAM as a whole ! Indeed, a pipe only holds a
small, fixed-size buffer (64 KiB by default on Linux): when it is full, the
writer waits until the reader has consumed some data, and so on... With bash,
`ulimit -a` reports the pipe size in 512-byte blocks. Anyways:

    ── find /data/img -type f | cpio -o | gzip -c > /mnt/backup/images.cpio.gz

And the archive is created and compressed ! Pretty easy isn't it ? Let's see how
to manage them now.
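
Since the only thing that matters is recovering, it's worth checking the archive right after creating it. The helper below is a sketch, not part of the original setup: `verify_archive` is a name I made up, and it assumes you call it from the same directory and with the same path you gave to `find` when creating the archive, so that the names on disk and in the archive compare equal.

```shell
#!/bin/sh
# verify_archive SRC ARCHIVE (hypothetical helper): succeed when ARCHIVE is a
# sane gzip file holding exactly the regular files found under SRC

verify_archive() {
    # gzip -t reads the whole file and checks its integrity
    gzip -t "$2" || return 1

    # file names on disk, and file names stored in the archive
    want=$(find "$1" -type f | sort)
    have=$(gzip -cd "$2" | cpio -it 2>/dev/null | sort)

    test "$want" = "$have"
}
```

It only compares file names, not contents, so it catches a truncated or empty archive rather than silent corruption of a single photo.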

### Managing

Be creative for this part ! You can either use `$(date +%Y-%m-%d)` as a name for
the backup, write a crawler to rename them based on their timestamp, or maybe
use some rotating script, like the one written by
[ypnose](http://ywstd.fr/blog/2014/backup-snippet.html).
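
As an illustration of the first idea, here is a minimal sketch of date-stamped names with a cleanup step. Both `stamped_name` and `prune` are hypothetical helpers, and the `BASE-YYYY-MM-DD.EXT` naming scheme is my assumption, not something the rotating script relies on.

```shell
#!/bin/sh
# stamped_name BASE EXT (hypothetical helper): print BASE-YYYY-MM-DD.EXT
stamped_name() {
    printf '%s-%s.%s\n' "$1" "$(date +%Y-%m-%d)" "$2"
}

# prune DIR BASE KEEP (hypothetical helper): delete all but the KEEP most
# recent BASE-* files in DIR.
# ISO dates sort like plain strings, so a reverse sort puts the newest first,
# and tail then skips the KEEP first names and prints the ones to delete.
prune() {
    for f in "$1/$2"-*; do
        # skip the unexpanded pattern when nothing matches
        [ -e "$f" ] && printf '%s\n' "$f"
    done |
        sort -r |
        tail -n +"$(( $3 + 1 ))" |
        while IFS= read -r old; do
            rm -- "$old"
        done
}
```

You would then write the archive to `/mnt/backup/$(stamped_name images cpio.gz)` and call `prune /mnt/backup images 7` to keep a week of backups.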

I modified the script to allow an automatic rotation of files, in case the file
number limit is reached. Here it is:

    #!/bin/sh
    #
    # z3bra - (c) wtfpl 2014
    # Backup a file, and rotate backups : file.0.BAK - file.1.BAK, ...
    #
    # Based on an original idea from Ypnose. Thanks mate !
    # <http://ywstd.fr/blog/2014/bakup-snippet.html>

    EXT=${EXT:-BAK}     # extension used for backup
    LIM=${LIM:-9}       # highest backup number to keep
    PAD=${PAD:-0}       # number to start with

    usage() {
        cat <<EOF
    usage: $(basename "$0") [-hrv] <file>
        -h : print this help
        -r : perform a rotation if \$LIM is reached
        -v : verbose mode
    EOF
    }

    # report action performed in verbose mode
    log() {
        # do not log anything if not in $VERBOSE mode
        test -z "$VERBOSE" && return

        echo "[$(date +%Y-%m-%d)] - $*"
    }

    # rotate backups to leave moar room
    rotate() {
        # do not rotate if the rotate flag wasn't provided
        test -z "$ROTATE" && return

        # delete the oldest backup
        rm -f "${FILE}.${PAD}.${EXT}"

        # move every file down one place
        for N1 in $(seq "$PAD" "$LIM"); do
            N2=$(( N1 + ROTATE ))

            # don't go any further
            test -f "${FILE}.${N2}.${EXT}" || return

            # move file down $ROTATE place
            log "${FILE}.${N2}.${EXT} -> ${FILE}.${N1}.${EXT}"
            mv "${FILE}.${N2}.${EXT}" "${FILE}.${N1}.${EXT}"
        done
    }

    # actually archive files
    archive() {
        # test the presence of each version, and create the first missing one
        for N in $(seq "$PAD" "$LIM"); do
            if test ! -f "${FILE}.${N}.${EXT}"; then

                # copy the file under its new name
                log "Created: ${FILE}.${N}.${EXT}"
                cp "${FILE}" "${FILE}.${N}.${EXT}" || exit 1

                exit 0
            fi
        done
    }

    while getopts "hrv" opt; do
        case $opt in
            h) usage; exit 0 ;;
            r) ROTATE=1 ;;
            v) VERBOSE=1 ;;
            *) usage; exit 1 ;;
        esac
    done

    shift $((OPTIND - 1))

    test $# -lt 1 && usage && exit 1

    FILE=$1

    # in case the limit is reached, remove the oldest backup
    test -f "${FILE}.${LIM}.${EXT}" && rotate

    # if rotation wasn't performed, we'll not archive anything
    test -f "${FILE}.${LIM}.${EXT}" || archive

    echo "Limit of $LIM .$EXT files reached, run with -r to force rotation"
    exit 1

Now, to "archive" a file, all you need to do is:

    ── cd /mnt/backup
    ── backup.sh -r images.cpio.gz

Each run adds one copy, so after ten runs you end up with the following tree:

    ── ls /mnt/backup
    images.cpio.gz        images.cpio.gz.3.BAK  images.cpio.gz.7.BAK
    images.cpio.gz.0.BAK  images.cpio.gz.4.BAK  images.cpio.gz.8.BAK
    images.cpio.gz.1.BAK  images.cpio.gz.5.BAK  images.cpio.gz.9.BAK
    images.cpio.gz.2.BAK  images.cpio.gz.6.BAK

Aaaaaand we're done ! Wrap it all in a crontab, and the backup process will
start:

    # start a backup at 2 am, every day
    0 2 * * * find /data/img -type f |cpio -o |gzip > /mnt/backup/images.cpio.gz

    # rotate backups limiting their number to 7 (a whole week)
    0 3 * * * cd /mnt/backup && LIM=6 backup.sh -r images.cpio.gz
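
One thing to keep in mind with a cron job like this: the redirection truncates the previous archive before the new one is complete, and nothing checks that the key is mounted. A possible wrapper is sketched below. `safe_backup` is a hypothetical helper, and the "missing directory means unmounted key" test only holds because `/mnt/backup` was created on the key itself.

```shell
#!/bin/sh
# safe_backup SRC DEST (hypothetical helper): archive SRC into DEST, but only
# replace DEST once the new archive is complete

safe_backup() {
    src=$1 dest=$2 tmp=$2.tmp

    # /mnt/backup lives on the key: if the directory is missing, the key is
    # most likely not mounted, so give up instead of filling the wrong disk
    [ -d "$(dirname "$dest")" ] || return 1

    # write to a temporary name, and check the result before renaming it
    if find "$src" -type f | cpio -o 2>/dev/null | gzip -c > "$tmp" &&
        gzip -t "$tmp"
    then
        mv "$tmp" "$dest"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Saved in a script that ends with `safe_backup /data/img /mnt/backup/images.cpio.gz`, it could replace the first crontab line. Note that the exit status of a pipeline is that of its last command, so a failing `cpio` still goes unnoticed here.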

Should be enough for now. But here comes the most important part...

## Restoring

This is the most important one, but not the trickiest, don't worry. It's
Friday, and your friends are arriving in a few minutes to see the photos from
your last trip. Before they arrive, you decide to clean up the directory, and
notice a `.filedb-47874947392` created by your camera in said directory.
Let's remove it:

    ── cd /data/img/2014/trip_to_sahara/
    ── ls -a .filedb-*
    .filedb-47874947392
    ── rm .filedb- *
    rm: can't remove '.filedb-': No such file or directory
    ── ls -la .
    total 0
    drwxr-xr-x 1 z3bra users 402 Sep 24 00:41 .
    drwxr-xr-x 1 z3bra users 402 Sep 24 00:41 ..
    -rw-r--r-- 1 z3bra users   0 Sep 24 00:58 .filedb-47874947392

<q>Oh god.. Why..?</q>
This shitty space between the '-' and the '\*' in your `rm` command is going to
fuck your presentation up ! `rm` received two arguments: `.filedb-`, which
doesn't exist, and `*`, which matches every photo but not the dotfile.
Fortunately, you made a backup this morning at 2 am... Let's restore your whole
directory from it:

    ── mount /dev/sdd1 /mnt
    ── cd /mnt/backup
    ── ls -la
    total 0
    drwxr-xr-x 1 z3bra users 402 Sep 10 00:41 .
    drwxr-xr-x 1 z3bra users 402 Sep 10 00:41 ..
    -rw-r--r-- 1 z3bra users   0 Sep 19 02:01 images.cpio.gz
    -rw-r--r-- 1 z3bra users   0 Sep 15 03:00 images.cpio.gz.0.BAK
    -rw-r--r-- 1 z3bra users   0 Sep 16 03:00 images.cpio.gz.1.BAK
    -rw-r--r-- 1 z3bra users   0 Sep 17 03:00 images.cpio.gz.2.BAK
    -rw-r--r-- 1 z3bra users   0 Sep 18 03:00 images.cpio.gz.3.BAK
    -rw-r--r-- 1 z3bra users   0 Sep 19 03:00 images.cpio.gz.4.BAK
    -rw-r--r-- 1 z3bra users   0 Sep 13 03:00 images.cpio.gz.5.BAK
    -rw-r--r-- 1 z3bra users   0 Sep 14 03:00 images.cpio.gz.6.BAK

Today is Friday, 19 September. As you can see from the timestamps, backups
number 5 and 6 are from last week. The backup from this morning is number 4,
and the latest is the one without any number.
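
Timestamps tell you when a backup was made, not what's inside. To find out which archives still hold the files you're after, you can list their contents instead of extracting them. A rough sketch follows: `which_backups` is a made-up helper, and it assumes the archives follow the `*.cpio.gz` and `*.cpio.gz.N.BAK` naming used above.

```shell
#!/bin/sh
# which_backups DIR NAME (hypothetical helper): print every gzipped cpio
# archive in DIR that holds at least one member whose path contains NAME

which_backups() {
    for a in "$1"/*.cpio.gz "$1"/*.cpio.gz.*.BAK; do
        # skip patterns that matched nothing
        [ -f "$a" ] || continue

        # cpio -it lists the member names without extracting anything
        if gzip -cd "$a" 2>/dev/null | cpio -it 2>/dev/null | grep -qF -- "$2"
        then
            printf '%s\n' "$a"
        fi
    done
}
```

For the story above, `which_backups /mnt/backup trip_to_sahara` would print the archives you can restore the photos from.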

`cpio` allows extracting files from an archive using the following syntax:

    ── cpio -i -d < archive.cpio

`-i` asks for an extraction, while `-d` tells `cpio` to recreate the directory
tree if it does not exist. Check the [wikipedia
article](http://wikipedia.org/cpio) for more explanations on how it works.
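
If you'd rather look at the files before putting them back in place, you can extract into a scratch directory first. The sketch below makes one big assumption: the archive stores *relative* names (created with something like `cd /data && find img -type f | cpio -o`). With absolute names, some `cpio` implementations write straight to the original location. `restore_to` is a hypothetical helper name.

```shell
#!/bin/sh
# restore_to ARCHIVE DIR [PATTERN] (hypothetical helper): extract a gzipped
# cpio archive into DIR, optionally keeping only the members matching PATTERN

restore_to() {
    archive=$1 dir=$2
    mkdir -p "$dir" || return 1

    # gzip runs in the caller's directory, cpio in DIR: the subshell keeps
    # the cd from leaking into the calling shell
    if [ $# -ge 3 ]; then
        gzip -cd "$archive" | (cd "$dir" && cpio -id "$3" 2>/dev/null)
    else
        gzip -cd "$archive" | (cd "$dir" && cpio -id 2>/dev/null)
    fi
}
```

For example, `restore_to /mnt/backup/images.cpio.gz /tmp/restore 'img/2014/trip_to_sahara/*'` would leave the photos under `/tmp/restore`, ready to be checked and copied back.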

So, to restore our lost directory you'd proceed like this:

    # the archive was created from an absolute path, and cpio restores files
    # relative to the current directory, so let's move to root, to restore
    # files directly in place
    ── cd /

    # you can pass globbing patterns to cpio, so that it only restores what you
    # want (quote them, so that the shell leaves them alone). Don't forget to
    # decompress the archive first
    ── gzip -cd /mnt/backup/images.cpio.gz | cpio -ivd 'data/img/2014/trip_to_sahara/*'
    data/img/2014/trip_to_sahara/IMG-0001.JPG
    data/img/2014/trip_to_sahara/IMG-0002.JPG
    data/img/2014/trip_to_sahara/IMG-0003.JPG
    data/img/2014/trip_to_sahara/IMG-0004.JPG
    data/img/2014/trip_to_sahara/IMG-0005.JPG
    data/img/2014/trip_to_sahara/IMG-0006.JPG
    data/img/2014/trip_to_sahara/.filedb-47874947392
    23 blocks

    ── ls /data/img/2014/trip_to_sahara
    IMG-0001.JPG  IMG-0003.JPG  IMG-0005.JPG
    IMG-0002.JPG  IMG-0004.JPG  IMG-0006.JPG

    # be careful this time !
    ── rm /data/img/2014/trip_to_sahara/.filedb-47874947392

And it's all good ! Don't forget to keep your drive safe, and duplicate it if
you can, just in case.

Hope it will be useful to someone, cheers !