https://blog.benjojo.co.uk/post/lto-tape-backups-for-linux-nerds

Jan 27 2022

LTO Tape data storage for Linux nerds

[Image: The insides of an LTO-5 tape]

Tape storage is surprisingly not dead!

If you are here then you may be considering using LTO tape as part of your backup or long-term archiving strategy. I'm here mostly to talk you out of it, or at least to make sure that you are aware of what you are walking into.

Is it actually cheaper to use Tape or Disk?

One of the common reasons to look towards LTO tape is that it's much cheaper than hard drives: where a 12TB SATA drive costs around £18.00 per TB (at the time of writing), an LTO-8 tape with the same capacity costs around £7.40 per TB. That's a significant price difference.

So, you may ask, what is the catch?

The drive cost

LTO comes in generations. Around every 3 years the LTO Consortium unveils a new version, which generally comes with around double the capacity of the last generation. Tape cartridges themselves are not forward compatible, but drives are generally backwards compatible: they can write one generation back and read two generations back. This means that if you buy an LTO-6 drive, you should expect to be able to read and write LTO-5 tapes, but only read LTO-4 ones.

If you want to buy a factory-new LTO-8 drive (the newest generation that is readily available at the time of writing) then you are looking at around £3,000. You can often find drives much cheaper on eBay if you can tolerate used drives (often thousands cheaper). Drives do suffer wear and tear (we will get to that later), so a used drive is not quite the same product as a new one. LTO drives do, however, degrade much more slowly than a standard hard disk would.

To illustrate the cost per TB, here is a calculator into which you can plug your own numbers (or use mine at the time of writing) to figure out if tape is actually cheaper for you.
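The same comparison can be sketched as a short shell script. This is only a rough model and not the logic of the calculator on the original page: it assumes the tape cost is one drive plus enough tapes to hold the data, and the disk cost is enough disks plus enough chassis to house them, ignoring power, spares, HBAs and replacements. The default figures are the calculator's own defaults (LTO-7 tapes at 6 TB, 14 TB disks); the variable and function names are made up for illustration.

```shell
#!/bin/sh
# Rough tape-vs-disk cost comparison (a sketch under the assumptions above).
# Defaults are the calculator defaults; override them via the environment.
DRIVE_COST=${DRIVE_COST:-2000}        # cost of the LTO tape drive
TAPE_COST=${TAPE_COST:-50}            # cost of one LTO tape
TAPE_TB=${TAPE_TB:-6}                 # uncompressed tape capacity (TB)
HDD_TB=${HDD_TB:-14}                  # size of one HDD (TB)
HDD_COST=${HDD_COST:-206}             # cost per disk
CHASSIS_SLOTS=${CHASSIS_SLOTS:-16}    # disks per chassis
CHASSIS_COST=${CHASSIS_COST:-1000}    # cost of one chassis

# ceil_div A B: integer ceiling of A / B
ceil_div() { echo $(( ($1 + $2 - 1) / $2 )); }

# tape_total TB: one drive plus enough tapes to hold TB terabytes
tape_total() {
    tapes=$(ceil_div "$1" "$TAPE_TB")
    echo $(( DRIVE_COST + tapes * TAPE_COST ))
}

# hdd_total TB: enough disks, plus enough chassis to house them
hdd_total() {
    disks=$(ceil_div "$1" "$HDD_TB")
    chassis=$(ceil_div "$disks" "$CHASSIS_SLOTS")
    echo $(( disks * HDD_COST + chassis * CHASSIS_COST ))
}

for tb in 50 100 500; do
    echo "$tb TB: tape $(tape_total "$tb"), disk $(hdd_total "$tb")"
done
```

With these defaults the model puts disk ahead at 100 TB (2648 against 2850) and tape well ahead at 500 TB (6200 against 10416), which is the general shape of the argument: the drive is a fixed cost that only pays off once you store enough data.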
(Assuming you are okay with giving up "instant" access to data.)

LTO Data (LTO-7 Default):
  Cost of LTO Tape Drive: 2000
  Cost of LTO Tape: 50
  Uncompressed LTO Tape Capacity (TB): 6

HDD Data:
  Size of HDD (TB): 14
  Cost per Disk: 206
  Capacity of HDD Chassis (Slots): 16
  Cost of HDD Chassis: 1000

LTO Drive downsides

The drives are noisy, so if you are planning to work next to your LTO drive you should be prepared for noise and vibration. Here is an example of my LTO-6 drive: [video]

Tape cartridges physically stack reasonably well, and if you are buying enough tapes your supplier can barcode them for you so that you can easily tell them apart.

If you are planning to use your tapes for long-term archiving, tapes supposedly have a 30 year life (obviously no LTO tape has been around for 30 years to verify this claim), but only if you keep them at their preferred temperature and humidity. The packaging on the tape will generally state what this sweet spot is; my LTO-6 tapes suggest 16C to 25C at 20% - 50% RH.

When buying tapes, as long as the tape generation matches the drive (or falls within the drive's backward compatibility, as mentioned before) you should be fine. The tape brand and the drive brand do not have to match.

While the write speeds of tape drives are generally quite fast (100MByte/s+ for LTO 4-6, 300MByte/s+ for LTO 7-9), drives can slow down in some directions as they age (or as the tape cartridge itself gets older). In some cases, when the drive is writing slower in one direction it will also record less data, meaning the amount of data you can write per cartridge may be reduced by the drive's age. You can test for this once you have a drive, as this sort of degradation is generally detected as a drive failure.

Used tapes?

[Image: A box of used LTO-5 tapes]

If you are looking for cheap tapes, you may be able to find older-generation LTO tapes (4, 5, 6) very cheaply from IT equipment recyclers.
I have a stack of 150 LTO-5 tapes that I bought for less than 200 GBP, meaning that the cost per TB was immensely low (hovering around 1.10 GBP per TB).

Not all used tapes are the same, however! Depending on the competency of your recycler, the tapes may be sold in an unusable state. LTO tapes require special factory-recorded data for aligning the tape head, so if a cartridge has been magnetically erased it is permanently useless, no matter what you do. You can find out more on mass erasing LTO tapes here

---------------------------------------------------------------------

I have been convinced anyway; I want to buy an LTO drive!

Ok fine. Here is what you need to know if you want to buy a drive. For this example, we will assume that you are buying an LTO-5 or better drive, since LTO-4 and lower have stranger formats and tools that I have no experience with.

First, drive type. You can generally buy 4 types of LTO drive:

[Image: Types of LTO drive]

In general, I would always recommend that you go for an external SAS drive. These take the least effort to get working. They have a C13 (same as most desktops) power input, and a SAS SFF-8088 (rolls off the tongue, I know!) socket on the back. This can connect to a machine with a PCI-E SAS card (generally cheap) that will trivially auto-detect it.

Other options include the half-height drive (which fits in a 5.25 inch bay normally used for an optical drive), which has an internal SAS SFF-8482 connector on the back. It looks like a SATA port, but it is not.

The final options in the diagram above are (as far as I understand) for autoloaders (better known as tape libraries). The full-height units often come with SAS SFF-8482 connectors as well, while the sleeker ones (bottom of the image) most often come with Fibre Channel (FC) SFP connectors.
This is because most tape libraries use FC as the transport between the machine putting data on the tape and the physical drive itself. Fibre Channel cards go pretty cheap on the second-hand market, and I've covered some tricks with them before.

You may find that the non-external types need decent cold airflow to work correctly. This will not be an issue if you are integrating a drive into a server chassis or a dedicated tape chassis, but it might be a problem if you are installing it into a regular ATX chassis.

Tape drive and cartridge health

Most tape drives seem to be very similar to each other (if not the same??), so almost all of the tools for working with them are the same across brands. There are two holy tools I use for Linux debugging of tape drives and tape cartridges: you will want to download a copy of the IBM Tape Diagnostic Tool (ITDT) and the xTalk tool.

I personally find that ITDT is great for checking if the drive is working right, and xTalk is great for dumping out information on drive stats and media health.
Here is an example of xTalk's "Dump All Pages" output for drive health:

    Log Page 14
    14 - Device Statistics Log
    Lifetime media loads: 1300
    Lifetime cleaning operations: 41
    Lifetime power on hours: 41194
    Lifetime media {tape} motion hours: 9425
    Lifetime meters of tape motion: 72926447
    Media motion hours since last cleaning: 44
    Media motion hours since second to last cleaning: 50
    Media motion hours since third to last cleaning: 93
    Lifetime power cycles: 51
    Volume loads since last paramater reset: 4
    Hard write errors: 0
    Hard read errors: 0
    Duty cycle sample time: 2888989
    Read duty cycle: 3
    Write duty cycle: 0
    Activity duty cycle: 4
    Volume not present duty cycle: 90
    Ready duty cycle: 6
    Drive manufacturer serial number: xxx
    Drive serial number: xxx
    Medium removal prevented: 0
    Maximum recommended mechanism temperature exceeded: 0

When buying a used drive, the "Lifetime media {tape} motion hours" figure is a useful metric to gauge the wear on the drive head.

xTalk can also dump data out of the RFID chip inside the cartridge, which contains usage data. This includes things like how many times the cartridge has been put inside a drive, how many times the tape has passed over a drive head, and what the lifetime reads/writes are on the cartridge itself. This is not too different to S.M.A.R.T data.
[Image: LTO RFID Chip]

The xTalk output for a cartridge looks like this (for a tape that has mild issues):

    Log Page 17
    17 - Volume Statistics Log
    Page Valid : 1
    Thread Count : 26
    Total data sets written : 1579318
    Total write retries : 57
    Total unrecovered write errors : 0
    Total suspended writes : 2
    Total fatal suspended writes : 0
    Total data sets read : 1060455
    Total read retries : 1171
    Total unrecovered read errors : 3
    Last mount unrecovered write errors : 0
    Last mount unrecovered read errors : 0
    Last mount megabytes written : 0
    Last mount megabytes read : 1676
    Lifetime megabytes written : 3904137
    Lifetime megabytes read : 2621487
    Last load write compression ratio : 0
    Last load read compression ratio : 99
    Medium mount time : 0
    Medium ready time : 0
    Total native capacity : 1520000
    Total used native capacity : 337522
    Volume serial number : MF0WU3YFJ4
    Tape lot identifier : G5AA135D
    Volume barcode : A11952L5
    Volume manufacturer : FUJIFILM
    Volume license code : U107
    Volume personality : Ultrium-5
    Write Protect : 0
    WORM : 0
    Maximum recommended tape path temperature exceeded: 0
    BOM passes : 922
    Middle of tape passes : 463
    First encrypted logical object identifiers Partition 0 : FFFFFFFFFFFFh
    First unencrypted logical object on the EOP side of the first encrypted logical object identifier Partition 0 : FFFFFFFFFFFFh
    Approximate native capacity of partition Partition 0 : 1520000
    Approximate used native capacity of partition Partition 0 : 337522
    Approximate remaining native capacity to early warning of partitions Partition 0 : 1182484

Cleaning

Drives do need cleaning from time to time. This can be as infrequent as every 100 or so "motion hours", or more often. Generally, I do it whenever the drive is above 50 "motion hours" and its performance is questionable.

Cleaning requires a special cartridge that costs about the same as a new tape; as far as I know, these cartridges are compatible with all drives. They are generally "good" for about 50 cleans.
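The motion-hours part of that rule of thumb can be checked mechanically against a saved xTalk dump. What follows is a minimal sketch, assuming the dump has been saved as text with one "name: value" field per line, as in the Device Statistics sample earlier. hours_since_cleaning and needs_cleaning are hypothetical helper names, and the 50-hour default mirrors the habit described above rather than any vendor threshold. It only looks at motion hours; judging whether performance is "questionable" is still a manual call.

```shell
#!/bin/sh
# hours_since_cleaning: read a Device Statistics dump on stdin and print the
# value of the "Media motion hours since last cleaning" field.
# Assumes one "name: value" field per line.
hours_since_cleaning() {
    awk -F': *' '/^ *Media motion hours since last cleaning/ { print $2; exit }'
}

# needs_cleaning HOURS [THRESHOLD]: succeed when HOURS is above THRESHOLD.
# The default threshold of 50 is a rule of thumb, not a vendor figure.
needs_cleaning() {
    [ "$1" -gt "${2:-50}" ]
}

# Example input: two lines in the same shape as the sample dump
sample='Lifetime power on hours: 41194
Media motion hours since last cleaning: 44'

hours=$(printf '%s\n' "$sample" | hours_since_cleaning)
if needs_cleaning "$hours"; then
    echo "Cleaning suggested ($hours motion hours since last clean)"
else
    echo "No cleaning needed yet ($hours motion hours since last clean)"
fi
```

For the sample values this prints "No cleaning needed yet (44 motion hours since last clean)". In real use you would pipe the saved dump file into hours_since_cleaning instead of the inline sample.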
Compression and Encryption

It is wise to encrypt the data going on to tape. Since a tape takes a very long time to erase (you would have to overwrite the whole tape), disposing of a tape can be risky even if it's slightly broken. LTO-4 and later drives have built-in hardware encryption; however, I would steer away from it and instead just encrypt the data yourself (possibly with age, the tool I helped make!).

Like most things, you should also consider compressing your data before encrypting and writing it to tape. LTO tape capacities are often quoted as a "compressed capacity", which is a little cheeky since it assumes basically over a 50% compression ratio. That is not at all likely to hold if you are writing video or other lossily compressed media such as images to the tape.

I generally run my data through zstd to compress and then age to encrypt. Zstd and age are quite fast, and I've not found them to impede performance noticeably.

Actually writing data to the tape

    root@testtop:~# ls -alh /dev/tape/by-id/
    total 0
    drwxr-xr-x 2 root root 120 Dec 27 17:10 .
    drwxr-xr-x 4 root root 80 Dec 27 17:10 ..
    lrwxrwxrwx 1 root root 9 Dec 27 17:10 scsi-xxxxxx -> ../../st0
    lrwxrwxrwx 1 root root 10 Dec 27 17:10 scsi-xxxxxx-nst -> ../../nst0

Tape drives show up in Linux as two character devices, /dev/st0 and /dev/nst0 (the trailing number depends on how many tape drives the system has detected). Unless you are writing one huge thing (i.e. the whole tape) at once, you will want to use the /dev/nst0 device, as it will not automatically rewind the tape when the program writing the data releases the file descriptor.

Unlike disks, these devices do not enjoy seeking of any kind, so you generally end up writing streaming file formats to tape. Unsurprisingly, this is exactly what the Tape ARchive (.tar) format is for.

If you are unable to use tar for whatever reason and really need something that looks like a file system, there is LTFS.
I have never attempted to use LTFS myself, and would likely only really attempt it if I was running on Windows.

Need for speed

Having networking faster than 1GBit/s is useful for this. If you can get cheap 10GBit/s Ethernet (even if it's point to point), it might be worth it.

Writing out a full tape can take quite a long time, even if you are writing at full speed. As LTO capacity has gone up, the write speed has not kept pace with it, meaning an LTO-5 tape takes around 3 hours to write, but an LTO-8 tape takes a whopping 9 hours!

Block size woes

Another issue to keep in mind with tape is keeping the drive well fed with large blocks of data. While most standard disks have block sizes of 512 bytes or 4096 bytes, tapes enjoy a much larger block size of 512KB or higher.

In addition, a drive (and the tape inside it) takes more wear and tear if it has to stop and stall. So running your backups through mbuffer, to buffer a section of your data in RAM before writing it out to tape, is a good idea to "smooth out" the gaps (even short ones) in data being delivered to the drive.

Later-generation drives are able to deal with slower input rates by physically slowing down the tape. However, I generally try to avoid this happening and instead have mbuffer hold a lot of data (6GB) and write out as much as it can at full speed.

Here is what I use (to buffer incoming network data):

    ncat -l -p 1337 | mbuffer -P 80 -m 6G -s 524288 -o /dev/nst0

It is worth pointing out that if you write to tape at large block sizes, you may find you are unable to read the tape device back, getting a cryptic "Cannot allocate memory" error. This is because the program reading from the tape device is not reading with a big enough buffer. You can work around this by using dd with bs=512k and piping the output into the desired program.

If you are streaming more than just one thing to a tape, you will want to write an EOF mark to the tape.
This allows you to read a whole "file" from the tape device (think dd) in one go. When that section is finished, the program will exit cleanly, and you can start it again (assuming you are using /dev/nst0) to get the next section. You can write the mark using the mt command, with mt -f /dev/nst0 weof 1.

    root@testtop:~# echo "Hello world!" > /dev/nst0
    root@testtop:~# mt -f /dev/nst0 weof 1
    root@testtop:~# dmesg > /dev/nst0
    root@testtop:~# mt -f /dev/nst0 weof 1
    root@testtop:~# mt -f /dev/nst0 rewind
    root@testtop:~# dd if=/dev/nst0 bs=1524288 status=progress > FileA
    0+1 records in
    0+1 records out
    13 bytes copied, 0.0106532 s, 1.2 kB/s
    root@testtop:~# dd if=/dev/nst0 bs=1524288 status=progress > FileB
    0+0 records in
    0+0 records out
    0 bytes copied, 0.00441687 s, 0.0 kB/s
    root@testtop:~# dd if=/dev/nst0 bs=1524288 status=progress > FileB
    0+15 records in
    0+15 records out
    61257 bytes (61 kB, 60 KiB) copied, 0.0415833 s, 1.5 MB/s
    root@testtop:~# head FileA
    Hello world!
    root@testtop:~# head FileB
    [ 0.000000] Linux version 5.7.8-benjojo (root@airmail) (gcc version 8.3.0 (Debian 8.3.0-6), GNU ld (GNU Binutils for Debian) 2.31.1) #1 SMP Wed Jul 15 20:29:31 BST 2020
    ...

Once you are done with a tape in the system, you can ask the tape drive to rewind and eject it by running mt -f /dev/nst0 offline

---------------------------------------------------------------------

If you want to stay up to date with the blog you can use the RSS feed or you can follow me on Twitter.

Until next time!