On 2017-08-06, I wrote about that incident in our data center. I also wrote, "fuck performance, I want data integrity". I still think that way.

However, my tool that stores checksums in extended attributes in the file system was a failure, because it was too slow. The tool itself isn't slow. After you run it, you have a filesystem where *every* file has extended attributes. That in itself is not a problem, either. Here's the thing: *Other* tools might get slow because of that. rsync, for example. I create my backups using "rsync -zaXAP ...". That "X" copies extended attributes, and if you have a lot of them, rsync gets really, really slow.

Isn't that a contradiction? "Fuck performance" vs. "that's too slow"? It might seem that way, but remember that my tool was a dirty workaround to begin with. It's not a clean filesystem design that incorporates checksums. It's a hack. And I'm not willing to sacrifice performance for a dirty hack.

There's an alternative: Store the checksums in regular files, or in one big regular file, like an sqlite database. I already have a tool that does that (it's not published). The downside of this approach is that you can't detect file renames, because the checksums are not directly attached to the files. On the other hand, it allows you to scan for files that have vanished ...

Of course, this is a workaround as well. What I want is a filesystem that handles all this.

____________________

Oh, yes, and there's Git. A lot of my private data that I care about is in Git. Git has checksums all over the place and that helps a lot. It does not help in the data center, though.
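
____________________

Just to make the xattr idea concrete, here's a rough sketch of what such a tool could look like on Linux (this is not my actual tool; the attribute name and the script are just an illustration, using Python's os.setxattr/os.getxattr):

    #!/usr/bin/env python3
    # Sketch only: store a file's SHA-256 in an extended attribute
    # (Linux, user namespace). Attribute name is arbitrary.

    import hashlib
    import os
    import sys

    ATTR = "user.checksum.sha256"

    def sha256_of(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    def store(path):
        # Attach the checksum directly to the file.
        os.setxattr(path, ATTR, sha256_of(path).encode())

    def verify(path):
        try:
            stored = os.getxattr(path, ATTR).decode()
        except OSError:
            return None  # no checksum recorded yet
        return stored == sha256_of(path)

    if __name__ == "__main__":
        for p in sys.argv[1:]:
            print(p, verify(p))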
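And here's a rough sketch of the database alternative (again, not my actual tool; the schema is arbitrary). Because the checksums are keyed by path, a renamed file shows up as "vanished" plus a new entry, which is exactly the tradeoff mentioned above:

    #!/usr/bin/env python3
    # Sketch only: keep path -> SHA-256 in an sqlite database and
    # report files that changed or vanished since the last run.

    import hashlib
    import os
    import sqlite3
    import sys

    def sha256_of(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    def scan(root, db="checksums.sqlite"):
        conn = sqlite3.connect(db)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS sums (path TEXT PRIMARY KEY, sha256 TEXT)"
        )
        known = dict(conn.execute("SELECT path, sha256 FROM sums"))
        seen = set()
        for dirpath, _, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                seen.add(path)
                digest = sha256_of(path)
                if path in known and known[path] != digest:
                    print("CHANGED:", path)
                conn.execute(
                    "INSERT OR REPLACE INTO sums VALUES (?, ?)", (path, digest)
                )
        # Vanished: recorded in the database but no longer on disk.
        for path in set(known) - seen:
            print("VANISHED:", path)
        conn.commit()
        conn.close()

    if __name__ == "__main__":
        scan(sys.argv[1])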