On 2017-08-06, I wrote about that incident in our data center. I also wrote, "fuck performance, I want data integrity". I still think that way.

However, my tool that stores checksums in extended attributes in the file system was a failure, because it was too slow. The tool itself isn't slow. After you run it, you have a filesystem where *every* file has extended attributes. That in itself is not a problem, either. Here's the thing: *Other* tools might get slow because of that. rsync, for example. I create my backups using "rsync -zaXAP ...". That "X" copies extended attributes, and if you have a lot of them, rsync gets really, really slow.

Isn't that a contradiction? "Fuck performance" vs. "that's too slow"? It might seem that way, but remember that my tool was a dirty workaround to begin with. It's not a clean filesystem design that incorporates checksums. It's a hack. And I'm not willing to sacrifice performance for a dirty hack.

There's an alternative: Store the checksums in regular files, or in one big regular file, like an sqlite database. I already have a tool that does that (it's not published). The downside of this approach is that you can't detect file renames, because the checksums are not directly attached to the files. On the other hand, it allows you to scan for files that have vanished ...

Of course, this is a workaround as well. What I want is a filesystem that handles all this.

____________________

Oh, yes, and there's Git. A lot of my private data that I care about is in Git. Git has checksums all over the place and that helps a lot. It does not help in the data center, though.
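
____________________

Just to make the xattr idea concrete, here's a rough sketch of what such a tool could look like on Linux (this is not my actual tool; the attribute name and the script are just an illustration, using Python's os.setxattr/os.getxattr):

    #!/usr/bin/env python3
    # Sketch only: store a file's SHA-256 in an extended attribute
    # (Linux, user namespace). Attribute name is arbitrary.

    import hashlib
    import os
    import sys

    ATTR = "user.checksum.sha256"

    def sha256_of(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    def store(path):
        # Attach the checksum directly to the file.
        os.setxattr(path, ATTR, sha256_of(path).encode())

    def verify(path):
        try:
            stored = os.getxattr(path, ATTR).decode()
        except OSError:
            return None  # no checksum recorded yet
        return stored == sha256_of(path)

    if __name__ == "__main__":
        for p in sys.argv[1:]:
            print(p, verify(p))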
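And here's a rough sketch of the database alternative (again, not my actual tool; the schema is arbitrary). Because the checksums are keyed by path, a renamed file shows up as "vanished" plus a new entry, which is exactly the tradeoff mentioned above:

    #!/usr/bin/env python3
    # Sketch only: keep path -> SHA-256 in an sqlite database and
    # report files that changed or vanished since the last run.

    import hashlib
    import os
    import sqlite3
    import sys

    def sha256_of(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    def scan(root, db="checksums.sqlite"):
        conn = sqlite3.connect(db)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS sums (path TEXT PRIMARY KEY, sha256 TEXT)"
        )
        known = dict(conn.execute("SELECT path, sha256 FROM sums"))
        seen = set()
        for dirpath, _, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                seen.add(path)
                digest = sha256_of(path)
                if path in known and known[path] != digest:
                    print("CHANGED:", path)
                conn.execute(
                    "INSERT OR REPLACE INTO sums VALUES (?, ?)", (path, digest)
                )
        # Vanished: recorded in the database but no longer on disk.
        for path in set(known) - seen:
            print("VANISHED:", path)
        conn.commit()
        conn.close()

    if __name__ == "__main__":
        scan(sys.argv[1])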