dedup is a simple data deduplication program.

dedup handles only one file at a time, so to deduplicate a directory
tree, pack it with tar first and pipe the archive to dedup:

    tar -cf - ~/dir | dedup -r ~/bak -m "$(date)"

This creates three files in the ~/bak directory: .cache, .snapshots
and .store.  The store file contains all the unique blocks.  The
snapshots file records every revision of the files that have been
deduplicated.  Each revision is identified by its SHA-256 hash.  The
cache file is used only to speed up block comparison.
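
To make the roles of the store and snapshots files concrete, here is a
minimal Python sketch of block-level deduplication.  It is not dedup's
actual code or on-disk format.  The fixed 4096-byte block size, the
Repo class and its method names are all assumptions made for
illustration, as is hashing the whole input to name a revision.  The
cache is left out; a plain dictionary lookup stands in for it.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical fixed block size; the real tool may chunk differently


class Repo:
    """Toy in-memory stand-in for the store and snapshots files."""

    def __init__(self):
        self.store = {}      # block hash -> block bytes (unique blocks only)
        self.snapshots = {}  # revision hash -> ordered list of block hashes

    def add(self, data: bytes) -> str:
        """Deduplicate data and return the hash that names the new revision."""
        block_hashes = []
        for off in range(0, len(data), BLOCK_SIZE):
            block = data[off:off + BLOCK_SIZE]
            h = hashlib.sha256(block).hexdigest()
            # A block that was seen before is not stored a second time.
            self.store.setdefault(h, block)
            block_hashes.append(h)
        # Assumption: a revision is named by the SHA-256 of the whole input.
        rev = hashlib.sha256(data).hexdigest()
        self.snapshots[rev] = block_hashes
        return rev

    def extract(self, rev: str) -> bytes:
        """Rebuild the original data by concatenating the revision's blocks."""
        return b"".join(self.store[h] for h in self.snapshots[rev])


if __name__ == "__main__":
    repo = Repo()
    first = bytes(2 * BLOCK_SIZE) + b"A" * BLOCK_SIZE  # two identical zero blocks, one "A" block
    second = first + b"B" * BLOCK_SIZE                 # same data plus one new block
    r1 = repo.add(first)
    r2 = repo.add(second)
    print(len(repo.store))              # 3 unique blocks: zeros, "A", "B"
    print(repo.extract(r2) == second)   # True
```

The second revision shares three of its four blocks with the first, so
adding it grows the store by a single block.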

To list all known revisions run:

    dedup -r ~/bak -l

You will get a list of hashes.  Each hash corresponds to a single file
(in this case, a tar archive).

To extract a file from the deduplicated store, pass one of the listed
hashes to -e:

    dedup -r ~/bak -e <hash> > dir.tar

Cheers,
sin
