
The C programming language.

Large file synchronization

Efficiently synchronize copies of a large sparse file locally. I deal with a large number of large sparse files because of virtualization and other technologies. Despite their size, often only a small number of their blocks hold data and, of these, only a few have changed and need to be backed up. Because my backup device is a log-based (snapshotting) file system on USB 2, I want to write a block only if absolutely necessary.

So what's the solution?  Some simple custom code that

  1. checks that both file sizes are identical;
  2. verifies that some metadata has changed (i.e., time stamp, permissions, or owner/group);
  3. reads both files block-by-block;
  4. writes only changed blocks to the destination file, and
  5. updates any changed metadata.
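
The five steps above could be sketched in C roughly as follows. This is a minimal sketch under stated assumptions, not the actual code behind this project: it assumes a POSIX.1-2008 system (`pread`, `pwrite`, `futimens`, `st_mtim`), a fixed 4096-byte block, and that a size mismatch is treated as an error. The names `sync_changed_blocks`, `read_full`, `write_full` and `BLOCK_SIZE` are hypothetical and chosen only for illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* pread, pwrite, futimens, st_mtim */
#define _FILE_OFFSET_BITS 64     /* 64-bit off_t for files over 2 GiB */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

enum { BLOCK_SIZE = 4096 };  /* assumed block size, not taken from the text */

/* Read up to len bytes at offset off, retrying short reads.
 * Returns the byte count (short only at end of file) or -1 on error. */
static ssize_t read_full(int fd, void *buf, size_t len, off_t off)
{
    size_t done = 0;
    while (done < len) {
        ssize_t n = pread(fd, (char *)buf + done, len - done, off + (off_t)done);
        if (n < 0) return -1;
        if (n == 0) break;
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* Write exactly len bytes at offset off. Returns 0 on success, -1 on error. */
static int write_full(int fd, const void *buf, size_t len, off_t off)
{
    size_t done = 0;
    while (done < len) {
        ssize_t n = pwrite(fd, (const char *)buf + done, len - done, off + (off_t)done);
        if (n < 0) return -1;
        done += (size_t)n;
    }
    return 0;
}

/* Returns the number of blocks rewritten (0 if the metadata already matched),
 * or -1 on error or if the two sizes differ. */
long sync_changed_blocks(const char *src_path, const char *dst_path)
{
    char a[BLOCK_SIZE], b[BLOCK_SIZE];
    struct stat ss, ds;
    struct timespec times[2];
    long written = -1;
    int src, dst;

    assert(src_path != NULL && dst_path != NULL);
    src = open(src_path, O_RDONLY);
    dst = open(dst_path, O_RDWR);
    if (src < 0 || dst < 0 || fstat(src, &ss) != 0 || fstat(dst, &ds) != 0)
        goto out;
    if (ss.st_size != ds.st_size)                       /* step 1 */
        goto out;
    if (ss.st_mtim.tv_sec == ds.st_mtim.tv_sec &&       /* step 2 */
        ss.st_mtim.tv_nsec == ds.st_mtim.tv_nsec &&
        ss.st_mode == ds.st_mode &&
        ss.st_uid == ds.st_uid && ss.st_gid == ds.st_gid) {
        written = 0;                                    /* nothing changed */
        goto out;
    }
    written = 0;
    for (off_t off = 0; off < ss.st_size; off += BLOCK_SIZE) {
        ssize_t n = read_full(src, a, BLOCK_SIZE, off); /* step 3 */
        ssize_t m = read_full(dst, b, BLOCK_SIZE, off);
        if (n <= 0 || n != m) { written = -1; goto out; }
        if (memcmp(a, b, (size_t)n) != 0) {             /* step 4 */
            if (write_full(dst, a, (size_t)n, off) != 0) { written = -1; goto out; }
            written++;
        }
    }
    /* Step 5: copy owner/group, permissions, then time stamps (last, so the
     * modification time is not disturbed by the writes above). */
    times[0] = ss.st_atim;
    times[1] = ss.st_mtim;
    if ((ss.st_uid != ds.st_uid || ss.st_gid != ds.st_gid) &&
        fchown(dst, ss.st_uid, ss.st_gid) != 0)
        written = -1;
    else if (fchmod(dst, ss.st_mode & 07777) != 0 || futimens(dst, times) != 0)
        written = -1;
out:
    if (src >= 0) close(src);
    if (dst >= 0) close(dst);
    return written;
}
```

A caller would invoke `sync_changed_blocks("vm.img", "/backup/vm.img")` (both paths hypothetical) and treat a negative result as failure. Because unchanged blocks are only read and compared, never rewritten, the destination file system sees writes only for blocks whose contents actually differ.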