Large file synchronization

Efficiently synchronize copies of a large sparse file locally. I deal with a large number of large sparse files because of virtualization and other technologies. Because of their size, only a small proportion of their blocks actually contain data and, of those, only a few change and need to be backed up. Using a log-based (snapshotting) file system on USB 2 as the backup device, I only want to write blocks when absolutely necessary.

So what's the solution? Some simple custom code (sketched below) that

  1. checks that both file sizes are identical;
  2. verifies that some metadata has changed (e.g. time stamp, permissions or owner/group);
  3. reads both files block-by-block;
  4. writes only changed blocks to the destination file, and
  5. updates any changed metadata.
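A rough sketch of that approach in plain shell, assuming the source and destination paths are passed as arguments (the variable names, the 1 MiB block size and the choice of dd/cmp are illustrative, not the post's actual code):

    #!/bin/sh
    # Block-wise update: rewrite only the blocks of DST that differ from SRC.
    set -eu

    SRC=$1
    DST=$2
    BS=$((1024 * 1024))                 # comparison/write block size

    # 1. both files must already be the same size
    src_size=$(wc -c < "$SRC")
    dst_size=$(wc -c < "$DST")
    [ "$src_size" -eq "$dst_size" ] || { echo "sizes differ" >&2; exit 1; }

    # 2. skip the whole pass if the source is not newer than the backup
    #    (-nt is a widely supported extension to test(1))
    [ "$SRC" -nt "$DST" ] || exit 0

    blocks=$(( (src_size + BS - 1) / BS ))
    blk_src=$(mktemp); blk_dst=$(mktemp)
    trap 'rm -f "$blk_src" "$blk_dst"' EXIT

    # 3./4. read both files block by block, rewriting only blocks that differ
    i=0
    while [ "$i" -lt "$blocks" ]; do
        dd if="$SRC" of="$blk_src" bs="$BS" skip="$i" count=1 2>/dev/null
        dd if="$DST" of="$blk_dst" bs="$BS" skip="$i" count=1 2>/dev/null
        if ! cmp -s "$blk_src" "$blk_dst"; then
            dd if="$SRC" of="$DST" bs="$BS" skip="$i" seek="$i" count=1 \
               conv=notrunc 2>/dev/null
        fi
        i=$((i + 1))
    done

    # 5. carry metadata changes across (timestamp shown; chmod/chown are analogous)
    touch -r "$SRC" "$DST"

Writing with conv=notrunc is the important detail: dd touches only the targeted block, so unchanged blocks of the destination are never rewritten and the snapshotting file system on the backup device sees the minimum of new data.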

Apache2 fails to start on boot (but subsequent manual starts work fine)

I migrated a virtual machine running the Apache 2 web server on Debian wheezy GNU/Linux from Bytemark's legacy virtual machine infrastructure to their new BigV infrastructure. Staging this on my internal virtual infrastructure worked without a hitch. On the real infrastructure, however, Apache failed to start on boot with the following message, yet it would start without issue if I then logged on and started it manually.


Concurrency and synchronization in POSIX Bourne Shell (sh or bash)

It seems like ages ago now that I found my customer had a process that connected to hundreds of Oracle databases to run predefined SQL for health checks. These databases were hosted all over the world and the SQL could take up to fifteen minutes to complete for a single database (with huge numbers of TNS timeouts). The end result was a CSV file that was ultimately formatted into a spreadsheet to provide management information. It took about a day to obtain this final result.

I thought there was a better way.
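The excerpt stops before the fix, but the shape of the "better way" the title hints at is to fan the per-database checks out as background jobs and then synchronise on their completion. The sketch below is purely illustrative: run_check, MAX_JOBS, databases.txt and the results/ directory are placeholder names, not the actual script.

    #!/bin/sh
    # Run the per-database health checks several at a time instead of strictly
    # one after another, then stitch the per-database CSV fragments together.
    MAX_JOBS=10

    run_check() {
        # stand-in for "connect to $1 and run the predefined health-check SQL";
        # each job writes its own file so CSV lines cannot interleave
        echo "$1,ok" > "results/$1.csv"
    }

    mkdir -p results
    n=0
    while read -r db; do
        run_check "$db" &          # this database's check becomes a background job
        n=$((n + 1))
        if [ "$n" -ge "$MAX_JOBS" ]; then
            wait                   # synchronise: let the current batch finish
            n=0
        fi
    done < databases.txt
    wait                           # and the final, partial batch

    cat results/*.csv > healthcheck.csv

Waiting in fixed-size batches keeps the loop portable to plain sh, at the cost of one slow database stalling its batch; on bash 4.3 or later, wait -n releases a slot as soon as any job finishes and so gives a finer-grained throttle.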
