GNU/Linux | www.ewan.cc

Large file synchronization

By Ewan

September 15, 2013

Efficiently synchronize copies of a large sparse file locally. I deal with a large number of large sparse files because of virtualization and other technologies. Although the files are large, often only a small number of blocks hold data and, of these, only a few change and need to be backed up. Because my backup device is a log-based (snapshotting) file system on USB 2, I only want to write blocks if absolutely necessary.

So what's the solution? Some simple custom code that

checks that both file sizes are identical;
verifies that some metadata has changed (i.e. time stamp, permissions or owner/group);
reads both files block-by-block;
writes only changed blocks to the destination file, and
updates any changed metadata.
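
The steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the custom code the post refers to: the function names `metadata_differs` and `sync_blocks` and the 4096-byte block size are assumptions made for the example.

```python
import os
import stat

BLOCK_SIZE = 4096  # assumed block size; the post does not state one


def metadata_differs(src_stat, dst_stat):
    """Return True if time stamp, permissions or owner/group differ."""
    return (
        src_stat.st_mtime_ns != dst_stat.st_mtime_ns
        or src_stat.st_mode != dst_stat.st_mode
        or src_stat.st_uid != dst_stat.st_uid
        or src_stat.st_gid != dst_stat.st_gid
    )


def sync_blocks(src_path, dst_path, block_size=BLOCK_SIZE):
    """Write only the differing blocks; return how many blocks were written."""
    src_stat = os.stat(src_path)
    dst_stat = os.stat(dst_path)

    # Step 1: both files must be the same size.
    if src_stat.st_size != dst_stat.st_size:
        raise ValueError("source and destination sizes differ")

    # Step 2: unchanged metadata is taken to mean there is nothing to do.
    if not metadata_differs(src_stat, dst_stat):
        return 0

    written = 0
    offset = 0
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        while True:
            # Step 3: read both files block by block.
            src_block = src.read(block_size)
            if not src_block:
                break
            dst_block = dst.read(block_size)
            # Step 4: write a block only if its content changed.
            if src_block != dst_block:
                dst.seek(offset)
                dst.write(src_block)
                written += 1
            offset += len(src_block)

    # Step 5: update metadata. Ownership goes first because chown can
    # clear setuid/setgid bits; the time stamp goes last.
    try:
        os.chown(dst_path, src_stat.st_uid, src_stat.st_gid)
    except PermissionError:
        pass  # changing the owner usually needs root
    os.chmod(dst_path, stat.S_IMODE(src_stat.st_mode))
    os.utime(dst_path, ns=(src_stat.st_atime_ns, src_stat.st_mtime_ns))
    return written
```

Because unchanged blocks are never rewritten, holes in the destination file stay holes, and a snapshotting file system only has to record the blocks that really changed.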

Classifications

GNU/Linux

Apache2 fails to start on boot (but subsequent manual starts work fine)

By Ewan

February 25, 2015

I migrated a virtual machine running the Apache 2 web server software on Debian wheezy GNU/Linux from Bytemark's legacy virtual machine infrastructure to their new BigV infrastructure. Staging this on my internal virtual infrastructure worked without a hitch. However, on the true infrastructure, Apache failed to start on boot with the following message. It would restart without issue if I then logged on and started it manually.

Classifications

GNU/Linux

MapReduce introduction (Linux Journal)

By Ewan

June 7, 2013

Linux Journal has published an easy-to-understand introduction to MapReduce using Hadoop. Definitely worth a read if you need an introduction.

Classifications

GNU/Linux

Big Data

Concurrency and synchronization in POSIX Bourne Shell (sh or bash)

By Ewan

March 24, 2013

It seems like ages ago now that I found my customer had a process that connected to hundreds of Oracle databases to run predefined SQL for health checks. These databases were hosted all over the world and the SQL could take up to fifteen minutes to complete for a single database (with a huge number of TNS timeouts). The end result was a CSV file that was ultimately formatted into a spreadsheet to provide management information. It took about a day to obtain this final result.

I thought there was a better way.

Classifications

GNU/Linux

Shell

Netbook operating system choice

By Ewan

December 14, 2008

The Economist has an interesting article on the decisions you need to make when choosing a netbook computer. It suggests that you use the bundled, tuned operating system based on the Linux kernel.

Classifications

GNU/Linux

Mobile

Microsoft Windows
