It seems like ages ago now that I found my customer had a process that connected to hundreds of Oracle databases to run predefined SQL for health checks. These databases were hosted all over the world, and the SQL could take up to fifteen minutes to complete for a single database (with a large number of TNS timeouts). The end result was a CSV file that was ultimately formatted into a spreadsheet to provide management information. It took about a day to obtain this final result.
I thought there was a better way.
Surely the time spent waiting for a slow or distant database could be used for running commands on another database? My solution was to partition the database list into eight groups and process these groups concurrently. Now I just needed some synchronisation, as the password system was not thread-safe. Looking for a method of using a mutex in a shell script, I stumbled across this article amongst others.
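The partitioning idea can be sketched roughly as follows. This is only an illustration, not the customer's process: the database names db1 to db20, the NGROUPS and WORKDIR variables and the check_database function are all placeholders standing in for the real list and the real health check.

```shell
#!/bin/sh
# Sketch only: NGROUPS, WORKDIR, db1..db20 and check_database are placeholders.
NGROUPS=8
WORKDIR=${TMPDIR:-/tmp}/partition-demo.$$
mkdir "$WORKDIR" || exit 1

# Placeholder for the real per-database health check.
check_database() {
    echo "$1 ok"
}

# Deal the database names round-robin into NGROUPS list files.
n=1
while [ "$n" -le 20 ]; do
    echo "db$n" >>"$WORKDIR/group.$(( n % NGROUPS ))"
    n=$(( n + 1 ))
done

# Process each group in its own background subshell, so that a slow
# database only holds up the other databases in the same group.
for list in "$WORKDIR"/group.*; do
    (
        while read -r db; do
            check_database "$db"
        done <"$list" >"$WORKDIR/out.${list##*.}"
    ) &
done

# Block until every group has finished, then combine the results.
wait
total=$(( $(cat "$WORKDIR"/out.* | wc -l) ))
echo "checked $total databases in $NGROUPS groups"
rm -rf "$WORKDIR"
```

Each group writes to its own output file, so the groups need no locking between them; a mutex is only needed for a resource they share.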
Putting this together, here is an example script that creates multiple 'threads', each incrementing a protected counter variable.
#!/bin/sh
#
# Use a mutex to serialize access to a variable. Test this by using multiple
# processes mutating this variable. Synchronize all subprocesses and exit after
# a defined number of iterations.

THREADS=20
COUNTFILE=counter
RUNTIME=5    # number of iterations per process

mutex_acquire() {
    # $1 = mutex name (default: lock)
    # $2 = miss sleep time in seconds (default: 0.01)
    # Note: the 0.01s default needs a sleep that accepts fractional seconds
    # and a unit suffix (e.g. GNU coreutils).
    if [ "$1" = "" ]; then lock=.lock; else lock=.$1.lock; fi
    if [ "$2" = "" ]; then sleep=0.01s; else sleep=$2; fi
    locked=false
    while [ $locked = "false" ]; do
        # mkdir fails if the directory already exists, so only one
        # process at a time can create the lock.
        mkdir $lock 2>/dev/null
        rc=$?
        if [ $rc -ne 0 ]; then sleep $sleep
        else locked=true; fi
    done
}

mutex_release() {
    # $1 = mutex name (default: lock)
    if [ "$1" = "" ]; then lock=.lock; else lock=.$1.lock; fi
    rmdir $lock
}

process() {
    echo T$1 starting
    count=1
    while [ $count -le $RUNTIME ]; do
        mutex_acquire
        if [ ! -f $COUNTFILE ]; then echo 1 >$COUNTFILE
        else
            countval=`cat $COUNTFILE`
            echo `expr $countval + 1` >$COUNTFILE
        fi
        echo T$1 counter=`cat $COUNTFILE`
        mutex_release
        count=`expr $count + 1`
    done
    echo T$1 exiting
}

# Start the subprocesses in the background.
thread=1
while [ $thread -le $THREADS ]; do
    process $thread &
    thread=`expr $thread + 1`
done

# Wait for all subprocesses to finish, then clean up.
wait
mutex_acquire
rm $COUNTFILE
mutex_release
More recently I read an article in Linux Journal that discussed using signals to limit the processing time of multiple processes in a shell script. This was thought-provoking, but digging deeper it became apparent that many race conditions had not been solved (e.g. killing a process that has already died could actually kill a different process that has since been given the same PID). Turning this on its head, surely it is better to have the top-level shell control the execution of the subshells rather than the other way around?
Putting this together, the earlier script is enhanced to exit after a set amount of time.
#!/bin/sh
#
# Use a mutex to serialize access to a variable. Test this by using multiple
# processes mutating this variable. Synchronize all subprocesses and exit after
# a specified period of time.

THREADS=20
COUNTFILE=counter
RUNTIME=5    # seconds to run before the subprocesses are stopped

mutex_acquire() {
    # $1 = mutex name (default: lock)
    # $2 = miss sleep time in seconds (default: 0.01)
    # Note: the 0.01s default needs a sleep that accepts fractional seconds
    # and a unit suffix (e.g. GNU coreutils).
    if [ "$1" = "" ]; then lock=.lock; else lock=.$1.lock; fi
    if [ "$2" = "" ]; then sleep=0.01s; else sleep=$2; fi
    locked=false
    while [ $locked = "false" ]; do
        # mkdir fails if the directory already exists, so only one
        # process at a time can create the lock.
        mkdir $lock 2>/dev/null
        rc=$?
        if [ $rc -ne 0 ]; then sleep $sleep
        else locked=true; fi
    done
}

mutex_release() {
    # $1 = mutex name (default: lock)
    if [ "$1" = "" ]; then lock=.lock; else lock=.$1.lock; fi
    rmdir $lock
}

process_start() {
    thread=$1
    # Exit cleanly when the top-level shell sends SIGALRM.
    trap 'process_stop $thread' ALRM
    echo T$thread starting
    count=1
    while [ 0 -eq 0 ]; do
        mutex_acquire
        if [ ! -f $COUNTFILE ]; then echo 1 >$COUNTFILE
        else
            countval=`cat $COUNTFILE`
            echo `expr $countval + 1` >$COUNTFILE
        fi
        echo T$thread counter=`cat $COUNTFILE`
        mutex_release
        count=`expr $count + 1`
    done
}

process_stop() {
    echo T$1 exiting
    exit 0
}

# Start the subprocesses in the background, remembering their PIDs.
thread=1
while [ $thread -le $THREADS ]; do
    process_start $thread &
    subpids="$! $subpids"
    thread=`expr $thread + 1`
done

sleep $RUNTIME

# Hold the mutex while signalling so that no subprocess is stopped while it
# owns the lock.
mutex_acquire
kill -ALRM $subpids
wait
mutex_release

mutex_acquire
rm $COUNTFILE
mutex_release
Note that the above scripts were developed using dash and tested using bash.
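The same mkdir-based mutex can also guard a single command that must not run concurrently, such as the password lookup mentioned at the start. The following is a minimal sketch under stated assumptions: with_mutex and get_password are hypothetical names invented for this example, get_password merely echoes a string in place of the real lookup, and the lock lives in a temporary WORKDIR rather than the current directory.

```shell
#!/bin/sh
# Sketch only: with_mutex, get_password and WORKDIR are hypothetical.
WORKDIR=${TMPDIR:-/tmp}/mutex-demo.$$
mkdir "$WORKDIR" || exit 1

# Spin until the lock directory can be created.
mutex_acquire() {
    until mkdir "$WORKDIR/.$1.lock" 2>/dev/null; do
        sleep 1
    done
}

mutex_release() {
    rmdir "$WORKDIR/.$1.lock"
}

# Run a command while holding the named mutex and pass on its exit status.
with_mutex() {
    mutex_name=$1
    shift
    mutex_acquire "$mutex_name"
    "$@"
    rc=$?
    mutex_release "$mutex_name"
    return $rc
}

# Stand-in for a lookup that is not safe to run concurrently.
get_password() {
    echo "password-for-$1"
}

# Four concurrent workers, each serialised through the same mutex.
for db in db1 db2 db3 db4; do
    with_mutex password get_password "$db" >"$WORKDIR/$db.pw" &
done
wait

count=$(( $(cat "$WORKDIR"/*.pw | wc -l) ))
echo "fetched $count passwords"
rm -rf "$WORKDIR"
```

Wrapping the acquire and release in one helper keeps them paired, so a worker cannot forget to release the lock after the guarded command returns. It does not cover a worker that is killed while holding the lock.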