Dev tricks: Asynchronous job control in bash scripts

Run multiple background jobs in parallel with a few handy builtins and operators

Tags: Dev tricks


In my personal projects, I like to do as much as I can from scratch while keeping my dependencies light. This forces me to learn how things work under the hood and saves me the pain of seeking out new libraries every time one that I rely on is abandoned or superseded. One of the skills I’ve picked up thanks to this habit is shell scripting.

I like writing shell scripts. They may feel arcane compared to modern scripting languages, with their legible syntaxes and seemingly infinite public package repositories. Yet sometimes I find it more convenient to call a few CLI commands in a bash script than to wade through a tangled web of plugins. As a bonus, each time I write one of these scripts I learn at least one new bash feature or bit of syntax. Let me show you an example of what I mean.

The problem

For a recent project, I wanted to be able to stand up a local testing environment with a single command. The three steps of the bootstrapping process were, in no particular order:

  • Start up the Nginx proxy server to listen on a localhost port
  • Start up the Starlette API server to listen on a localhost port
  • Run a custom bash script to watch for changes to static site pages and copy updated files to the proxy server’s site directory

The industry-standard approach to this problem involves selecting an all-in-one framework like NextJS, which would provide the dev server and continuous file recompilation, combined with a Docker container to run the API server. Both of those would take more time to set up than I cared to invest. And just think of how bloated the dependency graph for this tiny project would become!

I know how to execute each of these three steps in the terminal already, so I ought to be able to write a bash script that does this for me, right?

The solution

The formula I needed combines the “wait” builtin with a couple of bits of bash syntax that I was unaware of: “&” and “$!”. Let’s break down each part.

Running tasks in the background

It’s a half-truth to say I didn’t know about “&”. I use it all the time in the terminal to start tasks in the background. In the past, I’d tried to write scripts using “&”, but I couldn’t get it to work correctly. Here’s what I was doing:

#INCORRECT! "&;" doesn't mean anything!
command_to_run_in_background.sh &;

#...next command

What I learned this time is that “;” and “&” are actually siblings! They belong to a set of special tokens called “control operators”. The Bash Manual page for lists of commands has this to say about &:

If a command is terminated by the control operator ‘&’, the shell executes the command asynchronously in a subshell. This is known as executing the command in the background, and these are referred to as asynchronous commands. The shell does not wait for the command to finish, and the return status is 0 (true). When job control is not active (see Job Control), the standard input for asynchronous commands, in the absence of any explicit redirections, is redirected from /dev/null.
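That last detail about /dev/null is easy to see for yourself. Here’s a small sketch (my own illustration, not from the manual): a backgrounded “read” hits end-of-file immediately instead of hanging, because its standard input was silently redirected.

```shell
#!/usr/bin/env bash

# In a script, job control is inactive, so a background command's stdin
# comes from /dev/null. The backgrounded "read" below therefore sees EOF
# right away rather than blocking for keyboard input.
RESULT=$( { if read -r line; then echo "got-input"; else echo "eof"; fi; } & wait )
echo "$RESULT"
```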

All I really needed to do was remove the “;” and let “&” terminate the command, like this:

command_to_run_in_background.sh &

#You can even write multiple statements like this:

first_bg_command & second_bg_command &
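Here’s a small sketch of that non-blocking behavior, with “sleep” standing in for a real long-running command:

```shell
#!/usr/bin/env bash

# Reset bash's built-in seconds counter so we can time the script.
SECONDS=0

sleep 2 &                  # runs asynchronously in a subshell
echo "script keeps going"  # executes immediately, without waiting

ELAPSED=$SECONDS           # still 0: we never blocked on the sleep
wait                       # now block until the background job finishes
```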

Capturing process IDs

Every running process has a process ID (PID) that you can reference. Most commonly, a PID is used to kill a hanging or misbehaving process, but it has other uses too! In a script, we can’t rely on “ps” to find the PID, but we can capture it with the special parameter “$!”, which expands to the PID of the most recent background command. Let’s check the Bash Manual page on special parameters and find “$!”:

Expands to the process ID of the job most recently placed into the background, whether executed as an asynchronous command or using the bg builtin.

Now we can easily get the PID of a command like this:

command.sh &
COMMAND_PID=$!
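As a quick sanity check — a hedged sketch, with “sleep” standing in for a real job — “kill -0” probes a PID without delivering any signal, which confirms that “$!” captured a live process:

```shell
#!/usr/bin/env bash

# Stand-in for a long-running background task
sleep 30 &
COMMAND_PID=$!

# "kill -0" sends no signal; it only checks that the process exists
if kill -0 "$COMMAND_PID"; then
    RUNNING=yes
fi

# The captured PID also lets us stop the job when we're done with it
kill "$COMMAND_PID"
```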

Waiting on multiple PIDs

Here’s the fun part. Using the “wait” bash builtin, we can keep our script in the foreground until the processes we’ve started exit. The Bash Manual page on job control builtins describes how “wait” works. The documentation is a bit verbose, so I’ll just skip to the syntax. It’s quite simple:

wait $COMMAND_ONE_PID $COMMAND_TWO_PID;
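Here’s a minimal sketch of “wait” in action, with two sleeps of different lengths standing in for real jobs; the script resumes only after the slower one exits:

```shell
#!/usr/bin/env bash

SECONDS=0

sleep 1 &
COMMAND_ONE_PID=$!
sleep 2 &
COMMAND_TWO_PID=$!

wait $COMMAND_ONE_PID $COMMAND_TWO_PID
ELAPSED=$SECONDS   # roughly 2: wait blocked until the slower job exited
```

One detail worth knowing: when “wait” is given multiple PIDs, its own exit status is the exit status of the last PID in the list.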

Trapping signals

One problem I ran into was waiting on a long-running shell script that I had written. No matter how many times I pressed “CTRL+C”, the script kept running. I realized I needed to use “trap”, another builtin command, to teach my script how to handle incoming signals. Once again, the syntax is pretty simple:

trap 'exit 0;' SIGHUP SIGINT SIGTERM

The first argument passed to “trap” is the statement or statements to execute when a signal is trapped, and the subsequent arguments are the signals that should be trapped. You can use multiple “trap” statements to vary the behavior based on specific signals. One caveat: SIGKILL (and SIGSTOP) can never be trapped, so don’t bother listing them; the kernel delivers those without giving the process a chance to react.
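Here’s a self-contained sketch of both ideas: one handler shared by several signals, a second trap for a different signal, and a signal sent to the script itself (“$$” is its own PID) to show the handler firing. SIGKILL is deliberately absent, since it can never be caught.

```shell
#!/usr/bin/env bash

CLEANED=no
cleanup() {
    CLEANED=yes            # stand-in for real teardown work
}

# One handler shared by hang-up and termination signals...
trap 'cleanup' SIGHUP SIGTERM
# ...and a different behavior for CTRL+C
trap 'cleanup; exit 130' SIGINT

# Send ourselves SIGTERM to demonstrate the trap; without the trap,
# this would terminate the script instead of running cleanup
kill -s SIGTERM $$
echo "still running, CLEANED=$CLEANED"
```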

Putting it all together

Here is a paraphrased version of the dev server start-up script I wrote:

#!/usr/bin/bash

export PROJECT_ROOT;

"$PROJECT_ROOT"/api_server/bin/dev.sh &
API_PID=$!

"$PROJECT_ROOT"/static_site/bin/dev.sh &
STATIC_SITE_PID=$!

"$PROJECT_ROOT"/proxy_server/bin/dev.sh &
PROXY_PID=$!

wait $API_PID $STATIC_SITE_PID $PROXY_PID;

And here are the contents of “$PROJECT_ROOT/static_site/bin/dev.sh”:

#!/usr/bin/bash

STATIC_SITE_ROOT=$PROJECT_ROOT/static_site

#Exit on common interrupt signals

trap 'exit 0;' SIGHUP SIGINT SIGTERM

# Implement a rudimentary file watcher using diff in a polling loop
while true; do
    # If any of the static files have changed
    if ! diff -qr "$STATIC_SITE_ROOT/src" /srv/static_site; then
        # Sync the static files with the directory served by the proxy dev server
        rsync -vr --del "$STATIC_SITE_ROOT/src/" /srv/static_site/;
    fi
    # Don't abuse the CPU
    sleep 1;
done

Worth the “wait”

No doubt, there are easier and more maintainable ways to orchestrate multi-step builds and development environments. But the time I took to figure this out gave me some great insight into the semantics of bash that might well come in handy in the future.


Have questions or comments about this blog post? You can share your thoughts with me via email at blog@matthewcardarelli.com, or you can join the conversation on LinkedIn.