Shell
“If you hold a UNIX shell to your ear, you might just hear the C.”
Introduction
This text assumes that you’ve written shell scripts before, or at least used the interactive shell, as part of NSWI177 Introduction to Linux. We don’t discuss elementary things such as syntax and instead focus on practical problems encountered when writing real-world shell scripts.
If you need a refresher on the basics, take a look at An Introduction to the Unix Shell.
Terminology
Fun fact: the name “shell” is a pun: in analogy with a nut, it’s what stands between you (the user) and the kernel of the operating system. In other words, to get to the kernel, you need to crack the shell!
Depending on the context, “shell” can have one of several closely-related meanings:
- a shell interpreter such as dash, Zsh or Bash;
- the language shell scripts are written in;
- a shell process (e.g., PID 12345 executing /bin/zsh).
Shell, the interpreter
Just like Python scripts are interpreted with the Python interpreter, shell scripts are interpreted with a shell interpreter, usually called just “the shell”. There is, however, a key difference:
Python (3.x) is a single well-defined language, and there’s a canonical interpreter for it (the official one). But there are many popular shell interpreters, such as:
- Bourne-again shell (Bash)
- Z shell (Zsh)
- Debian Almquist shell (Dash)
They differ from one another in the:
- input language they accept (lexically and syntactically);
- semantics of shell programs;
- set of supported features.
Some of the differences are fundamental (one shell lacks a feature another shell supports), while others are very subtle (the features are nearly identical, save for one edge case). Consequently, shell scripts that work perfectly under one interpreter can misbehave under another.
The reason for this is largely historical; while Python is a modern language which appeared in 1991, the original UNIX shell, the Thompson shell, appeared in 1971. It was improved upon, extended and modified, giving rise to many slightly incompatible variations which inspired the modern shells such as Zsh and Bash which we use today.
Shell, the language
The various shell interpreters give rise to various shell dialects. You might say a script is “written in Bash” if you want to emphasize that it relies on features specific to Bash.
Thankfully, most of the shells used in practice today agree upon a set of core concepts. The POSIX family of standards describes a standard for the Shell Command Language. In terms of feature support, picture a Venn diagram: the POSIX Shell Command Language sits in the intersection, and each shell adds its own extensions around it.
Shell script portability
To make matters worse, shell scripts are rarely self-contained: to achieve even the simplest tasks, they often call upon external programs.
For example, to search for files you’d use find(1), a binary program usually
found in /usr/bin/find
. Without these programs, called commands, the
shell wouldn’t be very useful.
For a given command, there are often multiple implementations available:
- coreutils and busybox are two implementations of many of the foundational UNIX commands such as ls(1), cp(1) or rm(1). While coreutils aims to be complete and feature-rich, busybox aims for the smallest possible binary size and is pretty much bare-bones.
- GNU netcat and BSD netcat are two implementations of nc(1), a network diagnostic utility. Looking at the linked man pages, it’s easy to see that the options are different and incompatible. On most systems, either the GNU or the BSD version is available under the command name nc.
- The commands evolve over time: features are added, and sometimes deprecated or removed. For example, git clone supports the --shallow-since option since v2.11.0. Whether your shell scripts can use this option depends on the version of git installed on the system.
This raises concerns about the portability of shell scripts. A script is said to be portable when it can be used, unmodified, on a wide range of systems. That can be quite challenging to achieve, given how tightly shell scripts are tied to their environment.
While writing a portable shell script can be very difficult, it’s fairly easy to avoid many mistakes which make shell scripts non-portable for no good reason. In a nutshell:
- Use /bin/sh as your shell interpreter. On most systems, this will either be a dedicated POSIX shell (e.g. Dash) or Bash in POSIX compatibility mode (see --posix in bash(1)). This will help you only rely on those shell features which are broadly available and fairly well-defined.
- Use ShellCheck. When you specify /bin/sh as your interpreter, it will also warn you when you rely on non-POSIX features or behavior, in addition to spotting many common scripting errors.
- Don’t rely on exotic, deprecated, undocumented and very recently added features of the commands you use. Or, if you really want to use them, check for their presence (for example, by looking at the output of git --version) and implement a fallback; see the sketch below.
This way, you get reasonable portability with very little effort.
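For instance, here is a minimal sketch of the check-and-fallback idea for the git clone --shallow-since option mentioned above (the version parsing is deliberately simplistic, and the shallow-clone date is just an example):

#!/bin/sh
set -eu

repo=$1; shift

# Extract "major.minor" from e.g. "git version 2.39.2".
version=$(git --version | grep -Eo '[0-9]+\.[0-9]+' | head -n1)
major=${version%%.*}
minor=${version#*.}

if [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 11 ]; }; then
    # git is new enough (>= 2.11.0) to support --shallow-since.
    git clone --shallow-since="1 year ago" -- "$repo"
else
    # Fall back to a plain (full) clone on older versions.
    git clone -- "$repo"
fi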
Does portability matter in practice?
In short: yes. You can easily run into portability issues with:
- Containers, since they often run bare-bones Linux distros such as Alpine Linux, which ship a minimal shell and utilities (BusyBox ash, Dash) by default to conserve space and memory.
- Other popular Unix-like systems, mainly Macs, which are stuck on Bash 3.2.57 (released 2006!) due to licensing constraints, and which ship the BSD variants of many utilities such as sed(1) or cut(1).
Following the portability suggestions above will save you a lot of trouble.
Choose your battles
The most important lesson when it comes to shell scripting is to choose your battles. Just don’t write shell scripts when C or Python are a better fit.
Shell is a great fit when your program can be expressed as a sequential combination of other programs. In other words, when the shell doesn’t do much besides serially executing other programs and plumbing them together.
Shell is not a good fit when:
- the program needs to perform lots of parallel processing;
- sophisticated error handling is required (shell scripts usually abort on any error);
- the program needs to perform math or very complex text or data processing; or
- performance matters.
While you can often work around the above limitations, if you need any of the above, it’s usually a good idea to write your program in a high-level programming language from the get-go instead. On the other hand: it’s not unheard of for a 20 line shell script to be replaced by an equivalent program spanning 500 lines of C.
Using the right tool for the job is arguably one of the most important skills not just when it comes to shell scripting, but in programming overall.
High-level design of shell scripts
Let’s look at the high-level structure of example shell programs first. Please note that this is just one way to structure your scripts, albeit a recommended one.
Example #1: gitdo
1 #!/bin/sh
2 set -eu
3
4 usage() {
5 cat <<EOF
6 Usage: gitdo [-gv] [--] COMMAND... [--]
7 gitdo -h
8
9 Execute COMMAND on all tracked files in a Git repository.
10 The file names will be provided as command line arguments to COMMAND.
11
12 Options:
13 -g Operate on the entire repository, not just the current subtree
14 -h Print this message and exit
15 -v Be verbose
16 EOF
17 }
18
19 opt_global=false
20
21 while getopts "ghv" opt; do
22 case $opt in
23 g) opt_global=true;;
24 d) echo="echo";;
25 h) usage; exit 0;;
26 v) set -x;;
27     *) usage >&2; exit 1;;
28 esac
29 done
30
31 shift $((OPTIND-1))
32
33 root=$(git rev-parse --show-toplevel 2>&1) || {
34 printf "gitdo: not within a Git repo: %s\n" "$PWD"
35 exit 1
36 } >&2
37
38 dir=.
39 $opt_global && dir=$root
40
41 git ls-files -z --exclude-standard -- "$dir" |
42 xargs -0r "$@"
Since this is the first example, let’s break it down and discuss it piece by piece.
Hashbang
1 #!/bin/sh
This line is usually called a hashbang, a shebang or an interpreter directive. When we execute the script (e.g., ./gitdo), it’s not the script that gets executed; rather, the interpreter (/bin/sh) is executed and is given the name of the script to interpret (./gitdo) as the first argument.
Perhaps surprisingly, this logic is implemented in the kernel directly. When you execute the script, the name of the script is passed as an argument to execve(2). The kernel then needs to decide what type of executable the named file is (whether it’s an ELF binary, a script or some other type of supported executable file format).
When the file starts with #!, it’s treated as a script. If you’re interested in the details, here are some pointers: fs/exec.c, fs/binfmt_script.c.
By specifying /bin/sh as your interpreter, you clearly ask for a POSIX shell. If your script does in fact require Bash or Zsh to function properly, you should of course specify that shell instead.
From Python, you may be used to the following instead:
#!/usr/bin/env python3
There are two reasons to use /usr/bin/env in the hashbang:
- The path to the interpreter (python3) must be absolute, but it’s not the same on all systems where the script is run. Since env performs $PATH lookup, it will resolve the command name at runtime, just like the shell does.
- You want to allow the user to override the system-wide Python interpreter. For example, when PATH=~/bin:/bin:/usr/bin and there is an executable called python3 in ~/bin, this interpreter will be used.
This form (chain-loading through env) isn’t normally needed with shell scripts, because the path is usually the same everywhere (/bin/sh) and users don’t really bring their own shell. That is, unless you use a Mac with an ancient Bash—then you probably want to override the system one with a newer version from Homebrew or MacPorts.
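If your script does target Bash and you want the user’s preferred Bash (say, a newer one from Homebrew earlier in PATH) to take precedence, the same env trick works for shell scripts too:

#!/usr/bin/env bash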
set -eu
This is arguably the most important line in the entire script:
2 set -eu
By default, shell error handling is extremely benevolent. Notably:
-
When a command fails, evaluation continues with the next command. In other words, the interpreter doesn’t stop on errors by default.
-
Referencing an unassigned variable is not considered an error. Instead, the variable expands to the empty string.
Especially when combined with other peculiarities of the shell, this makes for a lethal cocktail. Consider the following trivial script:
#!/bin/bash
cd $dir
rm -rf *
This script has more problems than lines of code:
- It targets Bash for no good reason. This is all too common. Shell ≠ Bash. Use #!/bin/sh where POSIX is good enough.
- If dir is unset, $dir expands to the empty string. Because set -u isn’t in effect, this isn’t an error. Because the expansion isn’t quoted, cd will be executed without arguments. Without arguments, cd will change to $HOME.
- Consequently, if dir is unset, you will delete the contents of your $HOME rather than the contents of $dir. Not great.
- If dir is set to /some/path and /some/path does not exist, the cd will fail and the current working directory will remain unchanged. But because set -e is not in effect, execution will continue and the rm will delete the contents of the current working directory. Also not great.
- The invocation of rm is missing the -- option terminator. If there’s a file whose name starts with -, it will be understood as an option to rm (and may or may not be deleted itself, depending on what the option does).
Here’s a better version:
#!/bin/sh
set -eu
cd "$dir"
rm -rf -- *
If there’s one thing to remember, it’s to always set -eu.
usage()
The so-called usage message is a tiny bit of documentation embedded in the source code of a (UNIX) program. Think of it as a mini man page which just lists the valid ways to invoke the script and the available options:
4 usage() {
5 cat <<EOF
6 Usage: gitdo [-gv] [--] COMMAND... [--]
7 gitdo -h
8
9 Execute COMMAND on all tracked files in a Git repository.
10 The file names will be provided as command line arguments to COMMAND.
11
12 Options:
13 -g Operate on the entire repository, not just the current subtree
14 -h Print this message and exit
15 -v Be verbose
16 EOF
17 }
The <<EOF marks the beginning of a so-called here-document. The document starts on the next logical line (line 6) and stops just before a line containing only the end marker EOF (line 16). The end marker is an arbitrary shell word, but EOF (“end of file”) is often used. The content of the document is then supplied to cat as stdin. In other words, lines 5 through 16 just print the usage message to stdout.
A well-written usage message is very short yet informative.
getopts
The first real work the script does is option processing. By far the simplest way to process shell options is getopts, a POSIX shell utility:
19 opt_global=false
20
21 while getopts "ghv" opt; do
22 case $opt in
23 g) opt_global=true;;
24 d) echo="echo";;
25 h) usage; exit 0;;
26 v) set -x;;
27     *) usage >&2; exit 1;;
28 esac
29 done
30
31 shift $((OPTIND-1))
A well-behaved UNIX program prints usage to stdout with -h and then immediately exits with a zero (success) exit code. It also prints usage when invoked incorrectly, but then the usage message should go to stderr and the exit code should be non-zero (failure).
The *) case item is taken when an unknown option is specified, or when an option is missing a value; getopts will print the error to stderr and the usage will follow:
% gitdo -x
Illegal option -x
Usage: gitdo [-ghv] -- COMMAND...
Execute COMMAND on all (non-excluded) files within a Git repository.
The file names will be provided as command line arguments to COMMAND.
Options:
-g Operate on the entire repository, not just the current subtree.
-h Print this usage message and exit.
-v Be verbose
% echo $?
1
It’s good practice to also support -v (verbose), which makes the script log what it’s doing. Using set -x is a quick-and-dirty way to implement a basic verbose mode. It’s a good idea to support this option from the very beginning, since then the option letter won’t accidentally get used for anything else.
This script’s command-line interface is very simple: apart from -h and -v, it takes only one other option (-g), which accepts no value; and no real processing of the positional arguments takes place (they are passed through to xargs verbatim). We’ll see some more complex examples later.
Once options are processed, we shift the arguments so that $1 becomes the first non-option (positional) argument.
The actual program
Up until now, we have only dealt with the script’s interface. Now it’s time to implement whatever it is that the script does, and that’s a relatively small portion of the source code:
33 root=$(git rev-parse --show-toplevel 2>&1) || {
34 printf "gitdo: not within a Git repo: %s\n" "$PWD"
35 exit 1
36 } >&2
37
38 dir=.
39 $opt_global && dir=$root
40
41 git ls-files -z --exclude-standard -- "$dir" |
42 xargs -0r "$@"
This is very straightforward. The only non-POSIX parts (apart from git itself, which is non-negotiable) are the -z option we pass to git ls-files and the matching -0 and -r options we pass to xargs(1). They are however fairly widely supported and make the script more robust (file names containing whitespace are handled safely), so we consider it a fair trade.
The script also has just the right number of lines.
Example #2: snapback
The second example is a simple wrapper around Snap, a Btrfs snapshot manager. It allows you to quickly recover a particular version of a file.
1 #!/bin/sh
2 set -eu
3
4 opt_profile=root
5 opt_recover=false
6
7 usage() {
8 cat <<EOF
9 Usage: snapback [-v] [-p PROFILE] [FILE]
10 snapback -h
11
12 List all Snap backups of FILE for PROFILE.
13 If FILE is not given, list backups of all files in PROFILE.
14 If PROFILE is not set, it defaults to $opt_profile.
15
16 Options:
17
18 -h Print this message and exit
19 -p Snap profile to search backups in [$opt_profile]
20 -r Recover the most recent backup of FILE
21 -v set -x
22 EOF
23 }
24
25 while getopts "hp:rv" opt; do
26 case $opt in
27 h) usage; exit;;
28 p) opt_profile=$OPTARG;;
29 r) opt_recover=true;;
30 v) set -x;;
31 *) usage >&2; exit 1;;
32 esac
33 done
34 shift $((OPTIND-1))
35
36 file=$PWD
37 [ $# -gt 1 ] && {
38 printf >&2 "too many arguments\n"
39 usage >&2
40 exit 1
41 }
42 [ $# -eq 1 ] && {
43 file=$1
44 shift
45 }
46
47 files=$(snap -L "$file" -- "$opt_profile")
48 printf "%s\n" "$files"
49
50 "$opt_recover" && {
51 latest=$(printf "%s\n" "$files" \
52 | tail -n1 \
53 | cut -f4)
54 cp "$latest" .
55 }
Of note:
- The high-level structure of the program is exactly the same.
- The -p option accepts an argument via $OPTARG.
- The script accepts positional arguments and prints the usage message on incorrect use.
Example #3: passman
The third example is a very simple password manager.
1 #!/bin/sh
2 set -eu
3
4 store="$HOME/passwords"
5
6 usage() {
7 cat <<EOF
8 Usage: passman [-v] [-s STORE]
9 passman [-v] [-s STORE] -i [-cR] CREDENTIAL
10 passman [-v] [-s STORE] -r [-cR] CREDENTIAL
11 passman [-v] [-s STORE] -o [-c] CREDENTIAL
12 passman [-v] [-s STORE] -d CREDENTIAL
13 passman [-h]
14
15 Without arguments, list all passwords in STORE.
16 With one or more arguments, modify the contents of the store.
17
18 Options:
19 -i Insert CREDENTIAL
20 -r Replace CREDENTIAL
21 -o Output CREDENTIAL
22 -d Delete CREDENTIAL
23 -c Use clipboard for input (-ic, -rc) and output (-oc)
24 -R Use a random value for input (-iR, -rR)
25 -h Print this message and exit
26 -s DIR Use password store DIR [$store]
27 -v set -x
28 EOF
29 }
30
31 opt_clipboard=false
32 opt_delete=false
33 opt_insert=false
34 opt_output=false
35 opt_random=false
36 opt_replace=false
37
38 while getopts cdhliorRs:v opt; do
39 case $opt in
40 c) opt_clipboard=true;;
41 d) opt_delete=true;;
42 h) usage; exit;;
43 i) opt_insert=true;;
44 o) opt_output=true;;
45 r) opt_replace=true;;
46 R) opt_random=true;;
47 s) store=$OPTARG;;
48 v) set -x;;
49 *) usage >&2; exit 1;;
50 esac
51 done
52 shift $((OPTIND - 1))
53
54 err() {
55 exitcode=$1; shift
56 fmt="passman: $1"; shift
57 printf >&2 -- "$fmt" "$@"
58 exit "$exitcode"
59 }
60
61 [ -d "$store" ] ||
62 err 1 "Password store %s is not a directory.\n" "$store"
63
64 git -C "$store" rev-parse --show-toplevel >/dev/null 2>&1 ||
65 err 1 "Password store %s is not a Git repository.\n" "$store"
66
67 "$opt_insert" || "$opt_replace" || "$opt_output" || "$opt_delete" || {
68 [ $# -eq 0 ] || {
69 printf >&2 "Unexpected argument(s): %s...\n" "$1"
70 usage >&2;
71 exit 4
72 }
73 find "$store" -name '*.gpg' -printf "%P\n" | sed 's/\.gpg$//' | sort
74 exit
75 }
76
77 [ $# -eq 1 ] || {
78 printf "Missing CREDENTIAL\n"
79 usage
80 exit 1
81 } >&2
82
83 cred=$1; shift
84 cred_file="$store/$cred.gpg"
85
86 cleanup() {
87 rm -f -- "$cred_file.tmp"
88 }
89
90 trap cleanup EXIT
91
92 [ -f "$cred_file" ] || "$opt_insert" ||
93 err 2 "Credential %s does not exist in store %s.\n" "$cred" "$store"
94
95 [ -f "$cred_file" ] && "$opt_insert" &&
96 err 2 "Credential %s already exists in store %s.\n" "$cred" "$store"
97
98 "$opt_insert" || "$opt_replace" && {
99 cred_dir=$(dirname "$cred_file")
100 mkdir -p -- "$cred_dir"
101
102 recipients_file=$(git -C "$cred_dir" rev-parse --show-toplevel)/.recipients
103 [ -f "$recipients_file" ] ||
104 err 2 "missing recipients file %s\n" "$recipients_file"
105 recipients=$(xargs -n1 printf "-r %s\n" <"$recipients_file")
106
107 if "$opt_random"; then
108 head -c 24 /dev/random | base64
109 elif "$opt_clipboard"; then
110 xclip -r -o -selection clipboard 2>/dev/null ||
111 err 3 "Cannot copy from clipboard."
112 else
113 cat
114 fi | gpg -q $recipients --encrypt --armor >"$cred_file.tmp"
115 mv "$cred_file.tmp" "$cred_file"
116 git -C "$cred_dir" add "$cred_file"
117 "$opt_insert" &&
118 msg="Add credential $cred" ||
119 msg="Replace credential $cred"
120 git -C "$cred_dir" commit -q -m "$msg"
121 }
122
123 "$opt_output" && {
124 gpg -q --decrypt "$cred_file" 2>/dev/null |
125 if "$opt_clipboard"; then
126 xclip -r -i -l 1 -selection clipboard
127 else
128 cat
129 fi
130 }
131
132 "$opt_delete" && {
133 cred_dir=$(dirname "$cred_file")
134 rm -- "$cred_file"
135 git -C "$cred_dir" add "$cred_file"
136 git -C "$cred_dir" commit -q -m "Delete credential $cred"
137 find "$store" -type d -not -path "$store/.git/*" -empty -delete
138 }
139
140 exit 0
Of note:
- The high-level structure of the program is exactly the same.
- The usage message lists the various invocations and relevant options.
- The options don’t conflict; you can insert and output, output and delete etc. a credential with a single invocation.
Example #4: deadlink
1 #!/bin/sh
2 set -eu
3
4 usage() {
5 cat <<EOF
6 Usage: deadlink [-v] [FILE...]
7 deadlink -h
8
9 Parse each FILE as HTML and check all outgoing links.
10 When no FILE is given or FILE is -, check standard input.
11
12 Options:
13
14 -h Print this message and exit
15 -v set -x
16 EOF
17 }
18
19 while getopts "hv" opt; do
20 case $opt in
21 h) usage; exit;;
22 v) set -x;;
23 *) usage >&2; exit 1;;
24 esac
25 done
26
27 shift $((OPTIND-1))
28
29 trap 'rm -f links' EXIT INT QUIT TERM
30
31 py=$(cat <<'EOF'
32 import bs4
33 import sys
34 soup = bs4.BeautifulSoup(sys.stdin.read(), features="html.parser")
35 for a in soup.find_all("a"):
36 print(a["href"])
37 EOF
38 )
39
40 cat "$@" | python -c "$py" | grep >links -E '^https?://'
41
42 exit=0
43 while read -r link; do
44 curl -sSf >/dev/null -- "$link" && {
45 printf "OK: %s\n" "$link"
46 } || {
47 [ $? -ge 128 ] && {
48 printf "Check interrupted\n"
49 break
50 }
51 printf "BAD LINK: %s\n" "$link"
52 exit=1
53 } >&2
54 done <links
55 exit "$exit"
Of note:
- This script started as a one-time hack to see if there are any dead links on this very page. As it seemed useful, I have refactored it into a proper script following the best practice documented above.
- The script uses BeautifulSoup (a Python HTML processing library) to extract links from the page. Using regular expressions for that purpose would be unnecessarily fragile.
- To make it easy to stop the script, we break out of the loop if the exit code from curl(1) is 128 or above, as that means the curl process was killed by a signal (some signal, not necessarily SIGKILL).
Shell idioms and best practice
We’ve seen some examples of recommended high-level design of shell programs, now let’s take a look at useful primitives. We try to highlight some lesser-known features which make shell scripting a whole lot more enjoyable.
Everything in this section is standard POSIX behavior and tools, unless otherwise noted.
printf
You probably use echo(1) to print strings in the interactive shell, as in
% echo $some_var
And that’s perfectly fine.
However, echo has no place in shell scripts due to portability concerns. Take a look at this page describing various echo implementations. The TL;DR is that echo is only good for plain text containing no escape sequences.
Instead, use printf(1). This is modeled after printf(3):
% name=world
% printf "Hello %s!\n" "$name"
Hello world!
Besides being well-defined, it’s useful for alignment of variable-width strings. For example, to right-justify a string to a width of 10 characters:
% printf "%10s\n" "foo"
       foo
You can also left-justify by using a negative width. The width can also be variable, as in:
% width=-11
% name=world
% printf "Hello %*s!\n" "$width" "$name"
Hello world      !
Numbers can be converted and aligned, too:
% printf "0x%08x\n" 42
0x0000002a
And so on. Refer to printf(3) for details.
Compound command redirections
A little-known feature: redirection can be applied to compound statements, such as if-clauses, for-clauses or { ... }:
if [ $# -gt 0 ]; then
printf "Unexpected arguments: %s...\n" "$1"
usage
fi >&2
Both the output of printf and the usage message will be redirected to stderr. Redirections can also be applied to braces, even in function definitions:
err() {
exit=$1; shift
msg=$1; shift
printf "foo.sh: error: %s\n" "$msg"
exit "$exit"
} >&2
This makes the output of err go to stderr by default when invoked.
Note: when redirecting a brace group, the result is not equivalent to redirecting the individual commands, because the redirection is only performed once. Thus the following:
{ cat; cat; } </etc/hostname
is not equivalent to:
{ cat </etc/hostname; cat </etc/hostname; }
Short-circuit evaluation
Instead of writing
if "$cond"; then
cmd
fi
you can use the short-circuit evaluation operator && (and):
"$cond" && cmd
For example, we could rewrite one of the prior examples as:
[ $# -eq 0 ] || {
printf "Unexpected arguments: %s...\n" "$1"
usage
} >&2
This saves typing and makes the code a bit easier to read. The || (or) operator works similarly and can be used to express !cond.
Parameter expansion
Parameter expansion is a shell word of the form $param, where param is a parameter name. For example, $foo or $1 are parameter expansions. The full syntax is ${param} and the braces are usually omitted.
Note: the braces are required in two cases:
- to refer to argument $n where n > 9, e.g. ${10};
- when the next character could be mistaken for part of the identifier, e.g. in foo${bar}baz.
However, the braces permit additional processing of the parameter.
Conditional parameter expansion operators
The syntax ${param op [word]} allows you to expand param conditionally depending on op. In all cases below, if word is not provided, it defaults to null (the empty string):
- ${param-[word]}: if param is set, expand to param; otherwise expand to word;
- ${param=[word]}: if param is set, expand to param; otherwise assign param to word and expand to word;
- ${param?[word]}: if param is set, expand to param; otherwise print an error and exit (use word as the error message if provided);
- ${param+[word]}: if param is set, expand to word, otherwise expand to null.
These operators can each be prefixed with a colon (:-, :=, :?, :+), and the condition then changes from “if set” to “if set and not null”.
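A quick interactive demonstration (the variable name is arbitrary; a blank output line below is an empty expansion):

% unset name
% printf "%s\n" "${name-fallback}"
fallback
% name=""
% printf "%s\n" "${name-fallback}"

% printf "%s\n" "${name:-fallback}"
fallback
% name=Ken
% printf "%s\n" "${name:+is set and not null}"
is set and not null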
This makes it simple to expand shell variables with a fallback value:
nc -l -p "${port:-8000}"
Conventionally, you would set defaults for environment variables at the beginning of a shell script, e.g.:
#!/bin/sh
set -eu
: "${LC_ALL:=en_US.UTF-8}"
The colon (:) is a so-called null utility. It does nothing useful. But without it, the first field resulting from the parameter expansion would be taken as a command name, which is not what we want.
String operators
Yes, the shell supports string operators! They are few but very useful nonetheless:
The first operator is the string length operator. The syntax is ${#param}:
% name="Ken Thompson"
% printf "The length of \$name is %d\n" "${#name}"
The length of $name is 12
Next, there is a remove smallest suffix pattern operator, with the syntax ${param%suffix}:
% file=img.jpg
% basename=${file%.*}
% printf "%s\n" "$basename"
img
As you can see, the suffix can be a pattern; the pattern matching notation is the same as the one used for filename expansion.
There is also a remove smallest prefix pattern operator, with the syntax ${param#prefix}.
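For example, mirroring the suffix example above:

% file=archive.tar.gz
% printf "%s\n" "${file#*.}"
tar.gz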
Both the prefix and suffix operators can be doubled (## %%); these are the remove largest prefix pattern and remove largest suffix pattern, respectively:
% pathname=/path/to/some/file
% basename=${pathname##*/}
% printf "%s\n" "$basename"
file
The argument array $@
The POSIX shell supports exactly one array, the argument array $@. The values of this array are accessed through $1, $2, …, $n, where n is the length of the array (n = $#). $0 is a special parameter and not part of $@.
Crucially, the expansion "$@" (mind the double quotes) is equivalent to "$1" "$2" … "$n". In other words, expansion of "$@" produces quoted fields, one field per item.
Initially, this array is set to the positional arguments from the invocation of the shell itself, or of the called shell function. This is best illustrated with the following script:
#!/bin/sh
set -eu
for arg in "$@"; do
printf "Arg = %s\n" "$arg"
done
Running this script, we get
./foo.sh a b c
Arg = a
Arg = b
Arg = c
When iterating over "$@", the in clause (in "$@") can be omitted (for arg; do … done).
Setting array elements
The set shell built-in can be used to set the argument array $@ and the corresponding parameters $1, $2, …, $n. The arguments to set become the new $@.
- To clear the arguments: set --
- To append $var to $@: set -- "$@" "$var"
- To prepend $var to $@: set -- "$var" "$@"
Here’s a simple example:
#!/bin/sh
set -eu
set --
set -- "$@" 3
set -- "$@" 4
set -- 2 "$@"
set -- 1 "$@"
for arg; do
printf "Arg = %d\n" "$arg"
done
Running this, we get:
Arg = 1
Arg = 2
Arg = 3
Arg = 4
Shifting the array
It’s also possible to shift the array to the left, removing the first n elements, with shift [n], where n defaults to 1. After the shift, argument $i refers to what argument i+n referred to before.
#!/bin/sh
set -eu
set -- 1 2 3 4
shift 2
for arg; do
printf "Arg = %d\n" "$arg"
done
Running this, we get:
Arg = 3
Arg = 4
Transforming the array
Using set, shift and for, it’s possible to implement filter and map, too. For example, to only keep non-negative elements, one could write:
#!/bin/sh
set -eu
for arg; do
shift
[ "$arg" -ge 0 ] && set "$@" "$arg"
done
Recall that for arg is equivalent to for arg in "$@". It’s legal to modify the array in the loop with set and shift, because the expansion of "$@" (even if it’s implicit) happens before the body of the for loop is executed. The shift removes each argument, and the set appends it back to the end of the array only if it is -ge 0. Since the body of the for loop executes exactly once per element of the original array, we are left with the non-negative entries only.
Similarly, one could map over the array; this is left as an exercise to the reader.
Argument pass-through
Sometimes, your script will accept a variable number of arguments, and some or all of them will be passed to another program, like in the gitdo example:
41 git ls-files -z --exclude-standard -- "$dir" |
42 xargs -0r "$@"
This is where "$@" truly shines: it allows you to pass the arguments correctly without having to worry about quoting.
Naming positional arguments
It’s good practice to give names to the positional arguments of your shell scripts. That is, rather than using $1, $2, … directly:
mkdir -p "$1"
some_cmd -o "$1" "$2"
It’s better to name your arguments and use the names:
output_dir=$1
source_file=$2
mkdir -p "$output_dir"
some_cmd -o "$output_dir" "$source_file"
It’s even better to shift out the positional arguments as you process them:
output_dir=$1; shift
source_file=$1; shift
mkdir -p "$output_dir"
some_cmd -o "$output_dir" "$source_file"
Advantages:
- You can change the order of the script’s arguments simply by swapping the assignments, without having to edit the indices.
- $@ contains exactly the unnamed (remaining) arguments. Often, there is a variable number of positional arguments (such as filenames to act upon) passed on with "$@", as in the gitdo example above.
- You can easily check that there are no extraneous or missing arguments:

[ $# -eq 0 ] || {
printf "Unexpected arguments: %s...\n" "$1"
exit 1
} >&2
This applies to functions, too.
readonly
It is possible to mark variables read-only with the readonly built-in:
readonly answer=42
This prevents the variable from being changed or unset, and is a useful protection for constants.
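For example, attempting to assign to it afterwards fails (the exact error message differs between shells):

% readonly answer=42
% answer=13
sh: answer: readonly variable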
Omitting quotes
Sometimes, quotes can (and should, for the sake of professionalism) be omitted:
- In assignments of the form var=word, where word is any shell word. The var=word form is called an assignment word in the shell lexical grammar. Quotes can be omitted even when word expands to a string containing IFS characters:

var='has spaces'
var2=$var

However! The following situation is completely different:

export var2="$var"

Quotes are required here, because var2="$var" is not an assignment word: assignment words are only recognized before the command name (here, that’s the export word) but not after. Consequently, the usual field splitting rules apply.
- Around case expressions:

case $var in ... esac
Note: If in doubt, quote.
Suppressing errors with ||:
With set -e in effect, any error will take your script down. Sometimes, that’s not what you want. For example:
num=$(printf "%s\n" "$var" | grep -Eo '[0-9]+')
When var contains no digits, grep(1) will exit with a non-zero exit code. Consequently, the exit code of the assignment is non-zero, taking down the shell. In a situation like this, it’s often not an issue that the grep matched nothing. To suppress the error, use:
num=$(printf "%s\n" "$var" | grep -Eo '[0-9]+' ||:)
That’s just a contraction of || and :, and the missing space is simply a stylistic choice. This makes the exit code of the pipeline the exit code of :, which is always 0.
Beware: of unexpected subshells
Consider the following example:
#!/bin/sh
set -eu
sum=0
seq 10 | while read -r num; do
sum=$((sum+num))
done
printf "%d\n" "$sum"
When executed under:
- Zsh, this script prints 55
- Bash, this script prints 0
- Dash, this script prints 0
Why? Well, this has to do with POSIX rules for pipe execution. The standard says:
Additionally, each command of a multi-command pipeline is in a subshell environment; as an extension, however, any or all commands in a pipeline may be executed in the current environment.
Zsh apparently executes the while loop in the current shell execution environment, so the variable sum inside the loop is the same variable as in the rest of the script. Bash and Dash, on the other hand, execute the loop in a separate execution environment with its own set of variables. You can verify this easily by printing the partial sums in the body of the loop: the partial results are all correct, yet the final result printed at the end is still 0.
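One portable way to keep the loop in the current execution environment is to avoid the pipe altogether and feed the loop through a redirection, for example a here-document; a minimal sketch:

#!/bin/sh
set -eu

sum=0
# The here-document is expanded first and the loop reads from it;
# there is no pipeline, so the loop runs in the current shell.
while read -r num; do
    sum=$((sum+num))
done <<EOF
$(seq 10)
EOF
printf "%d\n" "$sum"

This prints 55 in all three shells mentioned above.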
Beware: of the exit code of a pipeline
The exit code of a pipeline is the exit code of the last command in the pipeline, thus:
grep '^root:' /etc/paswd | cut -d: -f7
Exits with 0 despite the typo, and that is often undesirable. Unfortunately, the following doesn’t work as expected:
{ grep '^root:' /etc/paswd || exit 1; } | cut -d: -f7
(It should be clear why this doesn’t work. See the prior section.)
The only portable way to avoid this behavior (where it matters) is to split up the pipeline into multiple stages:
filtered=$(grep '^root:' /etc/paswd)
printf "%s\n" "$filtered" | cut -d: -f7
A temporary file could also be used instead of the variable:
grep >filtered '^root:' /etc/paswd
cut <filtered -d: -f7
Don’t forget to remove the file on exit.
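For example (a sketch; mktemp(1) is not POSIX, but widely available), the temporary file can be created with mktemp and cleaned up with a trap:

filtered=$(mktemp)
trap 'rm -f -- "$filtered"' EXIT

grep >"$filtered" '^root:' /etc/paswd
cut <"$filtered" -d: -f7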
There is a non-standard option, called pipefail, supported by both Bash and Zsh:
#!/bin/zsh
set -eu -o pipefail
false | true
The exit code of this program is 1 as desired.
Beware: of unexpected variable scope
The default scope of shell variables is global:
#!/bin/sh
set -eu
f() {
x=42
}
x=0
f
printf "%d\n" "$x"
This script outputs 42 and often, that’s not what you want. There’s a non-POSIX extension called local which allows you to tie the scope of a variable to the run-time of a function:
#!/bin/sh
set -eu
f() {
local x=42
}
x=0
f
printf "%d\n" "$x"
This script outputs 0.
Note: While non-standard, the local extension is widely supported (at least in Zsh, Bash and Dash).
Beware: of assuming a particular working directory
Often, scripts are written with a particular working directory in mind. Typically, the programmer assumes that the current working directory is going to be the directory containing the script, and for a long time, the script is only used (and consequently, tested) as:
./script.sh
Everything works until somebody tries to execute it from somewhere else:
./bin/script.sh
This alters the meaning of relative paths in the script. Therefore, if you reference other files in your script, such as other (sourced) scripts, always anchor the path relative to your script:
dir=$(realpath "$(dirname "$0")")
. "$dir/common.sh.inc"
Sometimes, it’s advantageous to temporarily change the current working directory, for example:
(
cd "$repo_dir"
git ls-files
)
The change of current working directory is then local to the subshell ( … ).
Sometimes, tools provide options that allow you to achieve the same effect a change of working directory would have, as is the case with the git invocation above. We could thus write:
git -C "$repo_dir" ls-files
See git(1) for details.
Beware: of eval
The eval built-in makes it possible to evaluate a string. This is sometimes useful. For example, to refer to a variable whose name is the value of another variable:
#!/bin/sh
set -eu
check_variable_matches_re() {
local var_name=$1; shift
local re=$1; shift
eval "local val=\${$var_name}"
printf "%s\n" "$val" | grep -Eq "$re" || {
printf "Variable %s does not match regular expression: %s\n" \
"$var_name" "$re"
return 1
}
}
foo=42
check_variable_matches_re foo '^[0-9]+$'
bar=answer42
check_variable_matches_re bar '^[a-z]+$'
Running this, we get:
Variable bar does not match regular expression: ^[a-z]+$
This is all fine as long as the name of the variable is a hard-coded constant. But once the name of the variable is user-supplied, very bad things can happen very quickly:
user_input='lsdjadfalhfda=$(touch /hahaha)'
# later...
check_variable_matches_re "$user_input" '^$'
It is thus advisable to avoid eval if at all possible!
There are non-standard extensions to obtain the value of a variable whose name is stored in another variable var, such as ${!var} in Bash and ${(P)var} in Zsh. If you absolutely need this feature, targeting a non-POSIX shell is probably preferable to using eval.
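For instance, a quick sketch of Bash’s indirect expansion (note the interpreter):

#!/bin/bash
set -eu

foo=42
var_name=foo
# ${!var_name} expands to the value of the variable whose name is
# stored in var_name, i.e. to the value of foo.
printf "%s\n" "${!var_name}"

This prints 42.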
Beware: of limitations of set -e
Sometimes, not even set -e will prevent nasty things from happening. For example, the following command:
export "FOO=$(cmd)"
is fundamentally different from:
FOO=$(cmd)
export FOO
We already explained why the quotes are only required in the first form. But there is another difference which has to do with error handling:
- In the first form, the exit code of the command is the exit code of export, not of the expansion $(cmd). Thus, if cmd fails, the export will still succeed and the execution will continue.
- In the second form, the exit code of the assignment is the exit code of the expansion, thus if cmd fails, the script will abort due to set -e.
Therefore, the second form is preferable whenever the right-hand side of the assignment contains expansions which may fail.
Beware: of passing secrets on the command line
Sometimes, tools provide a -p option to provide a password, or some other option with the same intent (providing a secret string to the program).
This is fundamentally wrong. If you come across this, do not use it:
The program’s command line, including options and their values, is visible as /proc/pid/cmdline and displayed in the output of tools such as ps(1), top(1), htop(1), etc. Mounting procfs with hidepid=2 (see proc(5)) solves this, but it still won’t stop systemctl(1) from happily spitting out the full command line.
So, no. The only correct way to provide a secret to a program is via stdin.
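For illustration, a hypothetical tool that reads the secret from stdin (the tool name, its flag and the file path are made up):

# some-tool and --password-stdin are placeholders for illustration.
some-tool login --password-stdin <"$HOME/.secrets/some-tool"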
Even if you trust your machine and don’t mind the secret being (temporarily?) visible in the process listing, you still need to take care when constructing command lines containing secrets, as the commands are recorded in your shell’s history file. Many shells can be configured not to record commands that begin with whitespace (some do this by default), so prefixing such a command with a single space might do the trick.
Powerful tools
There are several incredibly powerful command-line tools you should probably know:
-
curl(1) is a Swiss-army knife utility supporting several network protocols, most notably HTTP. It allows you to make arbitrary HTTP requests from the command line:
curl \
    -sSf \
    -X POST \
    -H "Content-Type: application/json" \
    -d @file \
    --url-query key=value \
    "https://nswi106.cz/some/path"
See curl(1) for a lot of additional options.
-
jq(1) is a command-line utility and a scripting language for JSON processing. JSON is of course served by most web APIs, but did you know that even command-line tools sometimes provide JSON output?
blkdev_size_gib=$(lsblk --bytes --json \
    | jq -r '[.blockdevices[].children[].size] | add / (1024*1024*1024)')
Also, jq can be used to construct JSON as input to other programs, such as curl, allowing you to perform complex API calls from the command line.
Take a look at the jq tutorial and jq(1). A small sketch of the construct-then-POST pattern mentioned above follows after this list.
-
parallel(1) is a utility for task parallelization. It is rather difficult to master, but the performance gains may well be worth it. For example, to convert many JPEG files in parallel:
mkdir -p small
ls *.jpg | parallel convert {} -size 800 small/{}
Take a look at parallel(1) and parallel_tutorial(7).
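For instance, a sketch of the construct-then-POST pattern mentioned in the jq item above (the endpoint and field names are made up):

# Build the JSON payload safely with jq, then send it with curl.
payload=$(jq -n --arg user "$USER" --arg msg "hello" \
    '{user: $user, message: $msg}')
curl -sSf \
    -X POST \
    -H "Content-Type: application/json" \
    -d "$payload" \
    "https://nswi106.cz/some/path"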
Further reading
- After writing this chapter, I have found Insufficiently known POSIX shell features, a blog post showcasing many of the same lesser-known POSIX features. It might be interesting if you want another take on the same topic.
Missing bits
Some bits are still missing and will be added in future revisions of this document. Let us know if you want to contribute any of these:
- Interactive vs non-interactive shell use
- (*) Shell internals
Acknowledgements
- Tomáš Volf provided extensive feedback and several corrections
Thanks!