The shell supports defining functions, which, as we learned in the previous post, you should embrace and use. Unfortunately, they are fairly primitive and their use can, paradoxically, introduce other readability problems.
One specific problem is that function parameters are numbered, not named, so the risk of cryptic code is high. Let’s see why this is a problem.
Tell me what this function does and how to invoke it:
fetch() {
    local rflag=
    [ -z "${3}" ] || rflag="-r${3}"
    mkdir -p "${4}"
    ( cd "${4}" && cvs -d"${1}" -q checkout -P ${rflag} "${2}" )
}
As the fetch function name says, it fetches something, and as the content says, it does so from CVS. All good, right? But how do you invoke this function? With some effort, you can see that the function takes up to four parameters, but what is each one meant to be? Will you get them in the right order?
Sadly, the style (or lack thereof) above plagues much of the shell code out there, giving the shell a worse reputation than it already has.
You could slightly improve the snippet above with a comment that documented each parameter, but the problem is that comments easily get out of sync. You are better off avoiding them. (Docstrings are a different story which I may save for another day, but there is no such thing as docstrings for the shell.)
A nicer solution is to make the parameter names part of the code. Take a look at this idiom:
fetch() {
    local cvsroot="${1}"; shift
    local module="${1}"; shift
    local tag="${1}"; shift
    local dir="${1}"; shift

    local rflag=
    [ -z "${tag}" ] || rflag="-r${tag}"
    mkdir -p "${dir}"
    ( cd "${dir}" && cvs -d"${cvsroot}" -q checkout -P ${rflag} "${module}" )
}
The four local variable definitions at the beginning of the function exist purely to assign names to the parameters. With that trick, the function’s prototype is obvious: you immediately know there are four parameters, you know in which order to pass them, and you can reasonably guess their purpose based on their names. Just like in a “real” programming language.
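To see what call sites look like without needing a CVS server, here is a minimal sketch. The describe_fetch function is hypothetical and only prints the command line that fetch would run; the CVS root, module, tag, and directories are made-up values. Passing an empty string as the tag is my reading of the -z check above, not documented behavior.

```shell
#!/bin/sh
# Hypothetical helper (not part of shtk): same prototype as fetch, but it
# prints the cvs command line instead of running it.
describe_fetch() {
    local cvsroot="${1}"; shift
    local module="${1}"; shift
    local tag="${1}"; shift
    local dir="${1}"; shift

    local rflag=
    [ -z "${tag}" ] || rflag=" -r${tag}"
    echo "cd ${dir} && cvs -d${cvsroot} -q checkout -P${rflag} ${module}"
}

# All four parameters are positional, so an unused tag is still passed,
# as an empty string. The values below are placeholders.
describe_fetch ":ext:anoncvs@example.org:/cvsroot" src "release-1" /tmp/src
describe_fetch ":ext:anoncvs@example.org:/cvsroot" src "" /tmp/head
```

The call sites are still positional, but anyone unsure about the order only has to read the first four lines of the function.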
A few caveats though, because every time I suggest this idiom in a code review, people “tweak” it in harmful ways:
- Use shift instead of numbering the parameters. The reason is twofold. First, using shift combined with set -e makes function calls fail if they are not given enough parameters. Second, the order of the statements corresponds exactly to the order in which parameters are expected: there is no room for mistakes. A potential code edit that reorders those lines would cause the code to fail instead of becoming more confusing but continuing to work.

- Keep the shift call on the same line as the variable definition. Avoid the temptation to break them apart. The shift calls on their own lines introduce unnecessary vertical noise. Due to the noise, it becomes more tempting to avoid the shift and number the parameters, which I already said wasn't a good idea.

- Use ${@} to refer to an unknown number of parameters. The only safe way to handle a variable number of parameters is to expand them via "${@}", but assigning that expansion to a variable isn't possible. In particular, do not write local args="${*}": just keep those arguments unnamed and use "${@}" where necessary to refer to them. (Sure, if you already depend on Bash, you can use an unportable array to store these, but think twice before adding a Bash dependency.)
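The caveats above can be sketched with two small examples. Both functions, run_in_dir and greet, are hypothetical names invented for illustration. The first names its only fixed parameter and leaves the rest unnamed behind "${@}". The second check runs in a child sh process, an arbitrary choice made so that the failure stays contained and does not depend on the options of the calling shell.

```shell
#!/bin/sh
# Hypothetical function: one named parameter, then any number of unnamed
# ones that are only ever referenced through "${@}".
run_in_dir() {
    local dir="${1}"; shift
    ( cd "${dir}" && "${@}" )
}

# "${@}" keeps each remaining argument intact, including embedded spaces.
# Prints "[two words]" and "[single]" on separate lines.
run_in_dir / printf '[%s]\n' "two words" single

# With set -e enabled, calling a function without its required argument
# stops at the failing shift, so the final echo never runs.
if sh -c '
    set -e
    greet() {
        local name="${1}"; shift
        echo "hello ${name}"
    }
    greet
    echo "not reached"
' >/dev/null 2>&1; then
    echo "call unexpectedly succeeded"
else
    echo "call failed because an argument was missing"
fi
```

Had run_in_dir stored the remaining arguments with local args="${*}", the "two words" argument would have been merged with its neighbors into a single string and could not be passed on intact.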
The fetch function above is a simplified and edited version of the shtk_cvs_checkout function. shtk uses this idiom throughout.