Bash Tips


Bash is not the most programmer-friendly tool. It requires a lot of caution, low-level knowledge and doesn’t allow the slightest mistake (you know you can’t type foo = 42, right?). On the other hand, bash is everywhere (even on Windows 10), it’s quite portable and powerful, and in effect is the most pragmatic choice when automating tasks. Luckily, following a set of simple rules can save you from many of its minefields.


1. Shebang

There are a number of possible shebangs you can use to refer to the interpreter you want to execute your code under. Some of them are:

  • #!/usr/bin/env bash

  • #!/bin/bash

  • #!/bin/sh

  • #!/bin/sh -

We all know a shebang is nothing but the path (absolute or relative to the current working directory) to the shell interpreter, but which one is preferred?

Long story short – you should use #!/usr/bin/env bash for portability. The thing is that POSIX does not standardize path names, so different UNIX-based systems may have bash placed in different locations. You cannot safely assume that – for example – /bin/bash even exists (some BSD systems have the bash binary placed in /usr/local/bin/bash).


The env utility can help us work around this limitation: #!/usr/bin/env bash will execute the code under the first bash interpreter found in PATH. While it's not a perfect solution (what if the same problem applies to /usr/bin/env? Luckily, every UNIX OS I know of has env placed exactly there), it's the best we can go for.

However, there is one exception I’m aware of: for a system boot script, use /bin/sh since it’s the standard command interpreter for the system.

It's worth checking out this and this article for more information.


2. Always use quotes

This is the simplest and best advice you can follow to save yourself from many possible pitfalls. Incorrect shell quoting is the most common cause of headaches for bash programmers. Unfortunately, following it is not as easy as it is important.

There are many great articles covering this specific topic completely. I don't have anything more to add other than to recommend this and this article.

It's worth remembering that you should generally use double quotes.
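As a minimal sketch (the file name is made up), compare the unquoted and quoted forms when a value contains a space:

file="my notes.txt"
rm $file      # word splitting: bash tries to remove "my" and "notes.txt"
rm "$file"    # removes the single file named "my notes.txt"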


3. Variable usage

$foo is the classic form of variable referencing in bash. However, version 2 of bash (see echo $BASH_VERSION) brings us a new notation known as variable expansion. The idea is to use curly braces around the variable identifier, like ${foo}. Why is this considered to be a good practice? It brings us a whole set of new features (a few of them are demonstrated in the sketch after this list):

  • array elements expanding: ${array[42]}

  • parameter expansion, like ${filename%.*} (removes the file extension), ${foo// } (removes spaces) and ${BASH_VERSION%%.*} (gets the major version of bash)

  • variable concatenation: ${dirname}/${filename}

  • appending string to a variable: ${HOME}/.bashrc

  • access positional parameters (arguments to a script) beyond $9

  • substring support: ${foo:1:5}

  • indirect referencing: ${!foo} will be expanded to the value held by the parameter whose name is stored in foo (bar=42; foo="bar"; echo "${!foo}" will print 42)

  • case modification: ${foo^} will convert foo's first character to uppercase, while the , operator converts it to lowercase. Their double forms (^^ and ,,) convert all characters
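A minimal sketch of a few of these expansions (the sample values are made up; case modification requires bash 4 or newer):

filename="report.final.txt"
echo "${filename%.*}"     # report.final (strips the last extension)
echo "${filename%%.*}"    # report (strips everything after the first dot)
echo "${filename:0:6}"    # report (substring)
bar=42; foo="bar"
echo "${!foo}"            # 42 (indirect reference)
name="jakub"
echo "${name^}"           # Jakub (first character uppercased)
echo "${name^^}"          # JAKUB (all characters uppercased)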


In the most common cases, using the variable expansion form gives us no advantage over the classic one, but to keep the code consistent, using it everywhere can be considered good practice. Read more about it here.

What you also have to know about variables in bash is that, by default, all of them are global. This can result in problems like shadowing, overriding or ambiguous referencing. The local keyword restricts the scope of variables, protecting them from leaking into the global namespace. Just remember – make all of your functions' variables local.
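A minimal sketch of what local protects you from (the function and variable names are made up):

count=10
increment() {
  local count=0            # without "local" this assignment would overwrite the global variable
  count=$((count + 1))
  echo "inside: ${count}"
}
increment                  # prints "inside: 1"
echo "outside: ${count}"   # prints "outside: 10"; the global value is untouched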


4. Watch the script’s working directory

Within a bash script, you will often operate on other files. Thus, you have to be really careful when using relative paths. By default, the script's current working directory is inherited from the parent shell.

$ pwd
/home/jakub

$ cat test/test
#!/usr/bin/env bash
echo "$(pwd)"

$ ./test/test
/home/jakub

The problem arises when the current working directory and the script's location differ. You cannot then simply refer to ./some_file, since it does not point to some_file placed next to your script. To be able to easily operate on files in the script's directory and avoid messing up random system files, consider using this handy one-liner, which changes the working directory to the directory containing the script:

cd "$(cd "$(dirname "${BASH_SOURCE[0]}")" > /dev/null && pwd)" || return
$ pwd
/home/jakub

$ cat test/test
#!/usr/bin/env bash
cd "$(cd "$(dirname "${BASH_SOURCE[0]}")" >/dev/null && pwd)" || return
echo "$(pwd)"

$ ./test/test
/home/jakub/test

Looks much more natural, doesn’t it?
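Alternatively, if you prefer not to change the working directory at all, you can store the script's directory in a variable and build paths from it (a minimal sketch; config.ini is a hypothetical file shipped next to the script):

script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" >/dev/null && pwd)"
cat "${script_dir}/config.ini"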


5. You don’t really need ls

Using ls inside a bash script is almost always flawed. I can't recall even one good reason to do it. To explain why, let's go through two common examples:

for file in $(ls *.txt)

Word splitting will ruin this for-loop when any of the filenames contains whitespace. What's more – if a filename contains a glob character (also known as a wildcard, like *, ?, [, ]), it will be recognized as a glob pattern and expanded by the shell. That's probably not exactly what you want. Another problem is that POSIX allows pathnames to contain any character except \0 (including |, / and even newline). This makes it impossible to determine where the first pathname ends and the second one begins when dealing with ls output.

for file in "$(ls *.txt)"

Double quotes around ls will cause its output to be treated as a single word – not as a list of files, as desired.

How do you iterate over a list of files the right way? There are a few possibilities:

for file in ./*.txt

This uses the bash globbing feature mentioned above. Remember to double quote "${file}"!
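A complete loop might look like the following minimal sketch (nullglob is optional, but without it the literal pattern ./*.txt reaches the loop body when no matching files exist):

shopt -s nullglob          # make an unmatched glob expand to nothing instead of itself
for file in ./*.txt; do
  echo "processing ${file}"
done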

find . -type f -name '*.txt' -exec ...

This one is probably the best solution. The find utility lets you use regex-based search (-regex) and recursion, and it has many other built-in features you may find useful. Here is a great synopsis of this tool.

find . -type f -name '*.txt' -print0 | xargs -0 ...

An alternative usage of find and xargs. It’s neither simpler nor shorter, but the advantage of xargs is that it supports parallel pipeline execution. Read more about the differences here.
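To make the "..." placeholders above concrete, here is a hedged example of each variant (counting lines with wc is just a stand-in for whatever processing you actually need; -P is a common but non-POSIX xargs extension):

find . -type f -name '*.txt' -exec wc -l {} +
find . -type f -name '*.txt' -print0 | xargs -0 -P 4 wc -l   # up to four wc processes in parallel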

To summarize, never try to parse the output of the ls command. It's simply not intended to be parsed and there is no way you can make it work. Read more here.


6. Expect the unexpected

Checking for non-zero status codes of commands executed within a bash script is often forgotten. It's easy to imagine what would happen if a cd command preceding file operations failed silently (because of "No such file or directory", for example).

#!/usr/bin/env bash
cd "${some_directory}"
rm -rf ./*

The example above works well, but only if nothing goes wrong. The intention was to delete the contents of some_directory/, but it may end up executing rm -rf ./* in a completely different location.

cd "${some_directory}" && rm -rf ./* and cd "${some_directory}" || return are the simplest and self-descriptive solution. In both cases, deletion won’t execute if cd returns non-zero. It’s worth to point out, that this code is still vulnerable to a common programming error – misspelling.

Executing cd "${some_dierctory}" && rm -rf ./* will end up deleting files you probably want to keep (as long as there isn't a misspelled some_dierctory variable declared somewhere). "${some_dierctory}" will be expanded to "", which is an entirely valid cd argument that takes us to the home directory. Don't worry though, that's not the end of the story.


Bash has some programmer-friendly switches you should be aware of (a combined example follows this list):

  • set -o nounset tells bash to treat referring to unset variables as an error. This one saves us from many typos.

  • set -o errexit tells bash to exit the script immediately if any statement returns a non-zero status. One may say that using errexit gives us error checking for free, but it can be tricky to use correctly. Some commands return non-zero for a warning, and sometimes you know exactly how to handle a particular command's error. Read more here.

  • set -o pipefail changes the default behavior when using pipes. By default, bash takes the status code of the last expression in a pipeline, meaning that false | true will be considered to return 0. That may not be what you want, since this approach ignores errors raised by earlier commands in the pipeline. This is where pipefail comes in. This option sets the exit code of a pipeline to the rightmost non-zero one (or to 0 if all commands exit successfully).

  • set -x causes bash to print each command right before executing it (i.e. after globbing and argument expansion). It's definitely a great help when trying to debug a bash script failure.
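A minimal sketch combining these switches (the directory variable is made up; exact error messages may differ between bash versions):

#!/usr/bin/env bash
set -o nounset    # referring to an unset variable is an error
set -o errexit    # any command returning non-zero aborts the script
set -o pipefail   # a pipeline fails if any command in it fails
set -x            # print each command before executing it

cd "${some_directory}"   # aborts here: some_directory is unbound
rm -rf ./*               # never reached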

Of course, error handling problems apply not only to the cd command described above. Your script should take into account the vast majority of possible problems, like spaces in pathnames, missing files, directories that fail to be created, or non-existent commands (you know, awk isn't always present in the OS you're about to run your script on).
