
Shell Scripting

Shell, or command-line shell, is a general term for programs that read commands from the keyboard, execute them, and display the output on the screen. Before the invention of the Graphical User Interface, the Command Line Interface with various command-line shells was the only way to interact with a computer.

Command-line shells have a number of advantages over graphical interfaces.

  • They consume far fewer computational resources than the graphical shell of the operating system.
  • Programming for a command-line shell is much easier than for a graphical interface.
  • Graphical interfaces of programs are difficult, and sometimes practically impossible, to port across different operating systems, for example from Linux to Windows.
  • Operations in the command-line shell can be easily automated, i.e., scripted.

There are currently many different command-line shells:

  • Bourne shell, or sh — the oldest command-line shell. All Unix-like operating systems have this, or a fully compatible command-line shell program, even if it is not used by any of the computer's users.

  • bash (an acronym for Bourne Again SHell) — an improved command-line shell written for the GNU Project as a free replacement for sh. bash is the default shell in most Linux distributions; it is also preinstalled on macOS and can be installed on Windows.

  • Zsh — a partially bash-compatible command-line shell that is the default on macOS. It was created as an alternative to bash and is more focused on interactive terminal usage than on scripting.

  • cmd and PowerShell — Windows command-line shells. cmd is older and simpler, while PowerShell is new, written from scratch, with significantly extended capabilities for both interactive use and automation.

The commands and syntax of each shell differ. Since we are studying operating systems using Linux as an example, we will focus on working with bash.

A bash script

Let's create a simple bash script that will print a message to the terminal. This can be done in any text editor:

nano script.sh

💡 The .sh suffix at the end of the file is not mandatory, but it is customary to add it. It makes it easier to distinguish a bash script from any other file type.

Add the first line to the file:

#! /bin/bash

This line is called a hashbang, or shebang for short. The hash # combined with the exclamation mark is a special character combination that tells Unix-like operating systems which command-line interpreter should execute this script, in our case bash. Also note that we specified the full path to the interpreter, /bin/bash, which is the same across Linux distributions. Any interpreter can be used in place of bash: sh, Zsh, and even Python.

If you do not specify a shebang, the script will be executed by whatever shell you launch it from. For example, if you write a bash script and run it from the default macOS shell, it will be interpreted as a Zsh script, which may cause it to work incorrectly. So always specify the shebang.

After the shebang, you can write code — a sequence of the necessary commands. Let's add the echo command to our script, which writes to stdout whatever you pass to it (analogous to the print function in Python):

echo 'hello, world!'

Let's save and execute the script:

bash script.sh
sh script.sh

It is much more convenient to execute a script directly, without naming the interpreter. To do that, you first need to change the file permissions: by default they only allow reading and writing the file. Let's check:

ls -l

To allow executing the script as a program, you need to run one of the following commands:

chmod 771 script.sh
chmod +x script.sh

The second command adds execution permissions for all user types but does not affect already set read and write permissions (unlike the first one). Now the permissions allow executing the script as a program. Such files are displayed in green in the output of the ls -l command.
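
The symbolic form of chmod makes such changes explicit. A small sketch (demo.sh is a throwaway example file):

```shell
# create a scratch file; it typically starts without execute permissions
touch demo.sh

chmod u+x demo.sh   # add execute permission for the owner only
chmod a+x demo.sh   # add execute for owner, group, and others (same as +x)
chmod g-w demo.sh   # remove write permission from the group

# numeric modes set all bits at once: 7 = rwx, 5 = r-x, 4 = r--
chmod 755 demo.sh   # rwxr-xr-x
ls -l demo.sh
```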

Let's execute our script. Note that to do this, you need to specify the path to the file (even a relative one will work), not just the name:

./script.sh

Congratulations! Your first bash shell script is ready.

Variables

bash supports variables. Here is an example of variable declarations:

a=123
b="testvalue"

If the variable's value is a string that contains spaces, single or double quotes are mandatory.

To use a variable, you need to prefix its name with $:

echo $a
echo $b

In bash, you can substitute variable values into strings:

echo "my variables: $a $b"

This substitution is called variable expansion, or string interpolation. It happens when you write strings in double quotes; in single quotes, it will not work.

💡 Between double quotes you can mix literal text and variables, while content between single quotes is always treated literally.
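
The difference is easy to see side by side:

```shell
name="world"
echo "hello, $name"   # double quotes: prints  hello, world
echo 'hello, $name'   # single quotes: prints  hello, $name
```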

As in any programming language, you can use the values of some variables to derive the values of others:

c="my variables: $a $b"
echo $c

bash also supports arithmetic operations, but they have a specific syntax. The arithmetic operation itself must be written in double parentheses, and variables must be specified without $:

a=1
b=2
((a+b))

The result of the previous operation will not be saved anywhere, and will not even be printed to the console. To be able to do something with it, you need to add $ before the parentheses — then you can save it to another variable or pass it as a parameter to another command:

c=$((a+b))
echo $c
echo $((a+b))
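
All the usual integer operators are available inside $(( )); note that bash arithmetic is integer-only, so division discards the remainder:

```shell
a=7
b=2
echo $((a - b))    # 5
echo $((a * b))    # 14
echo $((a / b))    # 3  (integer division, the remainder is discarded)
echo $((a % b))    # 1  (the remainder itself)
echo $((a ** b))   # 49 (exponentiation)
```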

Besides regular variables, there are several other groups of variables you can use. The first group is script arguments. When you execute a script, you can specify any number of arguments separated by spaces:

./script.sh argument1 argument2 "argument 3"

You can refer to these arguments using special variables:

#! /bin/bash

echo "referring arguments based on their position: "
echo "first argument: $1, second argument: $2"

echo "total number of arguments: $#"

echo "all script arguments: $@"

Each argument can be read using a variable whose name is the positional number of the argument. The total number of arguments is stored in a variable denoted by #, and all arguments at once can also be read from a variable denoted by @. Let's execute this script and see how it works:

./script-args.sh arg1 arg2 arg3
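
Since $@ expands to all arguments, a script can also iterate over them one by one; a sketch (the script name and loop are an example, not part of script-args.sh above):

```shell
#! /bin/bash
# print every argument on its own line together with its position
i=1
for arg in "$@"
do
    echo "argument $i: $arg"
    ((i++))
done
```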

bash also has a number of additional built-in variables whose values we can use in scripts:

#! /bin/bash

echo "The exit status of the last process to run: $?"

echo "The Process ID (PID) of the current script: $$"

echo "The number of seconds the script has been running for: $SECONDS"
echo "A random number: $RANDOM"
echo "The current line number of the script: $LINENO"

echo "The hostname of the computer running the script: $HOSTNAME"
echo "The username of the user executing the script: $USER"

Environment variables

Environment variables are named values defined in the operating system and used to configure the working environment. Each process has its own set of environment variables. When one process creates another, the child process receives a copy of the parent process's environment variables. Additionally, the operating system allows users to add or override environment variables for processes themselves.

You can view the list of all environment variables for the current process with the following command:

printenv

To create an environment variable, you need to use the export command:

export testvariable="testvalue"
printenv

If you write a script that creates environment variables and execute it, those variables will be set only for the process in which the script was executed.

To make those variables appear in your current bash process, you need to use the source command, which runs the script inside the current process instead of starting a child process:

source ./script-environment-variable.sh

This command has a shorthand syntax:

. ./script-environment-variable.sh

A variable created with export persists only for the current bash process and its children. You can create a permanent environment variable in several ways depending on the level at which you need to do it. If you want a variable to be created for your user every time you start bash, you need to add it to the end of the .bashrc script in the format export envFromBashRc="testvalue":

nano ~/.bashrc

.bashrc is a script in your home directory that is executed automatically every time you start bash. To see this variable, you need to either restart bash or re-source the script:

. ~/.bashrc

If you need to add a variable for all users, you should use the global script:

nano /etc/bash.bashrc

And the last place where you can add environment variables is the global configuration file. Variables from this file are used for all processes:

sudo nano /etc/environment

💡 If you make changes to the /etc/environment file, you need to reboot the computer to see them.

The /etc/environment file contains the PATH variable, which holds a list of all directories where the operating system looks for program files. For example, when you run cat:

cat /etc/environment

The command-line shell identifies cat as the program name (simply because it comes first), and everything after it is stored as the program's arguments. It then asks the operating system to execute the cat program with the given arguments, and the operating system searches for the program file in all directories listed in the PATH variable.

Environment variables are convenient to use both for configuring shell scripts and for configuring any programs. They are very easy to read, and they work the same way on Linux, macOS, and Windows.

Useful programs

Count lines or words, find a word in text, make a request to a web server, or download a file — there are ready-made built-in programs for all of this. Knowing them, you can write a bash script that does practically anything.

The first program is wc, which is designed for text analysis:

man wc

This program can count the number of bytes, characters, words, or lines of text, and it accepts the text via stdin or as file names. Without any options it counts lines, words, and bytes at once; if you only need, for example, the number of lines, pass the corresponding option:

wc -l

You can close the input with the key combination Ctrl + D.
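
Since wc reads stdin, you can also feed it text from another command (the | operator is explained in the Pipe section below):

```shell
printf 'one\ntwo\nthree\n' | wc -l   # 3 lines
printf 'one two three\n' | wc -w     # 3 words
```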

cat is another useful program. If called without parameters, it outputs its stdin to stdout. If you pass it a file name, it outputs the file's contents to stdout. cat is convenient for viewing small files:

cat testfile

The grep command is used to search for words in text. Its first parameter is the word to find, and the second is the file or files in which to search. grep can search both in files and in text sent to its stdin:

grep text ./testfile
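
A few commonly used grep options are worth knowing (testfile here is the same example file as above):

```shell
grep -i text ./testfile   # case-insensitive search
grep -n text ./testfile   # show line numbers of matches
grep -c text ./testfile   # count matching lines instead of printing them
grep -v text ./testfile   # invert: print lines that do NOT contain the word
grep -r text ./           # search recursively in all files under a directory
```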

Another great program that is useful to know when writing scripts is curl, which lets you make HTTP requests:

man curl
curl https://curl.se/docs/tutorial.html

With curl, you can customize requests to web services as you wish: change the HTTP method, use the needed HTTP headers, and more. You can also view the full response information from a web service using the -v (verbose) parameter:

curl -v https://curl.se/docs/tutorial.html

If you need to download a file, you can do so with wget. This command has many customization options for the request, just like curl. To choose the file name where wget saves the download, use the -O (capital O) option; lowercase -o is a different option that redirects wget's log messages to a file instead:

wget https://curl.se/docs/tutorial.html -O "tutorial.html"

Redirecting I/O streams

Linux processes have three I/O streams: stdin, stdout, and stderr. Let's start with stdout. In bash, you can redirect it using the > symbol:

ls > ls-output-file

To the left of it should be the command whose output you want to process, and to the right — the file name where you want to save the result. If you use only one > symbol for redirection, the file will be recreated each time:

ls > ls-output-file
cat ls-output-file

To avoid recreating the file and simply append the output to the end, use two > symbols:

ls >> ls-output-file
ls >> ls-output-file

If we execute a non-existent command, the terminal will display an error:

lss

If we try to redirect the command's output to a file, this error will not be saved:

lss > err-log

The > symbol redirects only stdout, while errors are written to the stderr stream, which can be redirected to a file like this:

lss 2> err-log

stderr does not conflict with stdout in any way. If needed, we can redirect them to separate files:

lss > info-log 2> err-log

Linux has a special file that accepts redirected output streams but doesn't write them anywhere — /dev/null. It can be useful if you don't want to save or display program logs:

lss 2> /dev/null
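
stdout and stderr can also be sent to the same place. The streams have numbers, 1 for stdout and 2 for stderr, and the suffix 2>&1 means "send stderr to wherever stdout currently points" (the order matters: it must come after the > redirection). A sketch:

```shell
# both streams go into one file: the listing of /etc and the error
# about the nonexistent directory
ls /etc /nonexistent > all-output 2>&1

# bash shorthand for the same thing
ls /etc /nonexistent &> all-output

# silence a command completely
lss > /dev/null 2>&1
```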

Pipe

Sometimes you may need to send the output of one program as input to another. For example, you need to find all bash processes in the output of the ps command using grep. Of course, you could first execute the first command, save its output to a file, and then search that file:

ps -ax > processes
grep bash processes

But there is a much more efficient way called pipe. Here is how pipe usage looks in practice:

ps -ax | grep bash

The vertical bar between the ps and grep commands is the pipe. Thanks to it, we were able to execute the ps command and pass its output as input to the grep command, which found all bash processes for us.
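
Pipes can also be chained: each | passes the previous command's stdout on to the next command's stdin. A couple of sketches:

```shell
# count how many bash processes are currently running
ps -ax | grep bash | wc -l

# pipes run left to right: sort the words, then take the first one
printf 'banana\napple\ncherry\n' | sort | head -n 1   # apple
```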

Let's look at pipe in more detail and run the Python script process-demo.py:

import time

# delay so we will have enough time to start the program
time.sleep(30)

# do some calculations
sum = 0
for x in range(20000):
    sum = sum + x
    print(sum)

# emulate waiting for an external event
time.sleep(300)

This script will perform some calculations and write the result to stdout. Now let's combine this script with the cat program, which will read stdin and save it to a file using redirection. To make it easier to find the processes created for these commands, let's run them in the background. And to more easily see which processes were started, let's check the process list before executing the command:

ps
python3 process-demo.py | cat > process-demo.log &
ps

Two separate processes appeared here: one corresponds to our Python script, and the other to the cat program. Pipe connects these processes by redirecting the stdout stream of the first process to the stdin stream of the second process. The second process will be in a blocked state until the first one sends data. Everything sent by the first becomes input for the second process. If the first process finishes its work, the second process launched through pipe will also finish.

Let's look at another example and count the number of processes using wc:

ps | wc -l
ps

The output of the first command shows that there are 4 processes in total, although the second shows only 2. The thing is that wc -l counts every line sent to its stdin: ps includes a header row with column names, and the wc process itself, started as part of the pipeline, also shows up in the list. So to find out the real number of processes obtained this way, we need to subtract 2 from the count.

💡 Pipe is not magic — it's simply the redirection of the output stream (stdout) of one program to the input stream (stdin) of another. You need to be very careful when using pipe in bash scripts.

Conditions and loops

bash supports the if conditional operator. Its syntax looks like this:

if [ "$1" = "hello" ]
then
    echo "hello! how are you?"
fi
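
if also supports else and elif branches; the whole construct is still closed with fi. A sketch:

```shell
if [ "$1" = "hello" ]
then
    echo "hello! how are you?"
elif [ "$1" = "bye" ]
then
    echo "see you later!"
else
    echo "unknown greeting: $1"
fi
```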

The condition is an expression in square brackets. Here are some conditions you might find useful:

string="value"
# string conditions:
[ "$string" = "value" ] # true if variable is equal to some value
[ "$string" != "anothervalue" ] # true if variable is not equal to some value
[ -z "$string" ] # true if string is empty
[ -n "$string" ] # true if string is not empty

Note that all variables are in double quotes — that is how strings are compared in bash. To check if a string is empty, use the -z operator, and to check that a string is not empty, use the -n operator. Also pay attention to spaces: while when declaring a variable in bash we do not separate the variable name, the equals sign, and the value with spaces, in conditions you must separate variables, values, operators, and brackets with spaces.
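
The = and != operators above compare strings. For numbers and files there are separate operators; here are a few useful ones (all of these are true for the values shown):

```shell
a=5
# numeric conditions:
[ "$a" -eq 5 ]  # equal
[ "$a" -ne 3 ]  # not equal
[ "$a" -lt 10 ] # less than
[ "$a" -le 5 ]  # less than or equal
[ "$a" -gt 1 ]  # greater than

# file conditions:
[ -e /etc/passwd ] # true if the path exists
[ -f /etc/passwd ] # true if it is a regular file
[ -d /etc ]        # true if it is a directory
[ -x /bin/bash ]   # true if the file is executable
```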

You can also use loops in bash, for example:

#! /bin/bash

counter=1
while [ $counter -le 10 ]
do
    echo $counter
    ((counter++))
done
echo "All done"

The loop condition is declared the same way as for the if conditional operator, and the loop body is delimited by the keywords do and done, between which the commands to be executed are placed.
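
Besides while, bash also has a for loop, which iterates over a list of words; a sketch:

```shell
#! /bin/bash

for name in alpha beta gamma
do
    echo "processing $name"
done

# C-style counting loop
for ((i = 1; i <= 3; i++))
do
    echo "iteration $i"
done
```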

Now let's practice a bit and write a script that checks who has used the sudo program over the last couple of days. The sudo logs are stored in the /var/log/auth.log file — let's perform some operation with it and check the file contents:

sudo apt-get update
cat /var/log/auth.log

This log contains many different entries, and only one of them interests us — the one that contains sudo and the actual command. Let's create a new script (sudo-monitor.sh) that reads this log file and outputs all such lines:

#! /bin/bash
log_file="/var/log/auth.log"

echo "Checking the log for sudo usages"
cat $log_file | while read line
do
    sudoLogRecord=$(echo "$line" | grep "sudo" | grep "TTY")
    if [ -n "$sudoLogRecord" ]
    then
        echo "$sudoLogRecord"
    fi
done

Let's look at this script in more detail:

  • #! /bin/bash — this is the shebang.
  • log_file="/var/log/auth.log" — this is the log file name that we save in a variable.
  • echo "Checking the log for sudo usages" — this is a logging command for the script, making it easier to understand what it does when we run it.
  • cat $log_file | while read line — this command sends the file contents through the pipe to the loop; the read builtin reads one line from stdin on each iteration and saves it to the line variable, so the loop goes through all lines of the file.
  • sudoLogRecord=$(echo "$line" | grep "sudo" | grep "TTY") — inside the loop body we can find the line we need using grep. If $line does not contain sudo, the result of grep will be an empty string. Looking at the log, commands can be filtered by the presence of TTY. We save the filtering result to the sudoLogRecord variable. If this line has the words we need (sudo and TTY), this variable will contain the matching line; otherwise, the string will be empty.
  • if [ -n "$sudoLogRecord" ] — this checks whether the string is not empty (-n is true for a non-empty string, -z for an empty one, that is, a string of zero length); if so, we output this line to our script's log (echo "$sudoLogRecord").
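
Note that the loop is used here mainly to demonstrate while read; the same filtering can be done in one line, because grep can read files directly. A shorter equivalent sketch:

```shell
#! /bin/bash
log_file="/var/log/auth.log"

echo "Checking the log for sudo usages"
# keep only the lines that contain both sudo and TTY
grep "sudo" "$log_file" | grep "TTY"
```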

Now let's save and run this script:

chmod +x sudo-monitor.sh
./sudo-monitor.sh

As you can see, it output only the log lines that interested us: the messages about sudo usage by a user. Now we can also add this script to ~/.bashrc so it runs every time we log into the system:

echo "~/sudo-monitor.sh" >> ~/.bashrc

Note the use of >> here: a single > would overwrite the whole .bashrc file instead of appending to it.