Shells, Command Lines, Variables, Init files, and Pipelines
This page introduces the Unix shell. Some material applies generally to any Unix shell; other material is specific to Bash.
The following material draws on many of the references; references [3-5] are likely the most relevant.
Terminology and definitions:
- A process can communicate with related processes (i.e., parent-child) through a pipe, using the Unix I/O abstraction (read/write).
- A pipe is an example of an Interprocess Communication (IPC) method.
- A shell is a command line interpreter. It parses its input stream looking for commands it understands. When it finds one, it forks a child process, which runs the command via exec. By default the parent process waits for the child to finish. The shell does NOT wait if the command is followed by an & (the command runs in the background).
- command arg1 arg2 ... argn
- Standard I/O : A shell starts off with two open file
descriptors
- 1: opened for writing (standard out)
- 0: opened for reading (standard in)
- Shell redirect operators ( $ cmd > file; #comment)
- The |& is short for 2>&1 |, which sends standard error through the pipe along with standard output.
- $ls > myls.txt; #redirect standard out to a file
- $wc < myls.txt; #redirect standard in from a file
- $ls >>myls.txt; #redirect std out and append to a file
- $ls &>myls.txt; #redirect std out and standard error to a file
- $wc <&m; #duplicates std input from file descriptor m
- [n]>&m; #makes std out (or file desc. n) a duplicate of file desc. m
- [n]<&- ; [n]>&- ; #closes std input (or file desc. n); closes std out (or file desc. n)
- Example: ls -lRt /home 2>/dev/null 1>myls.txt; #std error goes to the bit bucket and std out to myls.txt
- Based on Ritchie/Thompson, programs executed by the shell start out with two open file descriptors, standard in and standard out (file desc. values 0 and 1). The standard error descriptor (file desc. 2) appears to have been added later.
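The redirect operators above can be tried safely with a small sketch like the following. The helper function emit and the scratch file names are hypothetical, introduced only for illustration; the sketch assumes mktemp and grep are available.

```shell
# Hypothetical helper: writes one line to standard output and one to standard error.
emit() {
  echo "to stdout"
  echo "to stderr" >&2
}

workdir=$(mktemp -d)            # scratch directory for the demo files

emit >"$workdir/out.txt" 2>"$workdir/err.txt"   # split the two streams
emit >"$workdir/both.txt" 2>&1                  # stderr follows stdout into one file
emit >>"$workdir/out.txt" 2>/dev/null           # append stdout, discard stderr

out_lines=$(grep -c '' "$workdir/out.txt")      # grep -c '' counts lines
err_lines=$(grep -c '' "$workdir/err.txt")
both_lines=$(grep -c '' "$workdir/both.txt")
echo "out=$out_lines err=$err_lines both=$both_lines"   # prints out=2 err=1 both=2
```

Order matters when combining redirects: cmd >file 2>&1 sends both streams to file, whereas cmd 2>&1 >file leaves standard error pointing at the original standard output.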
- Pipes
- Takes the standard out from one command (usually referred to
as filters) and pipes it as standard input to a second
command
- Example: ps -aux | grep jjm ;
- The shell sets up an unnamed pipe connecting ps's standard output to grep's standard input.
- Issues: buffering. When the data in a pipeline needs to stream in a timely manner, buffering can be a problem. The system might defer sending a filter's output to the next filter in the pipeline until a certain amount of data has accumulated or a certain amount of time has passed.
- Can run a filter under stdbuf and/or use awk to help. Note the use of tee...it can be helpful when debugging problems.
- ping -c 5 8.8.8.8 | stdbuf --output=0 awk '{print $1}' | tee >(wc)
ping -c 5 8.8.8.8 | stdbuf --output=0 awk '{print $1}' | tee ./tmp.out
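As a network-free variation on the ping examples, the sketch below shows tee saving a copy of the data at one point in a pipeline while the next stage still receives it. Here printf stands in for the producer, and the temporary file is an assumption made for illustration.

```shell
tmpfile=$(mktemp)               # scratch file that receives tee's copy

# printf stands in for a longer-running producer such as ping.
count=$(printf 'alpha 1\nbeta 2\ngamma 3\n' | awk '{print $1}' | tee "$tmpfile" | grep -c '')

first_saved=$(head -n 1 "$tmpfile")
echo "lines seen by the last stage: $count"     # prints 3
echo "first line saved by tee: $first_saved"    # prints alpha
```

Inspecting the file written by tee shows what a given stage produced, which helps locate the filter that is misbehaving.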
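- Command separators
- ; (sequential mode, the parent process waits for each command before starting the next)
- Example: cd ./mydir; ls
- Example: cd ./mydir;ls #the space after the ; is optional
- & (background mode, parent process does NOT wait for child process)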
- Shell as a command
- Create a file with valid shell commands, call it example1.script. It can be run several ways:
- sh < example1.script
- ./example1.script #requires execute permission (chmod +x example1.script)
- A filter is an extension of the I/O abstraction: it reads standard input and writes standard output, so the output of one command can be directed to the input of another. The shell creates the unnamed pipe.
- Some commands are not filters , e.g., $date
- Some filters can also be commands, e.g., $grep "search" *.cc
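The ways of running a script listed under "Shell as a command" can be sketched as follows. The script contents and the scratch directory are assumptions for illustration; running the file directly also assumes the scratch directory allows execution.

```shell
workdir=$(mktemp -d)
script="$workdir/example1.script"

# Create a two-line script; the quoted 'EOF' keeps the shell from expanding anything here.
cat >"$script" <<'EOF'
#!/bin/sh
echo "hello from example1"
EOF

via_stdin=$(sh <"$script")      # the shell reads its commands from standard input
via_arg=$(sh "$script")         # the shell is given the file name as an argument
chmod +x "$script"              # direct execution needs the execute permission
via_exec=$("$script")           # the #! line selects the interpreter

echo "$via_stdin"
```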
Definitions/Terms:
- A simple shell command is a single command followed by its arguments. Complex shell commands can combine any number of simple commands using pipes, lists, and groups, as defined below.
- Commands separated by a ';' are executed sequentially.
- A pipeline is a sequence of one or more commands separated by a | or a |&
- The |& is short for 2>&1 |, which sends standard error through the pipe along with standard output.
- A list is a sequence of one or more pipelines separated by one of the operators ';', '&', '&&', '||' (assume left associativity)
- $cmd1 && cmd2 && ...cmdn : Commands are executed sequentially left to right. The next cmd runs only if the previous one succeeds.
- $mkdir tmpDir && cp -r mysrc tmpDir
- $cmd1 || cmd2 || .... cmdn : Commands run left to right until the first one succeeds; the remaining commands are skipped.
- $mkdir tmpDir || echo "mkdir failed"
- Lists can be grouped with () or {} to be executed as a unit. The following runs the commands within each group sequentially but allows the groups to run concurrently
- $(a ; b) & (c ; d) & #runs the two lists concurrently
- ( list ) : this runs the commands in the list in a subshell. Variable scope limited to the ( ).
- { list; } this runs the commands in the list in the current shell context (NOT a subshell). The ';' is required!
- Example:
- The following shows that the scope of myVarA stays within its list: the second group prints an empty value. It also shows that the order of the output is hard to predict, because the two groups run concurrently and I/O is buffered internally
- (myVarA=hello; echo "first $myVarA"; ls -Rlt /etc ) | (echo "Second: $myVarA" ; more)
Second:
first hello
/etc:
total 1148
-rw-r--r-- 1 root root 89415 Feb 7 14:10 ld.so.cache
drwxr-xr-x 3 root root 4096 Feb 7 14:10 firefox..... (removed the rest of the output)
- Example
- (date ; ls ) | cat #redirects stdout from the list to cat
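The && and || operators and the difference between ( ) and { } can be checked with a short sketch. The variable names and_result, or_result, after_subshell, and after_group are hypothetical, chosen for this illustration.

```shell
workdir=$(mktemp -d)

# && runs the right-hand command only when the left-hand one succeeds.
mkdir "$workdir/tmpDir" && and_result="mkdir succeeded"

# || runs the right-hand command only when the left-hand one fails
# (the directory already exists, so this second mkdir fails).
mkdir "$workdir/tmpDir" 2>/dev/null || or_result="mkdir failed"

myVarA=outer
( myVarA=inner )            # subshell: the assignment is lost when the subshell exits
after_subshell=$myVarA
{ myVarA=inner; }           # brace group: runs in the current shell
after_group=$myVarA

echo "$and_result / $or_result / $after_subshell / $after_group"
# prints: mkdir succeeded / mkdir failed / outer / inner
```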
- Jobs : any command or pipeline running in background mode (use $jobs to view)
- Bash: A specific shell program, the Bourne Again Shell. Bash is the default interactive shell on Ubuntu.
- dash is a lighter-weight POSIX shell and a separate program from Bash. Ubuntu uses it as /bin/sh, and it suits environments where shell performance matters.
- Common Bourne Shell Variables
- HOME - example, '$cd $HOME'
- IFS - list of characters that separate fields. Used when the shell splits words as a part of expansion.
- PS1 - primary prompt string; default '\s-\v\$ '
- Bash Shell Variables
- HISTSIZE
- HOSTNAME
- LINENO
- RANDOM #each time accessed gives a new random number 0-32767
- SECONDS - number seconds the shell has been active
Special characters and words
- control operator: A token that performs a control function. It is a newline or one of the following: ‘||’, ‘&&’, ‘&’, ‘;’, ‘;;’, ‘;&’, ‘;;&’, ‘|’, ‘|&’, ‘(’, or ‘)’.
- metacharacter A character that, when unquoted, separates words. A metacharacter is a space, tab, newline, or one of the following characters: ‘|’, ‘&’, ‘;’, ‘(’, ‘)’, ‘<’, or ‘>’.
- Special words include keywords (e.g., while), special parameters (e.g., positional param like $1)
- builtin commands such as printf
- quotes: used to remove the special meaning of characters or words (e.g., keywords).
- single quote: preserves the literal value of every character. (A single quote cannot appear between single quotes, even when preceded by a backslash!!)
- double quote: preserves the literal value of all characters except $, backtick (`), and \
- Examples
- myVar=hello;
- echo myVar ----displays : myVar
- echo $myVar --- displays hello
- echo "myVar" --- displays myVar
- echo "$myVar" --- displays hello
- echo '$myVar' --- displays $myVar
- ANSI-C quoting: $'STRING' expands to STRING with backslash escape sequences replaced by their ANSI C equivalents (e.g., $'a\tb' contains a tab).
- Backtick: everything within backticks is executed by the shell before the main command and replaced by its output (as long as the backticks are not inside single quotes)
- Backslash: when unquoted it is the Bash escape character, preserving the literal value of the next character
- Examples
- myVar=hel\&lo
- echo $myVar ---> hel&lo
- A backslash at the end of a command line continues the command on the next line. A continued command can be abandoned with Ctrl-C.
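The quoting rules above can be compared side by side in a sketch like this one. The variable names other than myVar are hypothetical; the last assignment shows one common way to get a single quote into single-quoted text by closing the quote, adding an escaped quote, and reopening.

```shell
myVar=hello

unquoted=$(echo $myVar)         # expanded, then subject to word splitting and globbing
double=$(echo "$myVar world")   # $ is still special inside double quotes
single=$(echo '$myVar world')   # nothing is special inside single quotes
escaped=$(echo \$myVar)         # a backslash protects only the next character
embedded='it'\''s'              # close the quote, add an escaped ', reopen the quote

printf '%s\n' "$unquoted" "$double" "$single" "$escaped" "$embedded"
# prints, one per line: hello / hello world / $myVar world / $myVar / it's
```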
The following is a brief explanation of how the shell works (see ch 3 or ref 3).
- The shell reads its input from a file, from a string or from the user's terminal.
- Input is broken up into words and operators, obeying the quoting rules, see Chapter 3.
- These tokens are separated by metacharacters. Alias expansion is performed.
- The shell parses (analyzes and substitutes) the tokens into simple and compound commands.
- Bash performs various shell expansions, breaking the expanded tokens into lists of filenames and commands and arguments.
- Redirection is performed if necessary, redirection operators and their operands are removed from the argument list.
- Check whether the command contains slashes.
- If not, first check with the function list to see if it contains a command by the name we are looking for.
- If command is not a function, check for it in the built-in list.
- If command is neither a function nor a built-in, look for it analyzing the directories listed in PATH. Bash uses a hash table (data storage area in memory) to remember the full path names of executables so extensive PATH searches can be avoided.
- If the search is unsuccessful, bash prints an error message and returns an exit status of 127. If the search was successful or if the command contains slashes, the shell executes the command in a
separate execution environment.
- If execution fails because the file is not executable and not a directory, it is assumed to be a shell script.
- If the command was not begun asynchronously, the shell waits for the command to complete and collects its exit status.
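The lookup order in the steps above (function, then builtin, then PATH) and the 127 exit status can be observed with a small sketch. The function that shadows date and the name no_such_command_xyz are hypothetical; the sketch assumes no command with that name exists on the system.

```shell
# Hypothetical function that shadows the external date program.
date() { echo "function date"; }

first=$(date)               # functions are consulted before builtins and PATH
second=$(command date)      # 'command' skips the function lookup
unset -f date               # remove the function again

# A name that is not a function, not a builtin, and (assumed) not on PATH.
status=0
no_such_command_xyz 2>/dev/null || status=$?

echo "first=$first status=$status"   # prints first=function date status=127
```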
Variables -
- A parameter is an entity that stores values. It can be a name, a number (like $1), or a special character (like $?).
- A variable is a parameter denoted by a name. It has a value and zero or more attributes.
- Attributes are assigned using declare
- Variables can hold content that can be treated as: strings, integers, constants, arrays
- Examples
- declare -i myInteger=3 #an integer
- Arrays....two methods
- declare -a myArray
- myArray=(1 2 3) #elements are separated by whitespace, not commas
- readonly myConstant=SHOW_DEBUG
- myStringVar="This is a string" #default is a string; the quotes are needed because of the spaces
- global or local
- run the 'env' command to see all global variables
- session persistent : edit ~/.bashrc
- system wide environment variables.....lots of possible locations
- /etc/profile
- /etc/bash.bashrc
- Also: files ending in .sh in /etc/profile.d get executed when a login shell starts
- local - variables that persist just for the life of a shell session or even within a function of a script
- Shell versus user variable
- local, inheritance, export to other sessions
- local implies a shell variable
- myFile1='myfile.txt'
- echo $myFile1
- inheritance: child processes inherit all env
variables
- export myFile1='myfile.txt' - this promotes the shell
variable to an environment variable
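The difference between a shell variable and an exported environment variable can be seen by asking a child shell for the value. This is a minimal sketch; the fallback text "unset" and the names before and after are assumptions made for illustration.

```shell
myFile1='myfile.txt'                          # shell variable, not exported
before=$(sh -c 'echo "${myFile1:-unset}"')    # the child process does not see it

export myFile1                                # promote it to an environment variable
after=$(sh -c 'echo "${myFile1:-unset}"')     # now the child inherits it

echo "before export: $before, after export: $after"
# prints: before export: unset, after export: myfile.txt
```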
- Variables and quotes
- the shell treats strings enclosed in single and double quotes the same, except that inside double quotes variable expansion (and command substitution) still takes place
- neither kind of quote is subject to globbing, the expansion of filename-matching metacharacters such as * and ?
- myFile1='myfile.txt'
- echo "the file is $myFile1" #the file is myfile.txt
- echo 'the file is $myFile1' #the file is $myFile1
- Special Shell Variables
- $NAME - indicates the value of the parameter NAME
- $? - value of last return code
- $$ - process ID variable
- String variables -
- stringA=abcd
- echo ${stringA} #abcd
- echo ${#stringA} #4 the strlen
- echo ${stringA:2:4} #cd (offset 2, length up to 4; only 2 characters remain)
Shell expansion is performed after each command line has been split into tokens. More specifically, variables undergo any of the following expansions
- Brace expansion : {}
- Extended brace expansion:
- echo {a..z} #a b c .... z
- echo {0..3} #0 1 2 3
- echo sp{el,il,al}l #spell spill spall
- base64_charset=( {A..Z} {a..z} {0..9} + / = ) #inits array
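- {} or curly brackets also form an inline group (see { list; } above)
- {} : placeholder for text
- ls . | xargs -i -t cp ./{} $1 #xargs, not the shell, replaces {} with each input line; GNU xargs deprecates -i in favor of -I {}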
- Tilde expansion
- Parameter and variable expansion
- echo $1
- echo ${myString:6-10} #substring of myString
- Command substitution: use either $(command) or backticks `command`. Both are replaced by the output of the command, which is then passed to the outer command as a parameter.
- echo `date` #the date command runs and generates a line of text, which is then passed to echo
- kill $(ps aux | grep '[p]ython csp_build.py' | awk '{print $2}') #the [p] keeps grep from matching its own command line
- Arithmetic expansion - $(( expression ))
- a=2;b=3
- echo $((a + b)) #5
- echo $((--a)) #1 (a is decremented first)
- echo $((a++)) #1 (a becomes 2 afterwards)
- Word splitting -
- Filename expansion - globbing uses "*", "?" and "["
- touch myfile1.txt myfile2.txt myfile11.txt
- ls myfile?.txt # myfile1.txt myfile2.txt
- ls myfile??.txt # myfile11.txt
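Several of the expansions above can be combined in one sketch. The scratch directory and the result variable names are hypothetical; the cd runs inside the command substitution, so the current directory of the main shell is not changed.

```shell
workdir=$(mktemp -d)
touch "$workdir/myfile1.txt" "$workdir/myfile2.txt" "$workdir/myfile11.txt"

a=2; b=3
sum=$((a + b))                                   # arithmetic expansion
count=$(ls "$workdir" | grep -c '')              # command substitution
one_char=$(cd "$workdir" && echo myfile?.txt)    # ? matches exactly one character
two_char=$(cd "$workdir" && echo myfile??.txt)   # ?? matches exactly two characters
quoted=$(cd "$workdir" && echo "myfile*.txt")    # quoting suppresses filename expansion

echo "sum=$sum count=$count"    # prints sum=5 count=3
echo "$one_char"                # prints myfile1.txt myfile2.txt
echo "$two_char"                # prints myfile11.txt
echo "$quoted"                  # prints myfile*.txt
```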
Built in Commands
- alias - create shortcuts!!!
- set in ~/.bashrc
- NOTE: can use bash functions in your .bashrc for the
same purpose....and this is more versatile.
- exec : replaces the shell with a command, or redirects file descriptors for the current shell
- exec date #runs date in place of the shell, so the shell exits when date finishes
- exec < infile #input from infile rather than stdin
- exec > outfile 2>errfile
- exec n>outfile #opens the file 'outfile' for writing; n is the file descriptor
- exec n<&m #n is now a duplicate of file descriptor m
- example
- exec 3>myOutFile.txt
- exec 4>myStdErrOutFile.txt
- echo "this is going to a file " 2>&4 1>&3 #without the &, 2>4 would write to a file named 4
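The exec example can be extended into a sketch that also closes the descriptors when done. The helper function report and the scratch directory are hypothetical; the sketch assumes descriptors 3 and 4 are not already in use.

```shell
workdir=$(mktemp -d)

exec 3>"$workdir/myOutFile.txt"         # open descriptor 3 for writing
exec 4>"$workdir/myStdErrOutFile.txt"   # open descriptor 4 for writing

# Hypothetical helper that writes to both standard streams.
report() {
  echo "this is going to a file"
  echo "this is an error message" >&2
}
report 1>&3 2>&4      # stdout goes to descriptor 3, stderr to descriptor 4

exec 3>&- 4>&-        # close both descriptors

cat "$workdir/myOutFile.txt"            # prints this is going to a file
```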
- history:
- echo $HISTFILE $HISTSIZE $HISTFILESIZE
- history 5
- fc 60 #opens command #60 in an editor (nano here); when you exit, the edited command runs
- supports a cryptic syntax for finding and editing prior commands
- !fi #runs most recent command beginning with fi (for me this finds a 'find')
- printf: can be used instead of echo to print.
- printf "$0 : `date` The number of args is $#\n"
- read : accept user input.
- Example if1.sh
- echo -n "word 1: "
- read word1
- echo -n "word 2: "
- read word2
- if test "$word1" = "$word2"
- then
- echo "Match"
- fi
- echo "End of program."
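The if1.sh example waits for keyboard input, so the sketch below wraps the same logic in a function that can be fed from a pipe. The function name compare_words and the "Match" message are assumptions made for illustration.

```shell
# Hypothetical function wrapping the comparison so it can read from a pipe.
compare_words() {
  read -r word1               # first line of standard input
  read -r word2               # second line of standard input
  if test "$word1" = "$word2"
  then
    echo "Match"
  fi
  echo "End of program."
}

same=$(printf 'apple\napple\n' | compare_words)
different=$(printf 'apple\npear\n' | compare_words)

echo "$same"          # prints Match, then End of program.
echo "$different"     # prints End of program.
```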
- source : runs the commands from a file in the current shell process. Unlike exec, source does not replace the current shell, so the shell does not exit.
- source ~/.bashrc #run init
- the '.' is a synonym
- test : checks file types and compares values.
- Examples: these are equivalent here; [ ] is another name for test, and [[ ]] is a newer Bash conditional form
- if test $# -eq 0
- if [ $# -eq 0 ]
- if [[ $# -eq 0 ]]
- trap : catches signals
- #!/bin/bash
- trap 'echo PROGRAM INTERRUPTED; exit 1' INT
- while true
- do
- echo "Program running."
- sleep 1
- done
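The trap example needs someone to press Ctrl-C. The sketch below exercises the same handler pattern without a keyboard: a child shell installs the trap and then signals itself. It uses TERM in place of INT as an assumption, since INT may be ignored in some non-interactive environments.

```shell
# $$ sits inside single quotes, so it expands in the child and names the child's own PID.
status=0
output=$(sh -c 'trap "echo PROGRAM INTERRUPTED; exit 1" TERM; kill -TERM $$; sleep 5; echo "not reached"') || status=$?

echo "output=$output status=$status"   # prints output=PROGRAM INTERRUPTED status=1
```

The handler runs as soon as the kill command returns, so the sleep and the final echo never execute.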
- type : provides info about a command
- Example: $type ll gcc exec
- ll is aliased to `ls -alF'
- gcc is hashed (/usr/bin/gcc)
- exec is a shell builtin
- Built in commands that are symbols
- the dot '.' - synonym for source - run a command in current process
- () #subshell
- $() #cmd substitution
- (()) #arithmetic eval (synonym for let)
- [] #test
- [[]] #conditional expression, similar to [] but adds pattern and regex matching
Patterns
There are three kinds of pattern matching in Bash: globs, extended globs, and regular expressions. Bash itself supports regex only in the [[ string =~ regex ]] conditional; most regex use is through utilities, filters, and commands (e.g., sed, awk, grep).
Glob Patterns (refer to this chapter in the Advanced Bash Scripting Guide)
Globs are composed of normal characters and metacharacters. Metacharacters are characters that have a special meaning. These are the metacharacters that can be used in globs:
*: Matches any string, including the null string.
?: Matches any single character.
[...]: Matches any one of the enclosed characters.
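Glob metacharacters also work outside filename expansion, for example in a case statement, which gives a way to try patterns without creating files. The helper function glob_match below is hypothetical, written only for this illustration.

```shell
# Hypothetical helper: succeeds when string $1 matches glob pattern $2.
glob_match() {
  case $1 in
    $2) return 0 ;;     # unquoted, so $2 is treated as a pattern
    *)  return 1 ;;
  esac
}

glob_match myfile1.txt  'myfile?.txt' && echo "one character after myfile: match"
glob_match myfile11.txt 'myfile?.txt' || echo "two characters after myfile: no match"
glob_match notes.txt    '*.txt'       && echo "any prefix: match"
glob_match d.c          '[abc].c'     || echo "d is not in the set: no match"
```

All four echo commands run, because the first and third tests succeed and the second and fourth fail.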
Extended Globs
Bash also supports a feature called Extended Globs. They are similar to regular expressions. Extended globs are disabled by default (enable them with shopt -s extglob) and we will try to avoid them in our scripts.
Regular Expressions
Regular expressions (regex) are similar to Glob Patterns, but they can only be used for pattern matching, not for filename matching. We will pay particular attention to the use of regex in the following situations:
- Grep command
- Substitution in sed (also applies in vi)
- Possibly with awk, and diff
- Other situations might be added....
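A brief sketch of regex use in grep, sed, and awk follows. The sample file names held in the names variable are invented for illustration. Note that in a regex the dot matches any character unless it is escaped, unlike in a glob.

```shell
names='myfile1.txt
myfile11.txt
myfileA.txt
notes.txt'

# grep -E: lines that are "myfile", one or more digits, then ".txt".
numbered=$(printf '%s\n' "$names" | grep -E '^myfile[0-9]+\.txt$')

# sed substitution: rewrite the .txt suffix; \. matches a literal dot.
renamed=$(printf '%s\n' "$names" | sed 's/\.txt$/.bak/')

# awk: print only the lines that match a regex.
starts_with_n=$(printf '%s\n' "$names" | awk '/^n/ {print}')

echo "$numbered"          # prints myfile1.txt and myfile11.txt
echo "$starts_with_n"     # prints notes.txt
```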
Refer to this page for more details on regular expressions.
Last update: 4/29/2018