One of the things professors of computer science like to believe is that it doesn't matter which programming language you're going to use, that the important thing is to learn to program.

This writeup is a comparison of bash to Python, in the framework that Jongleur set up in his Learn to Program writeup. I don't actually think you should learn to program bash as your first language. It is, however, a very useful language to know if you spend a lot of time logged into a terminal, and it's also nice to be exposed to different languages while you're learning to program. One of the key things about programming in real life is that you often don't get to choose which language you'll be working in. So it's best to learn early on how to learn a new language: what sorts of things to look for as differences and similarities, and how to work around the limitations of a language.

It's possible to write real programs in bash, and if you can manage to do that, you can probably learn any other language too. bash has one huge advantage over Python: it's already installed on nearly every Linux machine, and most UNIX machines as well. These days, it's even installed on Macs, although the majority of users like to remain blissfully unaware of the powerful command line interface that sits just a click away... (If you use Windows, you'll have to install it, but nobody actually writes scripts for Windows, right?)

It's worth learning to program in a nice programming language first. But it's also worth learning that you can do everything in that language in another, more clunky language. And who knows? Someday you might need to write a huge program in a clunky language, so it'll be nice to be able to think in terms of nice programming principles while you're doing it.

I'm not going to analyze any of the programs in depth. They're invariably translations of programs from Jongleur's writeups. Each heading below is a link to Jongleur's writeups on the same topic.

Producing Output

The instructions for writing a program in bash are amazingly similar to those for Python. One of the primary reasons for this is that both Python and bash are interpreted languages. The first step is to put your program in a file.

So, let's call the first program first.sh. Type or paste the following:

echo 'My name is Grae'
echo
echo 'Oh my god.' 'This actually worked!'
In order to run this program, you'll go to the directory you saved it in, and type bash first.sh.

If you'd like to do things with escape characters (like \n), you'll have to tell echo that your string contains these characters, by passing it a -e. You can still use the same escape characters available in Python (or C, C++, Java...). An example:

To print:

Hello!
This is on another line.
You could run a single echo command:
echo -e 'Hello!\nThis is on another line.'
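If remembering the -e flag is a nuisance, the printf command is another way to get the same result. This is only a small sketch of the idea: printf interprets escape characters in its first argument without any flag, but unlike echo it does not add a newline of its own.

```shell
# printf interprets \n in its format string; no -e flag is needed.
printf 'Hello!\nThis is on another line.\n'

# %s is replaced by the arguments that follow the format string.
printf '%s\n%s\n' 'Hello!' 'This is on another line.'
```

Both commands print the same two lines as the echo -e version.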

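In bash, it's also possible to use double-quotes to encode strings. However, they'll act a little bit differently. Inside double quotes, bash still treats a few characters specially (such as $, the backquote, and \), so you'll need to escape them with a backslash to get them to show up, because otherwise bash thinks you're trying to do something else. The ! character is a special case: it's the history expansion character, but only when you're typing at an interactive prompt. Inside a script, history expansion is turned off, so the program above can be written with double quotes unchanged:

echo "My name is Grae"
echo
echo "Oh my god." "This actually worked!"

At an interactive prompt, keep the ! inside single quotes instead; a backslash placed before it inside double quotes would be printed along with it.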

Math

In bash, expressions are assumed to be strings unless otherwise indicated. What I mean by this is that if you put 3 somewhere in your bash program, bash will treat it like a string, not a number.

In order to get it to act like a number, you need to surround it by $[ and ] characters. Inside these brackets, you're allowed to do any sort of math that bash recognizes. Example:

echo 3
echo $[3+3]
echo $[3*3]
echo $[3/3]
echo $[3-3]
echo $[((3+3)*(3*3)/(3/3))-(3-3)+3]
echo $[3+3*3*3/3/3-3-3+3]
echo $[3**3]
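The $[ and ] brackets are an older spelling that newer bash documentation describes as deprecated; the standard form is $(( and )). The sketch below repeats a few of the calculations in that form, and adds two lines (the division by 2 and the % operator) that are not in the list above, to show that division throws away the remainder.

```shell
# $(( ... )) accepts the same expressions as $[ ... ].
echo $((3+3))     # prints 6
echo $((3*3))     # prints 9
echo $((7/2))     # prints 3: integer division discards the remainder
echo $((7%2))     # prints 1: the remainder itself
echo $((3**3))    # prints 27: ** is exponentiation, a bash extension
```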

Unfortunately, bash doesn't understand floating point math. So if you try to do something like echo $[3.0+2.0], you'll get an error message:

bash: 3.0+2.0: syntax error in expression (error token is ".0+2.0")

There are various tricks you can use to get a decimal point into integer math, but they're sort of annoying to use, and they're a bit of a hack anyways. The answer that most bash programmers use is to just call a program that knows how to do floating point math.

The simplest way to do this is to use bc.

echo "(98.6-32)*5/9" | bc -l
By using another program, you get to avoid reinventing the wheel. This is an important technique in bash programming: there are a lot of programs out there that already do the things you might want to do.

This usually isn't a major limitation in programs. You'll want to be careful of using floating point numbers to store things that you actually care about getting accurate numbers for, due to round-off error. If you ever want to write something that's going to keep track of money (or anything else that needs to be accurate), look into using fixed point arithmetic. Oh, and you probably don't want to be using bash for it either.
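To give a rough idea of what fixed point arithmetic looks like, here is a sketch that redoes the temperature calculation using only integers, by counting in tenths of a degree. The variable names are made up for this illustration (variables are covered in the next section), and $(( )) is the standard spelling of the $[ ] arithmetic used above.

```shell
# Fixed-point sketch: store temperatures as whole tenths of a degree.
fahrenheit_tenths=986                                   # 98.6 degrees F
celsius_tenths=$(( (fahrenheit_tenths - 320) * 5 / 9 )) # 32.0 F is 320 tenths

# Split the integer back into whole degrees and one decimal digit.
echo "$((celsius_tenths / 10)).$((celsius_tenths % 10)) degrees C"   # prints 37.0 degrees C
```

This simple version only formats non-negative results correctly; a negative number of tenths would need extra handling.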

Variables and Function Calls

bash also has both variables and functions. It's rather difficult to program without these. I think the best way to think of both of these concepts is as a name for some other thing. Variables encapsulate a piece of data; functions encapsulate a way to do something. This distinction is a little artificial, and most languages have some way to treat functions as just another form of data.

Variables

Variables in bash work much the way they do in Python, with a couple of differences:

  • you can't put spaces next to the equal sign of an assignment operation. (And people complain that Python is dependent on whitespace...)
  • when you reference a variable after it has been set, you prefix it with a $ to let bash know it's not a string.
An example:
majorstring="I am the very model of a modern major-general"
echo $majorstring
echo $majorstring
echo $majorstring
echo $majorstring
echo $majorstring

The temperature conversion program would look like:

celsius=22
echo "The temperature is $celsius degrees C, or $[$celsius*9/5+32] degrees F"
(Unlike Python (but like Perl), it's typical to embed variable substitutions in the middle of strings in bash programming.)
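Substitution only happens inside double quotes, which ties back to the difference between the two kinds of quotes. A small sketch, reusing the celsius variable from the example above:

```shell
celsius=22
echo "Temperature: $celsius C"     # double quotes: prints Temperature: 22 C
echo 'Temperature: $celsius C'     # single quotes: prints the text literally
echo "Temperature: ${celsius}C"    # braces mark where the variable name ends
```

Without the braces in the last line, bash would look for a variable named celsiusC, which doesn't exist, and substitute an empty string.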

The next example: changing a variable:

row=1
echo "$[row*0] $[row*1] $[row*2] $[row*3] $[row*4] $[row*5] $[row*6]"
row=$[$row+1]
echo "$[row*0] $[row*1] $[row*2] $[row*3] $[row*4] $[row*5] $[row*6]"

Functions

In bash, calling a function and running a program look exactly the same. When writing a bash program, it's useful to think of every other program that's installed on your computer as just another function.

In order to read input in bash, just use read name. This would set the variable $name to whatever the user types. Example:

echo "What is your name? "
read name
echo "What is your age? "
read age

year=$(date +"%Y")
echo "Your name is $name"
echo "You are $age years old"
echo "You were born in the year $[$year-$age]"

I snuck in another function call that didn't exist in the Python version of this tutorial. In the process, I show how to grab the output of a function (just surround it by $(...)). One reason I do this is because I'm lazy; I don't want to have to change this write-up whenever the year changes. And an important thing to remember when programming: a good programmer is a lazy programmer. At any rate, the date command tells you the date, and the string after the + tells the format you want. "%Y" turns into the year.
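The $(...) trick works with any command, including a whole pipeline. Here is a short sketch; the words variable and the sample sentence are just for illustration.

```shell
# Capture the output of a single program in a variable.
year=$(date +"%Y")

# Pipelines work inside $(...) too; tr strips the padding some versions of wc add.
words=$(echo "a good programmer is a lazy programmer" | wc -w | tr -d ' ')

echo "It is $year, and that sentence has $words words."
```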

The temperature conversion program would look like:

echo "Enter degrees Celsius"
read celsius
fahrenheit=$[(9*$celsius)/5 + 32]
echo "$celsius degrees C is $fahrenheit degrees F"


(The astute reader might notice that I haven't actually introduced any functions yet... read is actually a special statement in bash, and date is a separate program. Still, the syntax you use to call them is the same, so I'm going to ignore that for now. bash doesn't really have any built-in functions (they're all special statements or add-on programs), so until you write your own functions, you can't really call one.)

Types and Objects

Types are an important concept in programming. The comparison between Python and bash will illustrate some of the major differences between different programming languages.

The type system of bash is simple. Everything's a string. It turns out you can interpret strings as lists or numbers, but deep down, everything is considered a string. In some ways, this is a bad thing: if we have to convert from a string to a number and back again every time we do some math, it's going to take a long time to do simple arithmetic. This is one of the major reasons you'd never want to do molecular modeling in a bash script: it doesn't make that sort of thing efficient.

It's not entirely true that the type system of bash is this simple. As of version 2.0, bash supports arrays. However, I think it's a bit more useful to learn a dialect of bash that will work on more computers, so I'm going to ignore their existence.

Another difference between bash and Python: functions aren't a data type in bash. Because bash is an interpreted language, though, you can get a lot of the benefits of having functions as data types by writing programs which generate their own code dynamically. At this point, it's unlikely that you'll want to do anything like that.

bash supports the concept of code modules. In order to import a module in bash, you say source module. Alternatively, you can use . as an abbreviation for source (and most programmers do).

If/Then

Programming languages are pretty useless if you can't do something different based on different values of a variable. The basic format of a bash if statement follows:

if condition_to_test; then
  actions_to_run_if_condition_passed
else
  actions_to_run_if_condition_failed
fi
The condition_to_test is just a call to a command or function. Every function or program returns a code telling whether it succeeded or failed (this is stored in the special variable $?).
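To see that any command can serve as the condition, here is a sketch that uses grep instead of a comparison. The -q flag tells grep to print nothing and only report, through its return code, whether it found a match; the sample text is made up.

```shell
# A successful command (return code 0) takes the "then" branch.
if echo "hello world" | grep -q "world"; then
  echo "match: return code was $?"
fi

# A failing command (non-zero return code) takes the "else" branch.
if echo "hello world" | grep -q "mars"; then
  echo "match"
else
  echo "no match: return code was $?"
fi
```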

An example:

echo 'What is your name?'

read name
if [ "$name" = 'grae' ]; then
  echo 'you wrote this program'
else
  echo "Your name is $name"
fi
You can even make a pretty dumb game:
number=7
echo 'What number am I thinking of?'
read guess
if [ "$guess" = "$number" ]; then
  echo 'You got it!'
else
  echo "Sorry, I was thinking of $number"
fi

So up above, I mentioned that the condition of the if statement was a command. The most common command to use for this is the test command, and [ is an abbreviation for test. The [ command takes all sorts of options for doing comparisons. It's generally considered good practice to quote strings that you use inside comparisons, to protect against bad data that can get put into the string. In the above example, if the user simply hit enter, guess would be an empty string; without the quotes, [ would barf when it was used to run [  = 7 ].
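The effect of the quotes can be seen without typing anything, by setting guess to an empty string directly. This is a sketch of the failure, with the error message from [ hidden by sending it to /dev/null:

```shell
guess=""     # what read stores if the user just presses enter

# Unquoted, the empty variable vanishes and [ is handed a malformed test.
if [ $guess = 7 ] 2>/dev/null; then
  echo "unquoted: equal"
else
  echo "unquoted: [ could not make sense of the test"
fi

# Quoted, [ receives an empty string as its first argument and compares normally.
if [ "$guess" = 7 ]; then
  echo "quoted: equal"
else
  echo "quoted: not equal"
fi
```

Both versions end up in the else branch, but only the quoted one gets there because the comparison was actually carried out.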

One interesting note: if you use =, you're doing a string comparison. If you want to compare if two numbers are equal, use -eq. (The numeric comparisons consist of two-letter abbreviations). An example:

if [ \( \( 1 -lt 2 \) -a \( 4 -lt 3 \) \) -o ! \( 5 -lt 4 \) ]; then
  echo it was true
else
  echo it was false
fi
As you'll see from the above example, using bash for numeric comparisons is a bit annoying. Parentheses are special characters, and must be escaped with backslashes. Also, you need to know that ! means "not", -a means "and", and -o means "or". These last two are only valid inside a [; outside of a test, you can use && and ||. An equivalent version of the previous program:
if ( [ 1 -lt 2 ] && [ 4 -lt 3 ] ) || ! [ 5 -lt 4 ]; then
  echo it was true
else
  echo it was false
fi
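The difference between = and -eq mentioned earlier is easiest to see with two strings that differ as text but stand for the same number. A small sketch, with made-up variable names:

```shell
a="07"
b="7"

if [ "$a" = "$b" ]; then
  echo "same string"
else
  echo "different strings"     # this branch runs: "07" and "7" differ as text
fi

if [ "$a" -eq "$b" ]; then
  echo "same number"           # this branch runs: both are the number 7
else
  echo "different numbers"
fi
```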


In some environments, [ is a built-in command, and in others it's actually a program that gets called. This is true of a few of the "standard" bash functions.
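If you're curious which kind of command you're dealing with, bash can tell you. The sketch below uses the type command with its -t flag, which prints a single word describing how bash will treat a name.

```shell
type -t [        # prints builtin in bash
type -t read     # prints builtin
type -t if       # prints keyword
```

For a stand-alone program such as date, the same command prints file instead.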

Loops

Like comparisons, loops are essential to good programming. They're the way to run a statement multiple times without writing the same thing over and over.

for loops

The most common sort of loop is the for loop. The basic syntax of the for loop follows:

for name in list; do
  instructions
done
This executes instructions once for each item in list, setting name to each item in turn. In bash, a list is just a string separated by whitespace. Example:
words="what is going on?"
for i in $words; do
  echo $i
done
This program will echo each word in $words on a separate line.
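Whether bash splits the string into a list depends on quoting. As a sketch of the difference, the same loop is run below with and without double quotes around $words:

```shell
words="what is going on?"

# Unquoted: bash splits the string on whitespace, giving four items.
for i in $words; do
  echo "unquoted item: $i"
done

# Quoted: the whole string stays together as a single item.
for i in "$words"; do
  echo "quoted item: $i"
done
```

The first loop prints four lines and the second prints one.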

A fairly common thing to do is loop a certain number of times. The easiest way to do this is to loop over a list with the right number of elements. bash doesn't have any nice built-in functions for constructing lists, but once again, a stand-alone program comes to the rescue. The seq program, given two numbers, constructs a list counting from the first number to the second number. So if I wanted to print my name 12 times, I could write:

for i in $(seq 1 12); do
  echo "Grae"
done
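Reasonably recent versions of bash can also build such a list without calling seq, using a brace expansion. This is an alternative sketch, not part of the original comparison with Python:

```shell
# {1..12} expands to the words 1 2 3 ... 12 before the loop starts.
for i in {1..12}; do
  echo "Grae ($i)"
done
```

The two numbers have to be written out literally; a variable inside the braces is not expanded in time, so seq is still the tool to use when the count is stored in a variable.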

while loops

While loops are the most general loops in programming, and can be used to emulate each of the other sorts of loops that exist. It's often a bit annoying to do this, though, so you probably want to use whichever specialized loop is most appropriate for your application.

In a while loop, the body of the loop is executed as long as the specified condition is true. An example:

answer=7
echo "What number am I thinking of?"
read guess
while [ "$guess" != "$answer" ]; do
  if [ "$guess" -lt "$answer" ]; then
    echo "Nope, too low"
  elif [ "$guess" -gt "$answer" ]; then
    echo "Nope, too high"
  fi
  echo "What number am I thinking of?"
  read guess
done
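A while loop doesn't have to wait for input. As a sketch of a loop that counts on its own, the following adds up the numbers from 1 to 10; the variable names are just for illustration.

```shell
i=1
total=0
while [ "$i" -le 10 ]; do
  total=$((total + i))   # $(( )) is the standard spelling of $[ ]
  i=$((i + 1))           # without this line the loop would never end
done
echo "The total is $total"   # prints: The total is 55
```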

Functions

One of the most important things to figure out when programming is how to write the code so that it's not overly repetitious. There are various reasons that this is a good thing. One of the best is that if a piece of your program turns out to be buggy, you only have to fix it in one place, rather than in every copy.

The typical way of accomplishing this modularity is by writing functions. So, how could we write a function that tests if a string is two words long, has a first word that starts with 'a', and a second word that starts with 'b'?

function conformsToSpec {
  if [ -n "$3" ]; then
    return 1
  fi
  [ "${1#a}" != "$1" ] && [ "${2#b}" != "$2" ]
}

This snippet of code is full of new concepts. The first thing to notice is that a function stops running when it hits a return statement. The number that gets passed to the return statement is the return value of the function, and that's what we can use in an if statement. 0 means true, and any other number means false (this is completely backward from most other languages).

The first line says that we're defining a function, and it's called conformsToSpec. The curly braces surround the body of the function.

The arguments to the function are referred to as $1, $2, $3, etc. So the first test (the next three lines of the function) is doing something with the third argument. Above, I didn't mention the -n flag to the [ command; it tests to see if a string is a non-empty string. So we're testing to see if there's a third word, and if so, we return a non-zero value.

There's a second way a function can return a value. If no other return value is specified, a function returns the same as the last command it called. So in this case, we're returning the results of the last test.
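Here is a smaller sketch of a function that relies entirely on this second way of returning a value. The function isYes is a made-up example: it has no return statement at all, so its result is whatever the last test produced.

```shell
# Succeeds (returns 0) when its first argument is "y" or "yes".
function isYes {
  [ "$1" = "y" ] || [ "$1" = "yes" ]
}

for answer in y yes no; do
  if isYes "$answer"; then
    echo "$answer counts as a yes"
  else
    echo "$answer does not"
  fi
done
```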

bash isn't very good at string processing, but there are a couple of things you can do without resorting to calling other programs. One of them is to cut off a given prefix of a string. This is done by saying ${variablename#prefix}. If $variablename starts with prefix, this expression turns into the variable with the prefix cut off. So, for example, if $a is "foo", ${a#f} is "oo".

In our example above, ${1#a} is the same as $1 only if $1 doesn't start with the letter 'a'. Likewise, ${2#b} is the same as $2 only if $2 doesn't start with the letter 'b'. So if both strings compare differently, we know that the first word starts with 'a' and the second starts with 'b'.
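A few more cases may make the behavior clearer. The sketch below uses a made-up filename, and also shows the companion form with %, which cuts off a suffix instead of a prefix.

```shell
file="report.txt"
echo "${file#report}"   # prints .txt        (the prefix is removed)
echo "${file#xyz}"      # prints report.txt  (no such prefix, so nothing changes)
echo "${file%.txt}"     # prints report      (% removes a suffix instead)
```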

So now that we have a function, how do we go about using it? We could do something like the following:

for str in "a bouncer" "Apple bat" "apple" "a b c d" "ab" "booger alpha"; do
  if conformsToSpec $str; then
    echo "$str is a conformant string."
  else
    echo "$str doesn't conform."
  fi
done

This particular example wasn't all that useful, so let's write a function that does something a bit more interesting, converting temperatures from Fahrenheit to Celsius.

function convertTemperature {
  celsius=$[($1-32)*5/9]
  echo $celsius
}
This function is a little bit different than the last one, in that we're not returning the interesting bit in the return value (which will always be 0, because echo always returns 0, and it's the last thing we call in our function.) Instead, we're returning it by printing it out, and we can use $(...) to capture it in a variable we specify later. For example, if we wanted to convert the boiling point of water, we could say boilingPoint=$(convertTemperature 212).
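Put together, a complete script might look like the sketch below. It makes one change that is an addition of mine rather than part of the function above: the local keyword, which keeps the variable from leaking out of the function into the rest of the script.

```shell
function convertTemperature {
  # $(( )) is the standard spelling of the $[ ] arithmetic
  local celsius=$(( ($1 - 32) * 5 / 9 ))
  echo "$celsius"
}

# Capture what the function prints, just as with any other command.
boilingPoint=$(convertTemperature 212)
freezingPoint=$(convertTemperature 32)
echo "Water boils at $boilingPoint C and freezes at $freezingPoint C"
```

This prints "Water boils at 100 C and freezes at 0 C". Because the arithmetic is integer-only, inputs that don't convert to a whole number are rounded toward zero.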

Libraries and Modules

It's hard to write an interesting program that's also short. And once programs get to a certain length, you're probably going to want to separate bits of your code out into other files.

Unlike most other languages, bash doesn't have a concept of separate namespaces. So once you include a file that has all your functions in it by saying . filename, you can call any function that was defined in that file. (Incidentally, it will also run any code that wasn't inside a function definition, so you'll have to be a little bit careful about what sorts of files you include in this way.)
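As a sketch of how this works end to end, the following writes a tiny library to a temporary file and then includes it. The function name greet and the use of mktemp to pick a filename are choices made for this illustration; in practice the library would be a file you wrote by hand.

```shell
# Create a throwaway file to act as the library.
lib=$(mktemp)
cat > "$lib" <<'EOF'
function greet {
  echo "Hello, $1"
}
EOF

. "$lib"          # load the definitions into the current shell
greet "world"     # prints: Hello, world
rm -f "$lib"      # the function stays defined after the file is gone
```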

Conclusion: Why would I want to learn bash?

Like I mentioned up in my introduction, if you spend all your time in front of a terminal window, it's a great language to know. Essentially what you're doing whenever you're running in a terminal window is writing a script that bash is running.

I'll give a short example from my work. I happen to work at a popular search engine, and sometimes it's handy to come up with lists of the top 100 queries in each individual language. So, if I happen to have a directory that has a whole bunch of files in it that contain lists of queries, and I want to count up all the queries and tabulate them into top 100 lists, I could write the following:

cd query_lists
for file in *; do
  sort "$file" | uniq -c | sort -rn | head -100 > "$file.top100"
done
Since I've written so much bash, I don't even have to think to write that. But it saves me a lot of time over typing:
sort english | uniq -c | sort -rn | head -100 > english.top100
sort japanese | uniq -c | sort -rn | head -100 > japanese.top100
sort german | uniq -c | sort -rn | head -100 > german.top100
sort esperanto | uniq -c | sort -rn | head -100 > esperanto.top100
sort klingon | uniq -c | sort -rn | head -100 > klingon.top100
sort french | uniq -c | sort -rn | head -100 > french.top100
sort portuguese | uniq -c | sort -rn | head -100 > portuguese.top100
sort russian | uniq -c | sort -rn | head -100 > russian.top100
sort dutch | uniq -c | sort -rn | head -100 > dutch.top100
Not only that, but I don't accidentally forget some important language (oops... I left out Chinese...)

Another reason bash is a good language to learn is that it's a good glue language. It's easy to do things like I did up above and pipe the output of one program into another. So you can easily do something like sort a file, count unique occurrences that are next to each other, sort that list in reverse numerical order, take just the first 100 lines, and put the results in a new file. Try doing that in Python... It's not hard, but it takes a lot more code (and therefore more opportunities for bugs).
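If you'd like to try the pipeline without a directory full of query files, here is a sketch that feeds it a handful of made-up queries instead:

```shell
# Six sample "queries", one per line, piped through the same chain of programs.
printf '%s\n' cats dogs cats birds cats dogs |
  sort | uniq -c | sort -rn | head -n 2
```

The output lists cats with a count of 3, followed by dogs with a count of 2; the exact spacing in front of the counts depends on your version of uniq.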

So have a good time with your new bash skills...


acknowledgments: I'd like to thank Jongleur, fuzzy and blue, Delta-sys, Simpleton, and ariels for their suggestions and feedback.
