"I should like to eliminate once and for all the questions regarding the foundations of mathematics, in the form in which they are now posed, by turning every mathematical proposition into a formula that can be concretely exhibited and strictly derived, thus recasting mathematical definitions and inferences in such a way that they are unshakable and yet provide an adequate picture of the whole science." - David Hilbert

# Description

A Turing Machine (TM) is an imaginary machine, capable of performing computations.
It is generally used as a theoretical tool to prove points about computational problems.
More on that later. One can think of the Turing Machine as a computer.
It is neither a very fast nor a very efficient computer, but, because of its simplicity, it is a practical tool for the analysis of more complex, real life computers and computing problems.
Many real-life implementations of Turing Machines do exist an can be found on-line as Java applets.

A general description of the architecture of a TM is as follows: the TM is defined as a machine with an infinitely long tape and a read/write head.
The tape is divided into squares or regions of equal size, that can contain exactly one symbol (possibly blank).
The choice of symbols is up to whomever defines the TM.
For example, one might use the Latin alphabet or perhaps the numbers 0 and 1.
The read/write head is always on top of one of the squares.
It can move at most one square to the left or to the right.
Aside from these physical properties, the machine always has an internal 'state', which usually has a number (q0, q1, q2, etc..).
At any time, the TM is in one specific state.
The use of this architecture and the state concept will become apparent later on.
For now, it is important that there is one 'halt' state that will stop the machine.

The machine is only capable of performing one elementary operation on the tape.
This operation consists of four steps:

- Read the symbol on the tape under the head.
- Based on the symbol read and the current state, overwrite the square with a new symbol (possibly the same symbol).
- Based on the symbol read and the current state, go to a new state (possibly the same state).
- Based on the symbol read and the current state, go one square left or right.
The 'current' state here being the state at the start of the operation, not the 'new' state.

Later on, we will see how it is possible to compute things with the Turing Machine.
For now, it is of the utmost importance to understand that the TM is not complete without a set of

transition rules.
If you have read the above description of the elementary operation carefully, you will have noticed that the new symbol, the new state and the new position are all dependent on the current state and the current symbol.
This means that the behaviour of the TM is governed by a set of rules.
These rules define a new state, direction and symbol for all possible current state/symbol combinations.
If we call the state-space Q, and the alphabet used Σ, these rules are in essence

functions from (Q,Σ) to (Q,Σ,D), where D is a direction: <- or ->).
As a final note, input into a TM is performed by pre-writing the tape with some symbols, before starting the machine. The output is what is left on the tape when halt is reached.

Intuitively, the list of transition functions is the 'program' and the Turing Machine is the computer.
For historical reasons, which we will see later on, the rules are however considered part of the machine itself.
Not a separate program.
So, infinitesimally many Turing Machines exist and the tape and read/write head alone are not a complete Turing Machine.
There is another good reason for not calling the rule-set a program.
We will also see that later on.

# Historical background

In the early twentieth century, mathematicians had an assumption about the power of mathematics that is still persistent amongst non-mathematicians today.
Mathematicians basically assumed that, given unlimited time and mathematical skills, it would be possible to solve any problem.
In other words, theorists assumed that if an assertion was true, then it would always be possible to prove that it was true.
To give an impression of how deeply engraved this belief was, the famous mathematician David Hilbert once said:

"If mathematical thinking is defective, where are we to find truth and certitude?"

This persistent idea eventually gave rise to two very important, yet simple questions:

- Is there a sequence of very simple, formal steps (i.e. a 'method') to prove any true assertion in finite time?
- Is every true assertion indeed provable?

These questions are a slight restatement of two questions formulated by Hilbert. They are known as the Hilbert Programme. Since it did not even occur to Hilbert at that time that either of these questions might be unanswerable, he thought of them as the last two formal bumps between mankind and a perfect mathematical system. Too bad..

The first of these questions, known as the Entscheidungsproblem (Decision Problem), is interesting, because it would allow us to build a machine that can systematically check all assertions and return a 'true' or a 'false'.
In essence, any mathematical theorem could be put into this machine and it could then be proven.
Fermat's Last Theorem for example.
It is obvious that such a method would have propelled us into an everlasting Golden Age of mathematics.
The second of these questions was considered by Hilbert and his peers to be an annoying little formality.
Everybody knew this had to be true, but a nice proof had just been evasive until then.

In 1931, Kurt Gödel struck a terrible blow to the mathematical community: he found a proof alright.
He proved that Hilbert's assumptions were wrong!
Gödel first proved that there are truths in mathematics (Theorems) that cannot be proved.
More horrifying was the fact that he proved that this problem could not be solved, because it was not a flaw of our mathematical system.
It was a fundamental flaw of any imaginable mathematical system.
Gödel showed that a mathematical system, any system, is either incorrect or incomplete.
This means that, if you create a mathematical system, there are either truths that cannot be proved within that system or there are things that are not true that can, incorrectly, be proved.
Gödel's Theorem deserves a node in itself.
As an overly bold simplification, it has to do with the fact that one can or cannot prove that "this statement is false".

Before Gödel delivered his striking blow, however, mathematicians looked at the first of these two questions.
This question can be rephrased as: "Is there a 'method' to prove things, that will always finish?"
For the word method, one can informally substitute 'recipe' or 'algorithm'.
In order to find such a method, it was necessary to define the concept of a method or an algorithm in a mathematical sense.
Mathematicians needed a mathematical model of the act of systematic problem solving.
A difficult philosophical question indeed: how do you make a model of a mathematician?

# The model of the Mathematician

The young and talented British mathematician Alan Turing had a thorough understanding of both the abstract and the real world.
His solution to the question posed above was not the only one ever devised, but is is generally considered to be the most elegant one.
It still has many useful applications in theoretical computer science today.

Turing literally made a model out of a working mathematician, sitting behind a desk, contemplating, writing down things on a piece of paper, looking up facts in notes, correcting earlier notes, etc.
Turing stipulated that the first of Hilbert's questions actually was:
"Can a person with an infinite amount of paper, erasers and time, prove any theorem?"
He then systematically removed all irrelevant details from this picture.
First, he figured that the fact that a piece of paper may contain more than one symbol is irrelevant.
If necessary, one could use a great amount of separate sheets of paper for each symbol.

Then, Turing figured that a mathematician drew conclusions from the symbols he or she had jotted down, in combination with some idea in his or her head (the 'state') and, on the basis of his/her mental state and the symbols on the paper containing earlier results, the mathematician could write down new results or alter old ones.
The mathematician should be able to go backwards or forwards in to his/her notes if necessary, and consult or alter them, based on the mathematician's state of mind.

To form a more concrete picture of this, consider a mathematician, yourself for instance, adding up two large numbers on paper.
Usually, you would start on the right side of both the numbers.
You first read the symbol on the right of one number an memorise that (you enter a state describing that number) and move on to read the rightmost digit of the other number.
Then, based on the digit you are remembering and the digit you are currently looking at (current state and symbol), you draw a conclusion as to what digit should be written under the dotted line.
If the numbers add up to something higher than nine, you should 'remember' to add one when you add the next two digits.
This means that you should be in the 'remember one' state before you proceed to add the rest of the digits.
Your mode of operation for the remaining digits will have to be different when you are in this 'remember one' state.

At this point, it should be obvious that the tape machine described in the first section is exactly the same as the mathematician sitting behind a desk. The 'current' square is the desk, the tape is the infinite supply of paper, the read/write head is the mathematician and the 'state' is the mathematician's mental state.

# The Turing Machine: a universal model of computation?

An obvious question to ask is wether this model of computation, the Turing Machine, is actually a universal model.
That is: can the Turing Machine be used to model all methods or, equivalently, is there a model that can solve more problems than the Turing Machine?
That question is very difficult to anwer.
But, as stated earlier, several other models were devised for the concept of a 'method' or algorithm.
These methods are, roughly, Lambda Calculus (by Alonzo Church), Partial Recursive Functions (Kurt Gödel and Stephen Kleene) and General Recursive Functions (Gödel, Kleene and Jacques Herbrand).
These four methods have all been proven to be equivalent.
Anything that can be proven with one of them can be proven with any of them.
This at least suggests strongly that they are indeed universal models of computation. Further, these methods are all self descriptive. One can make a Turing Machine on a Turing Machine. This property is called Turing Completeness. It is something any programming language should be tested for.

Since anyone with imagination can invent a model of computation and a concept of what computation is, this matter is by definition impossible to resolve.
One interesting note is the fact that, of these four methods, the Turing Machine is by far the most elegant model, because it is very comprehensible and little mathematical skill is required to understand it.
(Challenge to other noders: Try writing a node as comprehensible as this one about Lambda Calculus!)

# Type classification

There are different types of Turing Machines and different grounds on which to classify them. Here is a summary:

**Normal vs. Universal Turing Machines:**

As stated earlier, the transition rules are considered part of the machine itself.
That means that one Turing Machine is only capable of solving one problem (for instance, finding the prime numbers between 1 and 100).
Such a Turing Machine is called a Normal Turing Machine.
Early on, Turing wondered if it would be possible to construct a set of rules (that is, one Turing Machine) that could perform a function that was specified on the tape itself, instead of in the predefined functions.
In essence, one would encode a program on the tape, next to the input data, and this special Turing Machine would read the program and perform it.
This is indeed possible.
This type of Turing Machine is called a Universal Turing Machine, since it can perform all sorts of functions.
It is more like a stored program computer than the normal Turing Machine.
The existence of this type of Turing Machine is a good reason not to call the state transition functions a 'program'.
Note that this idea was very revolutionary, since ordinary computers did not exist in the 1930's.
Modern computers are, roughly, just sophisticated Universal Turing Machines.
This property makes Turing Machines so valuable for the general analysis of computing problems.

**Single vs. Multiband Turing Machines:**

Another distiction to make is one between Turing Machines with one tape and one head and a Turing Machine with more tapes and heads.
The single most interesting question here is: is a multiband Turing Machine able to compute more things than a single band machine?
The answer is, of course, no.
Otherwise, we would have proven that the Turing Machine is not a universal model of computation.
In fact, it has been proven that any multiband Turing Machine can be simulated on a single band Turing Machine with at most a quadratic increase in time complexity.
This is a great exercise for the reader!

**Deterministic vs. Non-deterministic Turing Machines**

A TM with exactly one transition rule for every given symbol/state combination is said to be a Deterministic Turing Machine.
The one that we will look at in the example below is such a machine.
When we use the TM as a theorerical tool, it is often practical to define TM's with more than one rule per symbol/state pair when modelling computations.
Imagine, for instance, a Turing Machine that explores all possible cycles in an undirected graph.
Such a machine might start in one vertex and decide to go either left or right.
To find all cycles, the TM needs to do both.
You can think of this as having two duplicate machines running off in each direction to find a cycle.
It has been proven that every Nondeterministic Turing Machine can also be modelled using a Deterministic Turing Machine.
NDTM's are only useful as a philosophical tool.

# Scientific applications

Since nobody actually uses Turing Machines to compute things, what are their practical purposes?
We already saw that Turing Machines are capable of performing the same functions as modern processors.
The advantage of the Turing Machine is that it is so simple to understand and that all steps take the same amount of time.
This property makes the Turing Machine an excellent tool for calculating the time complexity of algorithms.
(For a detailed description of time complexity, I gladly refer to my writeup there)

Aside from time complexity calculations, Turing Machines have one other important application.
Ever since Gödel shot down the idea that everything can be proved with a machine, people started to wonder what can and cannot be computed with a machine.
The Turing Machine is a theoretical aid in the determination of the type of problems that can be solved on computers, regardless of their architecture.
This is incredibly important to computer scientists, because it allows them to analyse beforehand if a given problem is at all solvable or not.
Needless to say, that is an interesting thing to check before one begins to write a program to solve such a problem.

# Example of a Turing Machine

To demonstrate and illustrate the concept of the Turing Machine, we will look at an example.
Hopefully, you will see what a simple and elegant mechanism it is.
We have already seen that a TM is defined by a set of transition functions from a state/symbol pair to a new state/symbol pair and a direction in which to move.
Using only these types of functions, we will define a Turing Machine that can increment any binary number by one and then halt.
The number that is to be incremented will be on the tape prior to the start of the machine and the head will be on the leftmost digit.
At the time the system halts, the head will be on the leftmost bit of the result, and the rest of the tape will be empty.

**Strategy outline:**

We will first direct the head to the rightmost bit, by scanning until we find a space and then going one square to the left.
Then, we will add one to the rightmost bit. If we have a carry bit, we will remember that and move on to the next digit to the left.
To that, we will add one in case we have a carry bit or we will leave it if there is no carry bit.
If there was already a one on the new position and we do have a carry, we will set it to zero and remember a new carry when processing the next digit. The details will be specified below.

**Notation:**
We will use the following notation for transition rules: (qn,s)->(qm,t,d). Here, qn is the current state, s is the current symbol, qm is the new state, t is the new symbol and d is a direction.
The direction is either L, or R (Left or Right) or - (no move). The symbols will be 0, 1 or _ for space.

The Turing Machine:

- (q0,0)->(q0,0,R) //q0 is the 'searching' state. If we read a 0 keep searching..
- (q0,1)->(q0,1,R) //If we read a 1, keep searching as well..
- (q0,_)->(q1,_,L) //If we find the first space, go left and enter the 'adding state' q1
- (q1,0)->(q2,1,L) //If we find a zero while adding, add one and go to the 'done' state q2
- (q1,1)->(q1,0,L) //If we find a 1, make it 0 and keep on adding to the left..
- (q1,_)->(halt,1,-) //If we find a space while adding, make it 1 and halt.
- (q2,0)->(q2,0,L) //If we are 'done' we keep looking for the leftmost bit.
- (q2,1)->(q2,1,L) //Same here..
- (q2,_)->(halt,_,R) //If we are 'done' and find a space on the left, we move right and halt.

..and that, as they say, is it!

# Additional information and suggested reading

** On Gödel's Incompleteness Theorem**

There are books abound describing the proof of Gödel's incompleteness theorem. The proof in itself is not very difficult, but formal logic requires very strict notation. This makes most literature a difficult read. A very informal and famous treatment is given in Gödel, Escher, Bach: An Eternal Golden Braid, by Douglas R. Hofstadter. The author introduces the concepts of Godel's Theorem by comparing them to the 'impossible' graphics of M.C. Escher and J.S. Bach's mathematically structured compositions.

**Turing Machine implementations**

Turing Machines have been implemented in many remarkable ways (usually making finite assumptions for the infinite tape). Java implementations are abound on the web. A particularly interesting implementation is the one within Conway's Game of Life. More about this can be found here.

**A note to the purists**

Gödel did not actually prove that *any* mathematical system is either incorrect or incomplete. Obviously, this node is not the place for a complete coverage, but to be somewhat more exact he proved that any Theory at least as complex as the theory of Peano arithmetics cannot be fully described by a finite number of axioms. Peano arithmetics is more commonly known as the theory of natural numbers. As you can imagine, almost all branches of mathematics are at least as complex as number theory. Understand that the above exactly means: given any finite number of axioms, there are truths that cannot be derived from this set of axioms, meaning that these truths are themselves axioms. An example of a theory that is both correct and complete is that of Propositional Logic. Fitch's system of Natural Deduction and the Tree Method can be used to prove all theorems within this system. In turn, these methods can be proven to be both correct and complete. The Prolog programming language is in effect a proof checker for theorems in Predicate Logic; an even more complex Theory.