Introduced in Alan Turing's 1936 paper On computable numbers, with an application to the Entscheidungsproblem, a universal Turing machine is a mathematical idealisation of a general purpose computer. Able to act, with appropriate input, as literally any other possible Turing Machine, Turing's invention, essentially the concept of a general purpose cpu executing a stored program, was probably the largest single step taken in the development of the computer, and is often regarded as the start of computer science proper.

A Turing machine (TM) consists of a tape, a head which can mark and erase the tape, and a set of states. Depending on whether the tape is currently marked, and which state is occupied, the TM will erase or mark the tape or not, and move it one square left or right, at which point the next state kicks in. Additionally, there is a state which causes the TM to halt, if it is reached.

The tape is considered to be of arbitrary length and composed of discrete units which are accessible to the head in strict order, singly and wholly - that is the tape is an idealised one-bit erasable paper tape which never stretches, breaks, folds, runs out, or breaks other rules which are harder to think of.

The critical thing is that though the tape may be arbitrarily large, each step of the operation of a TM is completely determined by a finite number of simple and unambiguous rules. It is completely mechanical in its operation, and always behaves in the same way for any particular state and input.

These rules defining a TM (the set of states) can be written out in a standard form as marks on a tape. The interpretation of such an on-tape representation of a TM is then a mechanical procedure which can be realised by some TM with a suitable set of states.

A universal Turing machine (UTM) is a particular TM so constructed that its tape can encode any TM whatsoever, with the guarantee that the UTM will then do just what the encoded TM would do.

Suppose we have a machine M, then its output with initial tape t can be written M(t). Then a UTM U is a TM such that:

for all outputs Mi(tj) there's some ei,j such that U(ei,j) = Mi(tj)

We'd call ei,j the encoding of Mi(tj).

It's also required that the UTM can recognise input that is not a valid encoding of a TM and produce a predetermined response when this occurs.

Turing proved the existence of such UTM's by specifying one in his paper - it turned out not to be very complex - and showing it had the characteristic required, of replicating the behaviour of an arbitrary TM which is encoded on its tape. This is the essence of the modern computer, that given sufficient storage it can carry out an arbitrary program, encoded into some specific "language". The choice of a particular UTM defines a particular language.

Turing's insight was that an algorithm, when encoded, is just so much data that can then be operated on by another algorithm. The idea of encoding a TM as input for execution by a UTM is pretty much all you need for the general idea of a computer program.

The fact that a UTM can emulate any TM at all makes it easy to establish fundamental equivalences between various computational methods. If a particular method can produce a UTM, then it's obvious it can compute anything computable by an arbitrary TM. Such a formalism or language is said to be Turing complete. Specifications for UTM's have been written in formalisms as diverse as XSLT, sendmail.cf and cellular automata such as Conway's game of life.

This property of universality shifts the competition from what can be computed to the number of steps and amount of input required. No matter how featureful, elegant and concise the programming language you construct, whatever computations it can perform can be done in sendmail.cf or brainfuck.

Universality has been of interest to some heterodox physicists, such as Ed Fredkin and Steven Wolfram. Fredkin, on a suggestion of Feynman's, has been investigating the possibility of using cellular automata as a physics model and suggests suitable automata must be both universal (i.e. Turing complete) and reversible. Wolfram (also big on CA) sees in the UTM an upper bound to the complexity of the makeup of the universe. David Deutsch has proposed that "every finitely realizable physical system can be perfectly simulated by a universal model computing machine operating by finite means", and has attempted to extend the idea of a UTM to quantum computing.

Mathematician Gregory Chaitin has used the UTM as a building block in his algorithmic information theory, refining the notion by specifying that the encoding for the TM's must instruct the UTM how long they are (Chaitin says they are 'self-delimiting') and using them to define the algorithmic complexity of a string relative to a given UTM - the length of the shortest input that will cause the UTM to output that string - and to formulate his bizarre constant Omega - the probability, for some self-delimiting UTM, that it will halt with random input. Chaitin imagines flipping a coin to determine the state of each successive bit of the unread tape, as the UTM reads in its program. It's required to be self-delimiting so that the UTM knows when to stop reading and Chaitin knows when to stop flipping coins.


http://plato.stanford.edu/entries/turing/

XSLT UTM:
www.unidex.com/turing/utm.htm

sendmail.cf UTM:
http://okmij.org/ftp/Computation/sendmail-as-turing-machine.txt

Gregory Chaitin, Foundations of Mathematics at:
http://arXiv.org/html/math/0203002

For Fredkin, see:
http://digitalphilosophy.org