Once upon a time, a man named Douglas Hofstadter wrote a book called Gödel, Escher, Bach: An Eternal Golden Braid. Among other things, it discusses Gödel's incompleteness theorem, i.e. that no system powerful enough to encompass truths about number theory is both consistent (every provable statement is either true or not true) and complete (every true statement is provable). The method of proof of this fact rubs elbows with the same fixed-point principle responsible for quines and Cantor's diagonal argument. Building up the background knowledge to prove this requires a significant portion of the book, with the proof finally coming together in the fourteenth chapter out of twenty.

One of the bits of knowledge that is covered is a description of a typographical system of number theory, called, unimaginatively, Typographical Number Theory, or TNT. After some exposition on how it works and providing a complete grammatical description, Hofstadter gives some translation exercises, without answers, for the reader, should they be so inclined to try and turn these number-theoretic (N) statements into their typographical (TNT) equivalents.

The exact details of the implementation of TNT are not important, so long as it is known that TNT can be and is rigorously defined while retaining its expressivity and, by extension, the expressivity of number theory. The most important fact about TNT is that the only mathematical operators besides equality that it possesses are addition and multiplication. Every other number-theoretic construction can and must be defined by creating ever more clever and convoluted mashups of variables and quantifications, and they can occasionally require a great deal of ingenuity.

Hofstadter's assignment, found in chapter 8 (page 215 of the 20th anniversary edition), is as follows:


A Few More Translation Exercises

And now, a few practice exercises for you, to test your understanding of the notation of TNT. Try to translate the first four of the following N-sentences into TNT-sentences, and the last one into an open well-formed formula.

All natural numbers are equal to 4.
There is no natural number which equals its own square.
Different natural numbers have different successors.
If 1 equals 0, then every number is odd.
b is a power of 2.

The last one you may find a little tricky. But is it nothing, compared to this one:

b is a power of 10.

Strangely, this one takes great cleverness to render in our notation. I would caution you to try it only if you are willing to spend hours and hours on it—and if you know quite a bit of number theory!


Portions of TNT relevant to the exercises will be explained as necessary, but as a general principle, the specific TNT-sentence which encodes the given N-sentence is not as important as a sufficiently simple description of the N-sentence such that it could then be implemented into TNT with relative ease. As it stands, currently, TNT lacks discrimination between remainders of divisons, which would be useful for, say, specifying whether a number is odd; nor does it have exponentiation to express powers, which would greatly simplify the final two exercises. As we will see in a moment, there is enough power in TNT to emulate these mechanisms, but not without great effort.


→ All natural numbers are equal to 4.

This statement is straightforward to write in TNT, as it is a simple concept. We can trivially rewrite this sentence as "For all natural numbers a, a equals 4," which corresponds directly to the TNT-string ∀a:a=SSSS0.

The things to note are that numerals are expressed as successors of zero, e.g. SS0 is the successor of the successor of 0, otherwise known as 2; and that the ∀ clause is a quantifier that can operates on one variable, which must be free in the expression it quantifies, and means that all numbers could be substituted into the formula to produce a true statement. It also has a counterpart, ∃ ('there exists'), which means that at least one number satisfies the equation. Also of note is that though this statement is clearly not true, as only one natural number really equals 4, that does not mean that we have written the formula incorrectly: it just means that when proving things in TNT, you will come across the negations of this sentence, instead of it. Not all well-formed statements are true!


→ There is no natural number which equals its own square.

Again, this is a fairly simple task, so long as you know that n² = n · n. The N-sentence can become "There does not exist an a, such that a equals the product of a and a," which is ~∃a:a=(a·a).

The notes here are that the tilde sign means negation, and it does make a difference where it is placed w.r.t. any quantifiers; and that for the purposes of disambiguation, any sum or product must be surrounded by parentheses, and though we know that ((a+b)+c) and (a+(b+c)) are equivalent due to the associativity of addition, you are required to select one when operating in TNT.


→ Different natural numbers have different successors.

This marks our first translation challenge. In order to discuss two different natural numbers, we know we'll need a variable and quantifier for each. Call them a and b: the relevant quantifier is ∀. If a and b are two different numbers, then we know ~a=b. Their successors, Sa and Sb are privy to the same (in)equivalence. The statement is then, "For all a and b, if ~a=b, then ~Sa=Sb." Hence, our TNT-string is ∀a:∀b:<~a=b⊃~Sa=Sb>.

Just as addition and multiplication have parentheses, so to do logical conjunction (AND, ∧), disjunction (OR, ∨) and implication (IMPLIES, ⊃) have angle brackets. These logical operators, along with the previously mentioned tilde, are carry-overs from a system of logical deduction introduced earlier in GEB, which exactly mirrors that of standard symbolic logic, though in a typographically rigorous fashion. Also, here the one notational inconvenience of GEB's TNT is evident: = binds tighter than ~, and is unbracketed, which may inconvenience those used to standard symbolic logic.


→ If 1 equals 0, then every number is odd.

The first clause of the implication in this statement is straightforward: S0=0. But how do we describe every number being odd? First off, what does it mean for a variable, say a, to be even or odd? If a is even, it means that there is some number which is half of a: "a is even" can be expressed in TNT as ∃b:a=(b+b). The two ways of stating that a is odd would be, a) to negate "a is even", or b) to say a is one more than a multiple of two. Both are equally valid: we will use the latter. In TNT: <S0=0⊃∀a:∃b:a=S(b+b)>.

Here begins the need to create intermediate statements to use to use in our transcription. It is fine to use pseudo-TNT in intermediate steps if it so aids your thought process, but remember, you are not allowed to make any definitions in strict TNT, if you are to adhere to its original purpose, which was to axiomatize standard number theory into a computably enumerable system. Also, a fact about the system: the quantifiers can be placed outside of the consequent of the implication, around the whole statement, if one so chooses; this doesn't change much but is arguably a little bit tactless, uncouth even. Care must almost always be taken to know exactly what is being quantified, and in what order: putting the ∃b before the ∀a would mean something different. You may only interchange two adjacent ∃s or ∀s, you cannot intermix the two.


b is a power of 2.

A first attempt at doing this might be to try and specify b as double something. However, this is a recursive approach, and would require one to already have a predicate for talking about powers of two, which leads to an unresolvable infinite regress. What is instead needed is a more direct approach. Let's examine a power of two, 2n. Decomposing 2n by addition leads nowhere fast, but if you examine its factors, you can see that that they are all themselves powers of 2. Notably, a simple property that they all have in common is that they are all even. We can use this as the hook for expressing that b is a power of 2, if we can prove that b having only even factors is logically equivalent to b being a power of 2. Looking at the prime factorization of 2n, it is clear that every prime factor is even, and any odd factor would lead to an odd prime factor, which isn't 2.

The discussion in the last paragraph has been excluding the number 1, with the tacit assumption that all numbers involved are equal to 2 or greater. However, 1 is both a power of 2, and a factor of every number, so care must be taken to handle this case. Either b equals 1, or every one of its factors greater than one is even. Equivalently, either b is 1, or for all SSa, the existence of a c such that (SSa·c)=b implies that SSa is even. If SSa is even, then by extension a is also even, so we can simplify the formula a bit. (Since the smallest natural number is 0, we can use SSa to denote some number that is 2 or larger, but must take care to remember that SSa is the number originally called for, not just a.)

To sum it all up, in TNT: <b=S0∨∀a:<∃c:(SSa·c)=b⊃∃c:a=(c+c)>>. Note how c was used twice, under two different quantifications: this is acceptable, so long as it is what was meant.


b is a power of 10.

This is the fun part.

Having learned what we did in expressing powers of 2, we may try an equivalent approach. Either b is 1, or all its factors are multiples of 10. There are, however, a large number of problems with this approach. First off, powers of 10 have both two and five as prime factors, so by definition, no number can have all its factors be multiples of ten. If you try to patch this up by saying that all factors 10 or greater are multiples of 10, you still run into problems: 25 = 5 × 5, for instance, is a factor of 100 = 10², but is not itself a multiple of ten, so you have only described 1 and 10. Further, 4 = 2² is less than 5, meaning you allow 20 unless you exclude only 2 and 5 from the list, and this does nothing to patch up the lack of all the other powers, anyway. The key is something more twisted.

What is required is a way to encode an arbitrary power of 10 in a non-recursive manner. An equivalent way to do this is by specifying a finite sequence that can be generated, like { 1, 10, 100, ..., 10n }. To generate this sequence (call it {yn}) in a constant amount of information, we need some initial values (1, in this case), a recursively applicable rule (ym+1 = 10ym for suitable m), and a place to stop, kindly provided by n itself. This is a constant amount of information, now, that is able to encode arbitrarily long sequences (meaning we can reach all finite powers of 10); however, it is still recursive, so we need a way to map these rules onto some TNT-expressible structure.

Our saviour comes in the form of the Chinese Remainder Theorem. By encoding the terms of the sequence as the results of some number, say k, when taken modulo a set of pairwise relatively prime numbers that we can easily index, gives us a relatively easy way to retrieve the members of our sequence. The easiest of these encodings to implement is to use the fact that there exist arbitrarily long arithmetic progressions of primes: the two numbers i and j, can encode a sequence of n+1 primes { i, i+j, i+2j, ..., i+nj }. This result is only as recent as 2004, however, and something more akin to the N-long sequence { 1+N!, 1+2(N!), 1+3(N!), ..., 1+N(N!) } given an N might have been the solution in mind.* We will use the arithmetic progression of primes, for ease of use, though N! and other nice sequences with pairwise coprimality can also be described in TNT.

A recapitulation: the sequence {10n} is encoded by a triple (i, j, k), in which each term is interpreted as k modulo i+nj—the sequence has the same length as the arithmetic progression of primes. k mod i = 1, and 10(k mod i+mj) = k mod i+(m+1)j for 0 ≤ m < n where n is both the desired exponent of ten and the length of the sequence. If a triple exists (it does not need to be found!—k, for one, will be incredibly huge, for even relatively meager values of n) such that b is the last term in the sequence, then b is a power of 10.

Well, almost. It would be convenient to say that b is a power of 10 if the above triple exists. However, it should be noted that it only means that an n exists such that b = 10n. This is relevant because the inductive rule used above only holds for 0 ≤ m < n—in particular, n is always at least 1, while, as with our 'powers of 2' case above, 1 is still a power of 10. So for b to be a power of 10, either there exists the above triple, or b=S0.

The main intermediate predicate we will be using is that which implements modular arithmetic. We can define some temporary notation MOD{k,n,a}, which is to stand for a being congruent to k mod n. This predicate, in TNT, is <∃a′:n=(a+Sa′)∧∃n′:k=((n·n′)+a)>, which also must ensure that a is the only number less than n which is congruent to k mod n. The reason for this is that a is then unique given k and n—if all k mod n was allowed, the sequence would encode not only {10n} but also other numbers.

So what does our string look like? First, we will define PRIME{n}, to aid in the high-level description of the TNT-string, and then make substitutions afterwards. PRIME{n} is ~∃d:∃d′:n=(SSd·SSd′); in other words, for a number n to be prime, there cannot exist any pair of numbers, each greater than two, which multiply to make n.

The formula ultimately looks like this: ∃n:∃i:∃j:∃k:<∀n′:<∃n″:n=(n′+n″)⊃PRIME{(i+(n′·j))}>∧<INDUCTIVE∧<BASEFINAL>>>. The three predicates dealing with the sequencing have also been abstracted out, and are listed below.

  • The INDUCTIVE predicate encodes the inductive step on the sequence y. (Note, n′ never equals n, i.e. the prime i+nj does not require induction.)
    ∀n′:<∃n″:n=(n′+Sn″)⊃∃y:<MOD{k,(i+(n′·j)),(10·y)}MOD{k,(i+(Sn′·j)),y}>>
  • The BASE case is relatively simple, and ends up being a degenerate case of MOD where there doesn't have to be a check.
    MOD{k,i,1}
  • The FINAL case is the bit that connects the last item of the sequence to b.
    MOD{k,(i+(n·j)),b}

After a tedious bout of making substitutions, we have the final product.

<b=S0∨∃n:∃i:∃j:∃k:<∀n′:<∃n″:n=(n′+n″)⊃~∃d:∃d′:(i+(n′·j))=(SSd·SSd′)>∧<∀n′:<∃n″:n=(n′+Sn″)⊃∃y:<<∃y′:(i+(n′·j))=((SSSSSSSSSS0·y)+Sy′)∧∃m:k=(((i+(n′·j))·m)+(SSSSSSSSSS0·y))>∧<∃y′:(i+(Sn′·j))=(y+Sy′)∧∃m:k=(((i+(Sn′·j))·m)+y)>>>∧<∃m:k=S(i·m)∧<∃b′:(i+(n·j))=(b+Sb′)∧∃m:k=(((i+(n·j))·m)+b)>>>>>


→ This ends the exercise.

If you can find a typo or logical error anywhere, please let me know.

Say it ain't so! There was a logical error in this writeup! I had previously suggested that the existence of a triple (i, j, k) encoding the sequence { 10n } up to n for arbitrary n as described above adequately covers all powers of 10. This is incorrect because it only holds for n > 0, which is to say it excludes the otherwise valid n = 0 case, where 100 = 1. This error is corrected as of 10 July 2013.


* The astute reader will notice that 10n > 1 + n·n!—it is worth pondering how this might be patched up. Another curiosity is how to express n! at all.