One of the more surprising results in mathematics, and a good example of a very deep result which is nevertheless simple enough that a six-year-old can understand its statement.

First, a succinct explanation for the mathematicians in the audience:

limn->oo pi(n)ln(n)/n = 1

And for the novices: for a start, go here to find out what a prime number is. Then go here to learn about logarithms, in particular those to base e. Now, what the prime number theorem says is this: to find out roughly how many primes there are less than a certain number, divide the number by its natural logarithm. The bigger the number, the more accurate the answer you get will be; in fact (to be very imprecise and annoy the pedantic readers) as the number gets "infinitely big", the approximation becomes "exact".

Amazed? Well, if you're not then you should be. Why the hell should the prime numbers' distribution have anything to do with e, which is defined using concepts which seemingly have nothing to do with number theory? The answer is long, and if I ever understand it I'll attempt a writeup.

If we re-arrange the formula for the prime number theorem

limn->∞π(n)*loge(n) / n

We can obtain the more useful form,

π(n) ~ n / loge(n)

This tells us that the prime number density (the number of primes that exist) between 1 and n is asymptotic (i.e. approximately equal) to n / loge(n).

How good is this result? In reality, there are 25 primes less than 100, but π(100) = 21.715. This is a difference of about 15%. But, the larger the value of n, the better the approximation gets. For instance π(1,000,000)=72,382, where in reality there are 78,498 primes less than one million. Now the difference is only 8%.

Why does this work? Honestly, I don't know. The mathematics are beyond me, but I think it has something to do with the Sieve of Eratosthenes method of finding prime numbers. I can't prove it, though.

Reference: Goodaire, Parmenter Discrete Mathematics with Graph Theory 1998 Prentice Hall, Upper Saddle River, NJ.

We know that pi(k) is roughly equal to k/ln(k). Are there any related functions that lead to a new appreciation of the number of primes up to some given number? An approximation of ln(k) is the harmonic sum: (1/1)+(1/2)+(1/3)+(1/4)+ . . .

The harmonic sum is equivalent to zeta(n) for n=1, which is (again): (1/1)+(1/2)+(1/3)+(1/4)+ . . .+(1/k) as k approaches infinity. Note that this is not a convergent sum. But also note that the difference (1/1)+(1/2)+(1/3)+ . . . +(1/k)-ln(k) (the harmonic sum minus the natural log) approaches gamma as k approaches infinity, a constant often associated with Euler. In Riemann zeta function I mention another form of zeta(n), i.e.:

            ∞
          -----
           | |     1
zeta(n)=   | | --------
           | | 1-p^(-n)
            P
for p in P, the set of prime numbers. Thus, ln(k) is also related to the multiplication:

   1       1       1            1
-------*-------*-------*...*----------
1-(1/2) 1-(1/3) 1-(1/5)     1-(1/p(k))
where p(k) is the largest prime less than or equal to k. So another approximation for the number of primes up to some number k might be:

  2-1 3-1 5-1     p(k)-1
k*---*---*---*...*------
   2   3   5       p(k)
In fact, this is always an underestimate. It may still be an underestimate for all k if the above expression is multiplied by k. The multiplication seems to be intimately related to the Sieve of Eratosthenes, since the following process seems to apply:

Begin with k (or k2) consecutive integers, the smallest of which is 1.
Remove one from every two numbers (remove one from 1,2, another from 3,4, another from 5,6, etc.).
Remove one from every three numbers remaining.
Remove one from every five numbers remaining.
Etc. up to p(k).

The first step is like multiplying by 1/2. The second step is like multiplying by 2/3, the third like multiplying by 4/5, and so on. In the limit as k approaches infinity, k will have gone to infinity but the other terms will have multiplied together to get zero (since 1/zeta(1)=0). The number of primes *is* infinite, so this multiplication should be always increasing without bound.

One significant consequence of the prime number theorem, as it was proved by Charles de la Vallée Poussin and Jacques Hadamard in 1896, that π(n) ~ n/ln(n) is that the average density of prime numbers in the range of 1 to n must be 1/ln n, so asymptotically, the probability of some randomly chosen number in the neighborhood of n being prime is approximately 1/ln n. This might well be considered as an asymptotic probability density function for the prime numbers, so one can derive an approximate probability distribution function for the prime numbers (which is essentially another approximation to π(x)) by taking the integral of 1/ln n. That integral cannot be evaluated in terms of elementary functions, so it is a special function known as the logarithmic integral function Li(x). It can also be shown that as n-> ∞ Li(n), -> n/ln n, so another way of stating the prime number theorem would be to say π(n) ~ Li(n). While n/ln n is a lower bound, it may be shown that Li(n) is more or less an upper bound to the prime counting function π(n). Further, it may be shown that it is an asymptotically tighter bound on π(n) than n/ln n.

This statement of the prime number theorem is more common in analytical number theory, and the study of how much the approximation of Li(x) diverges from π(x) leads straight into the Riemann hypothesis and some of its consequences. Some initial results were achieved by John Littlewood in 1914, who showed that Li(x) is not a strict upper bound on π(x), but that at some point it would cross over and the difference Li(x) - π(x) must alternate back and forth between positive and negative. The point at which these Littlewood violations begin to happen is not known, but one of Littlewood's students, Samuel Skewes, showed that if the Riemann hypothesis is true. the first violations must come before ee^(e^79), a monstrous number which is about 1010^(10^34) that attained notoriety as Skewes number, the largest number to arise out of a mathematical proof at the time (right know I think Graham's number holds that dubious honor). In 1955 Skewes improved his results, this time without assuming the truth of the Riemann hypothesis, and found that Littlewood violations should begin occurring after 1010^10000. Further refinements were made in the following decades, of which the most recent is the work in 2000 of Carter Bays and Richard Hudson that showed that the most likely region for the first few violations must be in the vicinity of 1.39822 x 10316. Nobody has computed all the prime numbers up to that point just yet, and no one has actually proved the Riemann hypothesis yet either, so it's still a wide open question.

Log in or register to write something here or to contact authors.