This is the law:
1---------------2---------3-------4-----5----6---7--8--9
Or, the probability that the first digit of a number in statistical data is "d" (for d from 1 to 9) is
                  log_10( 1+ (1/d) )
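As a quick check of the formula, here is a minimal Python sketch that tabulates it for each digit. The function name benford_probability is just an illustrative choice, not something from the original sources.

```python
import math

def benford_probability(d):
    """Probability that the leading digit equals d (1..9) under Benford's law."""
    if d not in range(1, 10):
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)

# Print one line per digit, formatted as a percentage.
for d in range(1, 10):
    print(d, f"{benford_probability(d):.1%}")
```

The nine probabilities add up to 1, because the sum of log_10((d+1)/d) for d = 1..9 telescopes to log_10(10).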
The law was first discovered by a mathematician named Simon Newcomb in 1881, when he noticed that in books of logarithm tables, the pages with lower numbers were more worn than the pages with higher numbers. But it was not until 1938 that Frank Benford published the results of an analysis of more than 20000 numbers from various sources: price lists, electrical bills and street addresses.

In 1996 professor Ted Hill suggested an explanation of this phenomenon. The law applies to large collections of numbers that are derived from somewhere else, not to purely random numbers. If we take several sources of random numbers, each following a normal distribution or some other distribution, and combine their numbers, the result is a distribution of distributions. This distribution follows Benford's law, and most of its numbers start with digits in the lower part of the range.
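The "distribution of distributions" idea can be tried out with a toy simulation. The sketch below is only an illustration under an assumed setup, not Hill's actual construction: each number comes from a uniform distribution whose upper bound is itself picked at random across five orders of magnitude. The names leading_digit, mixed_sample and digit_frequencies are hypothetical.

```python
import random
from collections import Counter

def leading_digit(x):
    """First significant digit of a positive number."""
    # Scientific notation always starts with the leading digit, e.g. '3.14...e+02'.
    return int(f"{x:.17e}"[0])

def mixed_sample(rng):
    # Assumed mixture: first pick a distribution (uniform on (0, scale], with the
    # scale drawn at random over five orders of magnitude), then draw from it.
    scale = 10 ** rng.uniform(0, 5)
    return scale * (1.0 - rng.random())  # 1 - random() lies in (0, 1]

def digit_frequencies(n=100_000, seed=1):
    """Observed share of each leading digit among n mixed samples."""
    rng = random.Random(seed)
    counts = Counter(leading_digit(mixed_sample(rng)) for _ in range(n))
    return {d: counts[d] / n for d in range(1, 10)}

for d, share in digit_frequencies().items():
    print(d, f"{share:.1%}")
```

Under these assumptions the shares should land close to the values given by the formula, with 1 as the most common first digit and 9 as the rarest.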

Benford's law is commonly used by accounting firms and others who work with large amounts of numbers. If the numbers haven't been tampered with, the first digit should be 1 in 30.1% of the numbers and 9 in only 4.6% of them. If the distribution is more even, then something is probably wrong... Of course, some numbers are by nature more common than others, such as amounts of $24, which happens to be the largest amount you can put on an expense report in America without a receipt.
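A check of this kind can be sketched in a few lines of Python. This is a simplified illustration, not the procedure any particular accounting firm uses: it compares observed first-digit counts with the Benford expectation using a chi-square statistic, and the 5% cutoff of 15.507 (8 degrees of freedom) is one conventional choice. The names first_digit, chi_square_vs_benford and CRITICAL_5_PERCENT are hypothetical.

```python
import math
from collections import Counter

# Expected share of each first digit under Benford's law.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# Chi-square critical value for 8 degrees of freedom at the 5% level.
CRITICAL_5_PERCENT = 15.507

def first_digit(amount):
    """First non-zero digit of a non-zero number, e.g. 0.042 -> 4."""
    for ch in str(abs(amount)):
        if ch in "123456789":
            return int(ch)
    raise ValueError("amount has no non-zero digit")

def chi_square_vs_benford(amounts):
    """Chi-square distance between observed first digits and Benford's law."""
    counts = Counter(first_digit(a) for a in amounts)
    n = sum(counts.values())
    return sum((counts[d] - n * p) ** 2 / (n * p) for d, p in BENFORD.items())

# Powers of 2 are a classic Benford-like sequence; 100..999 has evenly spread digits.
print(chi_square_vs_benford([2 ** k for k in range(1000)]) < CRITICAL_5_PERCENT)
print(chi_square_vs_benford(range(100, 1000)) > CRITICAL_5_PERCENT)
```

A statistic above the cutoff only says the first digits look unlike Benford's law. As the $24 example shows, there can be innocent reasons for that.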

Another slightly counterintuitive fact from probability and statistics is that improbable-looking results occur more often than you'd think. Theodore Hill used to give his students the following homework: flip a coin 200 times and write down the results. Many of the students got tired after 20 flips and just wrote down made-up results that they thought would seem probable. The thing is, in a series of 200 coin flips, it is highly probable that there will be a run of 6 heads (or tails) in a row. The students who cheated rarely had more than 3 of the same in a row.
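The claim about runs can be checked exactly rather than by flipping coins. The sketch below assumes a fair coin and independent flips, and tracks the probability of each current run length while no long run has yet appeared. The function name prob_run_at_least is an illustrative choice.

```python
def prob_run_at_least(n_flips, run_len):
    """Exact probability that n_flips fair coin flips contain a run of
    run_len or more identical outcomes (heads or tails)."""
    if n_flips < 1 or run_len < 2:
        raise ValueError("need n_flips >= 1 and run_len >= 2")
    # state[k] = probability that the current run has length k (1..run_len-1)
    # and no run of run_len has occurred so far.
    state = [0.0] * run_len
    state[1] = 1.0  # after the first flip the run length is 1
    for _ in range(n_flips - 1):
        new = [0.0] * run_len
        for k in range(1, run_len):
            p = state[k]
            new[1] += p / 2          # next flip differs: run restarts at 1
            if k + 1 < run_len:
                new[k + 1] += p / 2  # next flip matches: run grows
            # otherwise the run reaches run_len and leaves the tracked states
        state = new
    return 1.0 - sum(state)

print(prob_run_at_least(200, 6))
```

For 200 flips and runs of 6 the result comes out above 95%, so a sheet of 200 results with no run longer than 3 is a strong hint that it was made up.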


Graphic by Kevin Brown, http://www.seanet.com/~ksbrown/index.htm