13 12 2 4 8 4 15 14 14 8 0 15 5
4 1 16 1 18 5 12 5 6 19 13 5 19 18 4 5 3 0 19 16 3 3 4 7 19 18 1 7 18 16
13 2 18 9 4 17 14 17 3 1 16 16 6 15 14 11 1 18 11 1 14 14 5 19 1 4 17 3
12 16 19 5 19 18 15 4 15 10 1 19 11 18 15 18 14 10 9 15 9 0 16 4 14 1 3
16 6 1 19 19 17 18 5 16 17 0 0 12 11 2 12 2 0 7 0 14 18 9 10 8 9 7 12 4
8 16 0 15 18 19 15 15 18 0 12 15 1 13 8 12 15 0 15 15 8 15 10 6 4 0 15
14 7 8 18 16 4 19 12 2 18 7 18 16 8 10 11 10 3 19 3 19 0 18 15 8 13 5 15
18 6 10 13 13 18 12 10 3 11 2 5 9 10 4 6 19 14 18 10 18 18 13 17 18 12
12 7 6 18 2 5 4 12 18 18 10 11 8 13 2 11 19 12 1 3 19 1 18 17 11 17 15
4 15 14 17 8 1 3 6 3 8 10 16 6 8 7 18 17 1 0 8 0 13 9 4 12 10 3 9 2 1 5
7 16 0 4 4 1 7 10 5 16 1 1 3 10 8 1 7 9 1 15 10 14 5 15 7 16 19 16 18 0
2 5 17 2 10 1 4 17 12 9 13 14 11 16 4 0 18 11 9 19 7 0 14 12 16 1 8 15
18 7 15 1 12 12 3 2 14 8 0 7 17 14 1 9 11 6 9 9 18 18 9 5 19 3 18 15 5
6 11 4 13 7 5 6 0 9 9 14 18 9 2 15 3 3 4 14 10 13 4 8 12 13 13 12 17 11
8 3 18 19 7 12 7 13 18 7 3 7 2 1 17 4 17 1 8 2 16 18 16 0 6 9 13 0 2 11
12 10 14 11 10 2 3 17 15 2 5 19 10 7 0 7 12 18 9 0 0 5 19 17 5 5 6 19 6
8 10 18 19 5 9 9 7 13 7 3 16 12 2 6 19 3 14 12 2 3 12 3 8 11 0 14 17 7
13 4 16 4 2 15 10 12 5 17 6 12 0 2 4 3 8 4 7 3 16 9 6 9 12 15 1 12 9 19
0 3 3 16 8 6 11 18 19 16 16 5 9 17 7 13 0 16 18 8 19 14 17 6 3 9 2 5 2
12 4 2 15 7 18 4 13 10 3 13 7 19 18 16 17 6 9 18 2 8 6 2 2 3 8 6 13 10
11 15 2 16 18 18 3 17 3 17 7 6 10 14 6 9 10 3 15 0 1 17 8 8 0 11 11 8
18 5 19 10 1 2 6 19 1 10 17 4 8 4 11 19 18 18 8 9 1 4 10 3 1 18 11 1 10
3 10 8 9 10 18 10 12 4 10 14 15 7 18 3 11 10 3 10 8 11 0 10 15 10 13 17
9 5 19 19 9 10 7 18 0 5 8 12 10 19 6 6 6 5 9 18 15 13 9 4 4 10 14 0 0 7
18 9 13 17 9 2 7 16 0 7 2 9 0 13 8 7 19 15 13 9 14 9 2 3 13 7 14 8 8 14
15 6 4 9 3 13 11 11 10 12 19 13 1 0 7 9 8 6 5 1 16 19 10 19 3 4 7 17 12
15 12 8 1 16 17 5 10 9 17 1 1 16 14 2 17 1 12 5 8 17 6 5 16 17 4 0 2 11
17 14 6 9 3 8 6 1 14 16 10 11 18 12 7 12 15 4 14 7 9 3 5 16 8 2 14 13 2
16 4 19 11 11 9 14 0 15 15 14 11 6 5 10 19 13 2 14 18 17 1 8 0 6 5 9 8
19 2 11 15 7 10 6 19 19 1 19 14 17 14 6 3 0 16 3 13 19 17 12 17 18 0 18
5 5 7 14 4 10 5 0 17 16 6 17 16 8 17 11 5 11 18 9 11 15 12 5 14 9 17 12
7 18 10 13 4 18 7 9 8 13 9 5 10 16 3 6 4 0 17 10 11 15 19 3 11 11 9 5 0
7 18 8 6 8 2 10 6 9 19 14 3 9 0 13 6 3 19 11 4 17 1 15 13 0 19 4 12 8
10 12 16 8 1 2 17 3 12 3 13 12 18 16 2 19 10 8 3 10 0 7 7 1 3 1 2 2 5
14 11 16 7 8 5 8 10 2 12 3 6 5 15 4 2 18 4 12 7 7 3 7 14 10 9 18 12 11
0 18 5 12 14 13 0 19 2 11 2 14
"Anyone who considers arithmetical methods of
producing random digits is, of course, in a state of sin"
--John Von Neumann
These 1000 numbers ranging from 0 to 19, were produced with this simple
Perl script:
#!/usr/bin/perl
for($i = 0; $i < 1000; $i++){
print int(rand(20))," ";
}
At a first glance, these numbers seem random. But it is well-known
that no number sequence produced by a computer can be truly random.
It is very easy to observe that a specific number sequence is not random
(say, the sequence 2, 3, 5, 7, 11, 13...) but the opposite, that is, to
certify that a given sequence is really random, is a hard job. What we
are trying to do when we produce "random" sequences with our computer,
is to try to produce a sequence that has many of the properties of
really random sequences. That's why these sequences are usually referred
to as "pseudorandom sequences".
The three most important properties that a good pseudorandom
sequence must hold are:
- It looks random. This means that it passes all the statistical
tests of randomness that we can find.
- It is unpredictable. It must be computationally infeasible to
predict what the next random bit will be, given complete knowledge of
the algorithm or hardware generating the sequence and all of the
previous bits in the stream.
- It cannot be reliably reproduced. If you run the sequence
generator twice with the exact same input (at least as humanly
possible), you will get two completely unrelated random sequences.
(taken from Bruce Schneier's Applied
Cryptography)
Very often, it is next to impossible to say which exactly properties
of random numbers are vital for a specific application. But on the
other hand, it is always a good idea to run some kind of checks to a
pseudorandom number generator, so that we can be sure that, at least,
no degenerate sequences are produced. There can be generators that are
really good, but a flaw in the design can often result to a very bad
pseudorandom number generator.
There is a variety of checks that exist to test the properties of
pseudorandom number sequences, and most of them are based on
mathematics and statistics. In this writeup we examine the X2
statistical test (chi-square test, also mentioned in Bruce
Schneier's Applied Cryptography, second edition, page 45) on random
sequences. The basic idea of the X2-test is to check whether
our numbers are "logically" distributed. That is, we make a comparison with the sequence we have against a random sequence. So, the chi-square test is a goodness-of-fit test (thanks to vuo for reminding me this). If we produce N positive
integers, smaller than r, then we would expect to have N/r
numbers for each value between 1 and r. But the frequencies
of appearance for each number should not be exactly equal: That would
not be random!
It is proved that it can be checked whether a number sequence is
equally good distributed as a real random number sequence or not, using
the following formula:
Σ ( fi - N/r )2
0 ≤ i ≤ r
X2 = --------------------------
N/r
This is the X2-test, and if the value of X2 is
around r, then the numbers are random-like distributed, else they are
not. "Around" is defined as X2 being in the range r ± 2√r.
This is enough for a simple test of randomness, and it must be noted
that first, the test is OK for N > 10•r, and just to be sure about
the randomness of our pseudorandom number generator, the check must be
run a couple of times with different pseudo random sequences.
So, for the 1000 numbers at the beginning of the writeup, the X2-test yields X2 = 15.04 which is within the range 20 ± 8.94.
That means that the random number generator embedded into Perl is
worth its money, thus defeating what they say, that free stuff is worth
what you pay for ;-)
Of course, for applications that base their security on how really random a number sequence is, like cryptography, something more than the
Perl or C embedded random function is needed. Or else 'weird
correlations' will start to appear, and some cryptanalyst somewhere
will be veeery happy about it.
Bibliography
Algorithms: Design methods and complexity calculation, Foto
Afrati, Giorgos Papageorgiou
Applied Cryptography, Bruce Schneier