It was early in the 1980s when Professor Bogus Madeupsky, of Fictional
College in Fabricationsville, moonwalked into a disco bar with two
fellow
mathematicians and spotted the woman who he knew would change
his life. If her lime-green mascara and the seductive way her hips
moved to the house band's Blondie covers weren't enough to capture his
heart, the
Multivariable Calculus text protruding from her
maroon snakeskin handbag certainly was. Unfortunately for
Professor Madeupsky, his colleagues (whose names have been forgotten
along with the house band) also recognised a vixen when they saw one,
and a disagreement ensued as to who would be mentoring this nubile Newton
that night. A leading
probability theorist, Madeupsky soon came up
with a coin-tossing procedure which fairly selected the lucky
suitor, with equal chances favouring all three. A
three-sided coin? No! The professor's ingenious procedure,
generalised to select between any number of
outcomes
with any desired
probability distribution, is the subject of this
node. (For an account of how our hero nabbed the girl with the help of
a "trick" coin, see the professor's autobiography
My Pseudorandom
Life.)
Let's say you're given a fair coin(1) and told to choose between N
items (or people, or whatever), and the probability of
your choosing the nth item is to be p_n, where all the
p_n's obviously add up to one. (2) The first
step is to interpret the decision in a purely numeric context. This is
done by slicing up the unit interval (the bit of the number line between
0 and 1) into N pieces with respective lengths p_1,
p_2, etc., and associating one of the possible alternatives
to each slice:
item one item two item N
0 +-------------+----------+------ ......... ------+---------+ 1
<---- p_1 ---><-- p_2 --> <-- p_N -->
Then the decision can be made by choosing a
random
number between 0 and 1, and noting which slice it falls within: if it
is less than
p_1, then choose the first item; if it is
greater than or equal to
p_1 but less than
p_1+p_2
then choose the second item, and so on.
Now, how to choose a number between 0 and 1 using only a coin? Easy:
toss it repeatedly, and interpret the string of heads and tails as
a string of binary digits preceded by a "decimal" point. For example,
if we call heads "zero" and tails "one", then the sequence of tosses
head-head-tail-head-tail becomes 00101 which, preceded by a decimal point,
is 0.00101, the binary representation of the fraction 5/32. In this
instance, it's
only necessary to toss the coin until we're certain which slice of the
unit interval the result lies in: for example, if one of the slices
lies between 1/2 and 5/8, whose binary representations are 0.1 and
0.101, then throwing tail-head-head, or 1-0-0, means that we can stop
since the answer is then definitely bigger than 1/2 and less than
5/8. (3)
So we now have a technique for generating any desired probability
distribution using a single coin. In the simple cases, such as
where the probabilities are fractions with powers of 2 as denominator,
this procedure reduces to the "obvious" method. For example, one can
choose equiprobably between four things by tossing a coin twice and
assigning the alternatives to tail-tail, tail-head, head-tail, and
head-head. To finish, let us return to the problem of choosing fairly
between three people, since in this situation a convenient
simplification of the above technique is possible. The modified method
is as follows: toss the coin until at least one of each heads and
tails has appeared, and then stop. The first person "wins" if this
took an odd number of tosses. If instead an even number was
necessary, the second person wins if a head was thrown last, while
the third person wins if a tail was thrown last. This method gives all
three people an equal chance of winning. (4)
Footnotes
(1) Fair coins aren't really necessary. ariels informs me that a modified technique due to Professor Ray A. Lyzashun can be employed to generate a desired probability distribution even with an unfair coin, as long as the probability of the coin landing on heads is known. And in any case, Gritchka's writeup here explains how to use an unfair coin to simulate a fair one.
(2) More formally, we are simulating a discrete random variable
which takes on n possible values; those of you who know what a
continuous random variable is should be able to see how to generalise
to that case, although you'll usually end up tossing the coin an infinite
number of times.
(3) Obviously you could get really unlucky and randomly produce an infinite
binary string which lies right at one of the slice boundaries; but
what are the chances of that?
(4) The modified method is equivalent to
the above technique but with heads alternately representing zero and
one on consecutive tosses. Proving this equivalence is an exercise for the
reader--note that 1/3 is represented as 0.01010101... in binary.
Source: This is undoubtedly a very old idea;
the first place I saw it mentioned, along
with the trick for easily producing a one-chance-in-three generator,
was in a Usenet post (Message-ID:
vbazpzu4c8b.fsf@mozart.stat.wisc.edu)
by Kevin Buhr (buhr@stat.wisc.edu);
you can read his version at
http://groups.google.com/groups?hl=en&selm=vbazpzu4c8b.fsf%40mozart.stat.wisc.edu