In 1951, John von Neumann presented a method for removing all 0/1 bias from a random number generator. This method is often used as-is even today, though various researchers have come up with somewhat more efficient methods. It is especially common for dealing with hardware RNGs, as most software RNGs either aren't that picky (for example, most uses of C's rand() call), or use strong cryptographic algorithms, like SHA-1 and AES,
which will generate even bit distributions without any extra help. It is especially popular because it is very easy to code in software or hardware. I would guess it only requires a handful of gates, though I haven't tried designing one myself - if there are any EE types who would like to give it a shot, please either add a writeup or let me know, I'd love to see it.
Other terms for this include "von Neumann transition mapping" and "von Neumann unbiasing". One could think of these terms as referring to the algorithm itself, and a "von Neumann corrector" as an implementation of it. Calling it a "corrector" seems to be the most common way of referring to
The algorithm works on pairs of bits, and produces output as follows:
- If the input is "00" or "11", the input is discarded (no output).
- If the input is "10", output a "1".
- If the input is "01", output a "0".
In the last two steps, you can alternately use the last bit rather than the first as the output (ie, for "10", output a "0", and similarly for "01"). It is quite trivial to show that regardless of the distribution of 0 and 1 bits
in the input, the output will always have a 50/50 split of 0s and 1s. This is often given as a homework problem in probability or information theory courses in colleges, and having solved it at least two or three times now I'm rather weary of it. If you really want to see a proof /msg me and I'll put one in.
This method does have some flaws. In particular, it will discard (on average) 75% of all input bits, even if the original input was perfectly random to start with. However, often when one is dealing with a hardware RNG (where the majority of von Neumann correctors are found), this is quite acceptable - the output rates of many common hardware RNGs are in a range from a few kilobytes / second (a device attached to an RS-232 serial line) to a few megabytes / second (the RNGs built into some recent CPUs/chipsets, such as the i810). Most applications needing randomness of such a high quality as to warrant the use of hardware in the first place only need a relatively small amount of it, for use as seed material for statistically random deterministic algorithms. So, while it's not terribly efficient, most users aren't going to be bothered by this fact.
Also, while this method will remove all 0/1 bias, it does not remove many other forms of bias - for example, an endless string of "01010101..." will be happily accepted (producing "00000..."), and other more obscure problems will also avoid the corrector†. Thus, while a von Neumann corrector is a very helpful tool in the design of an RNG, it is not a pancea for all your random bit woes.
†: Well, C-Dawg has called me out on this one. He points out that if the output is going to be evenly distributed between 0 and 1 bits regardless of the input biases, how is is possible that "010101..." produces "000..." (which is obviously not 50% 1 bits)? There is one additional restriction of the von Neumann corrector, probably the most important one in practice. This is that the input bits are independent. Each bit can be individually biased as much as you like, but as long as the bits in each pair are independent the output will have a uniform distribution. One easy way of getting this independence is to take alternating bits from two different hardware RNGs.