Chernoff bounds give bounds on the tails of the distribution of Poisson trials. Consider an event which happens with probability p, and let X be the number of times the event happens in n independent trials. Since the event happens with probability p in each trial, X follows a binomial distribution with expected value np. However, the expected value alone says little about how X is spread around it. Chernoff bounds provide that information.
Chernoff bounds say:
P(X ≥ (1+x)np) < e^(-x²np/3)
P(X ≤ (1-x)np) < e^(-x²np/2)
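As a sanity check, the two bounds can be evaluated directly. This is a minimal sketch; the function names are my own:

```python
import math

def chernoff_upper(x, n, p):
    """Chernoff bound on P(X >= (1+x)np)."""
    return math.exp(-x * x * n * p / 3)

def chernoff_lower(x, n, p):
    """Chernoff bound on P(X <= (1-x)np)."""
    return math.exp(-x * x * n * p / 2)

# For n = 1000 trials with p = 6% and a 10% relative deviation:
print(chernoff_upper(0.1, 1000, 0.06))  # e^(-0.2) ~ 0.819
print(chernoff_lower(0.1, 1000, 0.06))  # e^(-0.3) ~ 0.741
```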
Conceptually, these bounds say that the number of events that actually occur is close to the expected number. Consider the case of an unemployment survey. You pick 1000 people, and each says they are unemployed with probability p (presume p = 6%). What is the probability that you report unemployment of 6.6% or more based on the results of your survey? P(X ≥ (1+0.1)np) < e^(-0.1²·1000·0.06/3) = e^(-0.2) ≈ 81.9%. If you want accurate results, you'll need to ask more people (*). If you ask 10000 people instead, the Chernoff bound drops to e^(-2) ≈ 13.5%.
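For comparison, the exact binomial tail probability for the survey example can be computed directly. The exact tail sum below is my own check against the bound, not part of the bound itself:

```python
import math

def binom_tail(n, p, k):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

n, p, x = 1000, 0.06, 0.1
bound = math.exp(-x * x * n * p / 3)       # Chernoff upper bound, e^(-0.2)
k = round((1 + x) * n * p)                 # threshold: 66 unemployed people
exact = binom_tail(n, p, k)                # exact P(X >= 66)
print(exact, "<", bound)                   # the bound holds, though loosely here
```

The Chernoff bound is not tight for small n; its value lies in how quickly it shrinks as n grows.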
These bounds are used quite often in the complexity analysis of randomized algorithms. The (relative) simplicity of these formulae allows algorithms to be designed by careful selection of x.
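For instance, inverting the first bound tells you how many trials suffice to push the failure probability below a target δ. A hedged sketch; trials_needed is a name I made up:

```python
import math

def trials_needed(x, p, delta):
    """Smallest n with e^(-x^2 * n * p / 3) < delta, i.e. enough trials
    that P(X >= (1+x)np) falls below the failure probability delta."""
    return math.ceil(3 * math.log(1 / delta) / (x * x * p))

# To estimate a 6% rate within 10% relative error, with failure
# probability at most 1%:
print(trials_needed(0.1, 0.06, 0.01))  # about 23000 trials
```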
These bounds are also useful for bounding error distributions on experiments and surveys (as in the example above).
Chernoff bounds in general are probability tail bounds. Herman Chernoff introduced the technique in his 1952 paper "A Measure of Asymptotic Efficiency for Tests of a Hypothesis Based on the Sum of Observations". The bounds are derived from the moment generating function.
The latter inequality can be generalized to deal with non-uniform Poisson trials (each trial succeeding with its own probability) by replacing np with μ, the expected value of X. I do not know a similarly simple form of the generalization of the first inequality.
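A quick simulation illustrates the generalized lower-tail bound with unequal probabilities. This is a sketch under trial probabilities of my own choosing:

```python
import math
import random

random.seed(1)  # deterministic run

probs = [0.02, 0.05, 0.10, 0.20] * 250   # 1000 trials, unequal probabilities
mu = sum(probs)                           # E[X] = 92.5
x = 0.3
bound = math.exp(-x * x * mu / 2)         # generalized bound on P(X <= (1-x)mu)

runs = 2000
hits = sum(
    sum(random.random() < p for p in probs) <= (1 - x) * mu
    for _ in range(runs)
)
print(hits / runs, "<=", bound)  # empirical frequency stays below the bound
```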
(*) Strictly speaking, a large upper bound does not mean the actual probability is large. However, if you ask enough people, these bounds will prove that the probability of error is small. Consult confidence interval for more information on survey error.