Simpson's Paradox refers to the reversal of the direction of a comparison or an association when data from several groups are combined to form a single group.
This is best explained by using an example:
(example partially taken from the text The Practice of Statistics, second edition, by David Moore and George McCabe)
Say there are two schools in Michigan: one for business, the other for law. Here are their admittance statistics:
Law school:
- 100 males applied, 10 were accepted
- 300 females applied, 100 were accepted
- Female admittance is at about 33% (100/300), while male admittance is only 10% (10/100)
Business school:
- 600 males applied, 480 were accepted
- 200 females applied, 180 were accepted
- Female admittance is at 90% (180/200), while male admittance is at 80% (480/600)
Each school admits a higher percentage of its female applicants than of its male applicants.
However, when the numbers are combined, here is what happens:
A total of 700 males applied across the two schools; 490 were accepted and 210 were denied.
A total of 500 females applied across the two schools; 280 were accepted and 220 were denied.
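This brings the acceptance rates to 56% for females (280/500) and 70% for males (490/700). When the statistics are combined, the schools appear to favor men. The reversal happens because most of the men (600 of 700) applied to the business school, which accepts most applicants of either sex, while most of the women (300 of 500) applied to the law school, which accepts far fewer.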
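The arithmetic above can be checked with a short script. This is only a minimal sketch: the counts are copied from the example, while the names `counts`, `rate`, `per_school`, and `combined` are illustrative choices and do not come from the textbook.

```python
# Applicant and acceptance counts copied from the example above.
# Layout: school -> sex -> (applied, accepted)
counts = {
    "law": {"male": (100, 10), "female": (300, 100)},
    "business": {"male": (600, 480), "female": (200, 180)},
}


def rate(applied, accepted):
    """Return the acceptance rate as a fraction between 0 and 1."""
    return accepted / applied


# Acceptance rate within each school, computed separately for each sex.
per_school = {
    school: {sex: rate(*pair) for sex, pair in by_sex.items()}
    for school, by_sex in counts.items()
}

# Acceptance rate after pooling the applicants of both schools.
combined = {}
for sex in ("male", "female"):
    applied = sum(counts[school][sex][0] for school in counts)
    accepted = sum(counts[school][sex][1] for school in counts)
    combined[sex] = rate(applied, accepted)

for school, rates in per_school.items():
    print(f"{school}: male {rates['male']:.0%}, female {rates['female']:.0%}")
print(f"combined: male {combined['male']:.0%}, female {combined['female']:.0%}")
```

Within each school the female rate is higher, yet the pooled male rate is higher. Each pooled rate is an average of the per-school rates weighted by where each group applied, and the two groups have very different weights.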