A selection bias (aka Selection Effect) is similar to the dreaded scientific sin of Selecting on the Dependent Variable, but not quite. It is typically a flaw in research design that manifests in a skewed set of study subjects. Put in less priggish English, it works as follows:
Suppose you have been handed a study assignment by your advisor. S/he would like you to perform an inquiry into voting patterns; to wit, you are to find out if apathetic people vote more conservatively than energetic people. Okay, say you, and you roll up your sleeves to head for the library and the datacenter to grab several of those Godawful only-distributed-on-magtape-older-than-your-great-uncle-Herman-who-smells-funny U.S. Census data sets. As you set this ancient spool of useless data to be sucked up and munched by your STATA workstation, you start to ponder: how to do this?
Well, one answer would be to check the correlation between election outcomes (conservative vs. progressive) and (lucky you, perfect!) a questionnaire distributed by the Census. It asks people to rate themselves as apathetic or energetic. Now, setting aside the problems of self-rated data (do you trust those people not to lie?), you have the following.
On one variable, x, you've coded the result of the elections for, say, one hundred random counties across the U.S. On the other, y, you've coded the degree to which each of those counties rated itself as 'apathetic' or 'energetic.' You intend to crunch these numbers to find the correlation r between the two and, from it, r², the share of the variation in one that the other accounts for.
See the problem yet?
See, you don't know who voted. By the nature of the question, the apathetic people are more likely to have not voted, preferring to laze around the house. If that's true, then the energetics are overrepresented in the election results, even if the county is mostly apathetic. By studying 'voters,' you've 'selected' for energetic people.
And this, my friends, is the problem.
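To put rough numbers on it, here is a minimal Python sketch of a single made-up county. Everything in it is an assumption for illustration: the function name simulate_county, the 70% apathetic share, and the turnout rates (20% for the apathetic, 80% for the energetic) are invented, not taken from any Census or election data. It only shows how the energetic share among voters can drift far from the energetic share among residents.

```python
import random


def simulate_county(n_people, share_apathetic, turnout_apathetic,
                    turnout_energetic, rng):
    """Return (energetic share among residents, energetic share among voters).

    All parameters are hypothetical; nothing here is real survey data.
    """
    energetic_residents = 0
    energetic_voters = 0
    voters = 0
    for _ in range(n_people):
        # Each resident is energetic with probability 1 - share_apathetic.
        energetic = rng.random() >= share_apathetic
        # Turnout depends on temperament: this is the selection mechanism.
        turnout = turnout_energetic if energetic else turnout_apathetic
        voted = rng.random() < turnout
        energetic_residents += energetic
        if voted:
            voters += 1
            energetic_voters += energetic
    return energetic_residents / n_people, energetic_voters / voters


rng = random.Random(0)  # fixed seed so reruns give the same numbers
pop_share, voter_share = simulate_county(
    n_people=10_000,
    share_apathetic=0.7,    # assumed: the county is mostly apathetic
    turnout_apathetic=0.2,  # assumed turnout rates
    turnout_energetic=0.8,
    rng=rng,
)
print(f"energetic among residents: {pop_share:.2f}")
print(f"energetic among voters:    {voter_share:.2f}")
```

With these assumed rates, the energetic make up about 30% of residents but should come out at roughly 0.24 / 0.38, or a bit over 60%, of the people who actually voted. The election result reflects that second group, not the county the questionnaire describes.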
Now, I am aware how much my rusty, sucky stats knowledge bites. So if you find errors in this, please, /msg me and let me know! I may, in fact, have this entirely wrong! Heh.