Selection Bias happens when the sample is not representative

When testing an idea (e.g., a new product or process), conventional wisdom holds that it is important to start small and then to scale up if the idea succeeds. But an often-overlooked risk is necessarily associated with this approach: the sample that is tested in the “start small” phase may not be representative of the entire population within the intended scope of impact of the idea.

To truly achieve widespread impact, it’s not enough to understand how your current customers or audience differs across geographies, demographic groups, and so on. You also need to think about how your current audience might differ from your future one.

Put another way, is the initial audience (or test subjects, or market segment) that yielded your early success a representative snapshot of the larger group of people whom you hope to serve at scale? When looking at results in the early stages of any enterprise, you must check that you’re correctly gauging what scientists call the representativeness of the population.

Non-representativeness can arise either accidentally or through willful selection of the sample. When it arises accidentally, the phenomenon is known as selection bias: people opt in to programs in a non-random way. This is problematic because the people who choose to participate in a pilot program or study tend to be the ones most likely to benefit, so early results overstate the effect the idea would have on the wider population.
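The opt-in mechanism can be illustrated with a small simulation. This is only a sketch under made-up assumptions, none of which come from the book: each person's benefit is drawn from a normal distribution (mean 1.0, standard deviation 2.0), and the chance of volunteering rises with that benefit through a logistic curve. The function name `simulate_pilot` and all of its parameters are hypothetical.

```python
import math
import random
import statistics


def simulate_pilot(n_people=100_000, pilot_size=1_000, seed=0):
    """Compare a self-selected pilot with a random pilot (illustrative numbers only)."""
    rng = random.Random(seed)

    # Assumption: individual benefits vary widely around a modest average of 1.0.
    benefits = [rng.gauss(1.0, 2.0) for _ in range(n_people)]

    # Assumption: the more someone stands to gain, the likelier they are to opt in.
    volunteers = [b for b in benefits if rng.random() < 1 / (1 + math.exp(-b))]
    self_selected = rng.sample(volunteers, pilot_size)

    # Random selection: every person is equally likely to end up in the pilot.
    randomly_chosen = rng.sample(benefits, pilot_size)

    return {
        "population": statistics.fmean(benefits),
        "self_selected_pilot": statistics.fmean(self_selected),
        "random_pilot": statistics.fmean(randomly_chosen),
    }


if __name__ == "__main__":
    for name, mean_benefit in simulate_pilot().items():
        print(f"{name}: {mean_benefit:.2f}")
```

Under these assumptions the self-selected pilot reports a noticeably higher average benefit than the population as a whole, while the randomly chosen pilot lands close to the population average. The size of the gap depends entirely on the assumed opt-in rule, so the numbers should be read as an illustration of the direction of the bias, not as an estimate of its magnitude.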

The big-picture lesson here is one you ignore at your own peril: when assessing early responses to your idea, look under the hood and make sure the people in that group are representative of the larger population you ultimately hope to reach. To uncover true actionable knowledge, it is important to recognize heterogeneities rather than hide them.[1]


#mathematics #bias


  1. The Voltage Effect – List (2022), ch. 2, § “From Selection Biases to WEIRD People” ↩︎