Introduction

In 1936, a survey of more than 2 million Americans predicted the wrong presidential winner. During World War II, military engineers nearly reinforced the wrong parts of bomber aircraft. Both mistakes were caused by the same statistical problem: selection bias.

Data is often presented as objective, reliable, and factual. But even accurate data can produce misleading conclusions when the wrong people, events, or observations are included in the analysis. This is where selection bias comes into play.

Selection bias occurs when a sample does not accurately represent the population being studied. As a result, researchers, businesses, journalists, and policymakers may draw conclusions that appear valid but are actually distorted.

These selection bias examples show how this statistical problem appears in real life, from scientific research and business decisions to social media polls and financial markets. Understanding selection bias is one of the most important steps toward recognizing misleading statistics and becoming a more critical consumer of data.

Example | Type of Bias | Main Issue
WWII Aircraft | Survivorship Bias | Missing failed cases
Literary Digest Poll | Sampling Bias | Wealthy voters overrepresented
Social Media Polls | Volunteer Bias | Self-selected respondents
Medical Research | Publication Bias | Negative studies missing

Quick Answer

Selection bias occurs when a sample does not accurately represent the population being studied, leading to conclusions that may be misleading or incorrect.

What Is Selection Bias?

Selection bias occurs when the data being analyzed comes from a non-representative sample of the population.

In simple terms, some people, events, or outcomes are included while others are systematically excluded. Because the sample is skewed, the conclusions based on that sample are often misleading.

Selection bias is one of the most common causes of misleading statistics. It can affect scientific studies, surveys, business reports, hiring decisions, medical research, and even everyday news stories.

Why Selection Bias Matters

The danger of selection bias is that it often goes unnoticed.

The numbers themselves may be perfectly accurate. The problem lies in who or what was included in the data.

When selection bias occurs:

  • Researchers may publish inaccurate findings.
  • Businesses may make poor strategic decisions.
  • Investors may misjudge risk.
  • Policymakers may create ineffective programs.
  • Consumers may be misled by statistics that appear trustworthy.

Understanding selection bias helps you evaluate data more critically and avoid being fooled by incomplete information.

1. Survivorship Bias in World War II Aircraft

One of the most famous selection bias examples comes from World War II.

The U.S. military wanted to reinforce bomber aircraft to reduce losses. Engineers examined returning planes and mapped where bullet holes appeared most frequently. Their initial conclusion was straightforward: reinforce the areas with the most damage.

Statistician Abraham Wald noticed a critical flaw.

The military was only studying planes that survived the mission. Aircraft that had been hit in more critical locations never returned and therefore never appeared in the data.

The missing planes were the key to understanding the problem.

Instead of reinforcing the areas with visible damage, the military needed to reinforce the areas where no bullet holes were observed on surviving aircraft.

This example illustrates survivorship bias, a specific form of selection bias where only successful outcomes are analyzed while failures are ignored.

Figure: The famous World War II bomber study revealed how missing data can lead to incorrect conclusions.

Key Lesson

Always ask what data might be missing from the analysis.
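
A small simulation can make Wald's point concrete. The sketch below is illustrative only: the four aircraft sections, the loss probabilities, and the function name simulate_hits are assumptions invented for this example, not historical figures. Hits are spread evenly across sections, but an engine hit is assumed to be far more likely to bring the plane down, so engine hits become scarce among the planes that return.

```python
import random
from collections import Counter

# Hypothetical chance that a single hit in each section downs the plane.
LOSS_PROBABILITY = {"engine": 0.60, "fuselage": 0.05, "wings": 0.05, "tail": 0.05}


def simulate_hits(n_planes, seed=0):
    """Give each plane one hit and record where it landed, for all planes
    and separately for the planes that made it back."""
    rng = random.Random(seed)
    sections = list(LOSS_PROBABILITY)
    all_hits = Counter()
    returned_hits = Counter()
    for _ in range(n_planes):
        section = rng.choice(sections)  # hits are equally likely in every section
        all_hits[section] += 1
        if rng.random() >= LOSS_PROBABILITY[section]:  # the plane survives
            returned_hits[section] += 1
    return all_hits, returned_hits


def share(counter, key):
    """Fraction of recorded hits that fall in the given section."""
    return counter[key] / sum(counter.values())


all_hits, returned_hits = simulate_hits(10_000)
print(f"Engine share of all hits:           {share(all_hits, 'engine'):.1%}")
print(f"Engine share on returning aircraft: {share(returned_hits, 'engine'):.1%}")
```

Under these assumptions, the engine looks lightly damaged in the returning planes precisely because engine hits are the ones that remove planes from the data.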

2. Volunteer Bias in Surveys

Many surveys rely on voluntary participation.

The problem is that people who choose to respond are often different from people who ignore the survey.

For example, a survey about political issues may attract individuals with stronger political opinions. Likewise, a customer satisfaction survey may receive responses mainly from very happy or very dissatisfied customers.

The result is a sample that no longer reflects the broader population.

Volunteer bias is particularly common in online surveys, social media polls, and market research studies.

Key Lesson

People who volunteer to participate are rarely representative of everyone else.

3. The Literary Digest Election Disaster

One of the most famous failures in polling history occurred during the 1936 U.S. presidential election.

The magazine Literary Digest conducted a massive survey involving millions of respondents. Based on the results, the magazine confidently predicted that Alf Landon would defeat Franklin D. Roosevelt.

Instead, Roosevelt won in a landslide.

The survey itself was enormous, but the sampling method was deeply flawed.

The magazine collected names from telephone directories and automobile registration lists. During the Great Depression, telephone owners and car owners were disproportionately wealthy compared to the general population.

As a result, the survey systematically excluded many lower-income voters who strongly supported Roosevelt.

The sample size was huge, but it was not representative.

Figure: Even millions of survey responses can produce misleading results when the sample is not representative.

Key Lesson

A large sample cannot fix a biased sample.
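
The contrast between sample size and representativeness can be sketched in a few lines. Everything numeric here is a made-up assumption for illustration (30% of voters appear on the mailing lists, 40% of listed voters and 70% of unlisted voters support the incumbent); these are not the actual 1936 figures. The code compares a very large sample drawn only from listed voters with a small simple random sample drawn from everyone.

```python
import random


def build_population(n_voters, rng):
    """Return a list of (on_mailing_list, supports_incumbent) pairs."""
    population = []
    for _ in range(n_voters):
        on_list = rng.random() < 0.30  # hypothetical: 30% own a phone or car
        support_rate = 0.40 if on_list else 0.70  # hypothetical support rates
        population.append((on_list, rng.random() < support_rate))
    return population


def support_share(voters):
    """Fraction of the given voters who support the incumbent."""
    return sum(supports for _, supports in voters) / len(voters)


rng = random.Random(1936)  # fixed seed so the run is repeatable
population = build_population(200_000, rng)
true_support = support_share(population)

# Huge sample, but drawn only from people who appear on the lists.
listed = [voter for voter in population if voter[0]]
biased_estimate = support_share(rng.sample(listed, 40_000))

# Small simple random sample drawn from the whole population.
random_estimate = support_share(rng.sample(population, 1_000))

print(f"True support:                 {true_support:.3f}")
print(f"Biased sample (n=40,000):     {biased_estimate:.3f}")
print(f"Random sample (n=1,000):      {random_estimate:.3f}")
```

In this setup the 1,000-person random sample lands close to the true figure, while the 40,000-person sample stays far off no matter how many more listed voters are added.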

4. Publication Bias in Medical Research

Medical research is often viewed as one of the most rigorous forms of scientific investigation.

However, publication bias can create a distorted picture of reality.

Studies that find positive or statistically significant results are more likely to be published than studies that find no effect.

Imagine that twenty studies test a new medication.

If only the three studies showing positive results are published, doctors and patients may believe the treatment is far more effective than it actually is.

This creates a selection effect in the scientific literature itself.

Readers only see part of the evidence.

Key Lesson

The studies you see may not represent all the studies that were conducted.
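
The twenty-study scenario can be worked through with invented numbers. In the sketch below, the effect estimates and the publication threshold are hypothetical; the threshold is only a stand-in for "positive and statistically significant", not a real significance test.

```python
from statistics import mean

# Hypothetical effect estimates from twenty studies of the same medication
# (arbitrary units; zero means no effect). These numbers are invented.
study_effects = [
    0.42, -0.10, 0.05, 0.38, -0.22, 0.01, 0.12, -0.05, 0.51, -0.15,
    0.08, -0.02, 0.10, -0.18, 0.03, 0.15, -0.08, 0.00, -0.12, 0.07,
]

# Assumed stand-in for "positive and statistically significant".
PUBLICATION_THRESHOLD = 0.30

# Only studies above the threshold reach the literature.
published = [effect for effect in study_effects if effect > PUBLICATION_THRESHOLD]

print(f"Studies conducted: {len(study_effects)}, published: {len(published)}")
print(f"Mean effect, all studies:       {mean(study_effects):.3f}")
print(f"Mean effect, published studies: {mean(published):.3f}")
```

With these made-up values, the three published studies average about 0.44, while all twenty together average 0.05. A reader who sees only the published results would overestimate the effect many times over.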

5. Customer Satisfaction Surveys

Companies frequently use customer satisfaction surveys to evaluate performance.

Unfortunately, the customers who complete these surveys are often not representative of the entire customer base.

Most people with average experiences simply move on with their day.

Customers who had an exceptionally positive experience or an extremely negative one are much more likely to leave feedback.

This means the results are often dominated by outliers rather than typical customers.

A company may believe its customers are either delighted or furious when most customers actually feel neutral.

Key Lesson

Extreme opinions are often overrepresented in survey data.
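
A short calculation shows how unequal response rates reshape survey results. The customer counts and response probabilities below are assumptions chosen for illustration: half of the customers are neutral, but customers with extreme ratings are assumed to respond far more often.

```python
# Hypothetical customer base: rating (1-5) -> number of customers.
customers = {1: 50, 2: 100, 3: 500, 4: 250, 5: 100}

# Assumed probability that a customer with each rating completes the survey.
response_rate = {1: 0.40, 2: 0.10, 3: 0.02, 4: 0.10, 5: 0.40}

# Expected number of survey responses per rating.
responses = {rating: customers[rating] * response_rate[rating] for rating in customers}


def share(counts, ratings):
    """Fraction of the total count that falls in the given ratings."""
    return sum(counts[r] for r in ratings) / sum(counts.values())


print(f"Neutral (3) among all customers:   {share(customers, [3]):.1%}")
print(f"Neutral (3) among respondents:     {share(responses, [3]):.1%}")
print(f"Extreme (1 or 5) among customers:  {share(customers, [1, 5]):.1%}")
print(f"Extreme (1 or 5) among responses:  {share(responses, [1, 5]):.1%}")
```

Under these assumptions, neutral customers make up 50% of the customer base but under 10% of the responses, while the extreme ratings grow from 15% of customers to more than half of the responses.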

6. Startup Success Stories

Business books frequently analyze successful companies such as Amazon, Apple, Airbnb, and Netflix.

Authors identify habits, strategies, and decisions that supposedly explain their success.

The problem is that these analyses rarely examine the thousands of companies that followed similar strategies and failed.

By focusing exclusively on winners, we create a distorted understanding of what drives business success.

Factors such as timing, luck, market conditions, and competition may be far more important than many success stories suggest.

Key Lesson

Success stories rarely tell the whole story.

7. Social Media Polls

Social media polls are often shared as evidence of public opinion.

In reality, they are usually poor representations of the broader population.

The people who see a poll are already part of a specific online audience. Those who choose to vote are even more selective.

A poll conducted on X, Facebook, Reddit, or Instagram typically reflects the views of a highly engaged subgroup rather than society as a whole.

Yet these results are frequently cited as if they represent public opinion.

Figure: Social media polls often reflect the opinions of highly engaged users rather than the general public.

Poll results can become even more deceptive when combined with misleading visualizations.

Key Lesson

Social media users are not a random sample of the population.

8. Online Product Reviews

When shopping online, consumers often rely heavily on product ratings and reviews.

However, reviewers are not representative of all buyers.

People who are extremely satisfied or extremely disappointed are more likely to leave reviews than customers with average experiences.

In addition, some sellers actively encourage happy customers to submit feedback.

As a result, ratings can provide a distorted view of overall customer satisfaction.

Key Lesson

Reviewers are a self-selected group, not a random sample of customers.

9. Hiring and Recruitment Data

Many organizations analyze their highest-performing employees to identify characteristics associated with success.

For example, a company might notice that many top performers attended prestigious universities.

The company may then prioritize candidates from similar institutions.

The problem is that the analysis only includes people who were hired.

It ignores qualified candidates who were rejected and never given the opportunity to demonstrate their abilities.

This creates a powerful selection bias that can reinforce existing hiring practices.

Key Lesson

You cannot learn about rejected candidates by studying only successful hires.

10. Sports Statistics

Sports statistics often appear objective, but selection bias can still influence interpretations.

Professional athletes represent a tiny fraction of all individuals who attempted to compete at high levels.

The athletes who reach the professional stage have already survived years of competition, training, injuries, and selection processes.

Comparisons between athletes or eras can therefore be misleading if differences in selection conditions are ignored.

Key Lesson

Elite performers are not representative of the broader population.

11. Historical Records

Our understanding of history depends on documents and artifacts that survived.

The problem is that many records were destroyed, lost, or never preserved.

As a result, historians often rely on evidence produced by wealthy individuals, governments, religious institutions, and other groups with greater resources.

The experiences of ordinary people may be underrepresented.

This creates a form of selection bias in the historical record itself.

Key Lesson

Our picture of history is often shaped by which records survived rather than by everything that actually happened.

12. Investment Fund Performance

Investment databases frequently report the historical performance of mutual funds, hedge funds, and exchange-traded funds.

However, poorly performing funds often close and disappear from the dataset.

Funds that survive tend to have stronger performance records.

When analysts calculate average returns using only active funds, the results appear better than reality.

This is another classic example of survivorship bias.

Key Lesson

Performance data may exclude the investments that failed.
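
The effect on average returns is easy to see with a toy dataset. The eight funds, their returns, and their active or closed status below are all hypothetical; the sketch simply compares the average over surviving funds with the average over every fund that ever existed.

```python
from statistics import mean

# Hypothetical funds: (name, average annual return, still_active).
funds = [
    ("Fund A", 0.09, True),
    ("Fund B", 0.07, True),
    ("Fund C", 0.11, True),
    ("Fund D", -0.04, False),  # closed after poor performance
    ("Fund E", -0.12, False),  # closed after poor performance
    ("Fund F", 0.02, True),
    ("Fund G", -0.08, False),  # closed after poor performance
    ("Fund H", 0.05, True),
]

# What a database of currently active funds would report.
survivor_returns = [ret for _, ret, active in funds if active]

# What an investor picking among all eight funds would have faced.
all_returns = [ret for _, ret, _ in funds]

print(f"Average return, active funds only: {mean(survivor_returns):.2%}")
print(f"Average return, all funds:         {mean(all_returns):.2%}")
```

With these invented figures, the active funds average 6.8% a year, while the full set, including the three closed funds, averages 1.25%.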

Types of Selection Bias

Figure: Types of selection bias, including sampling bias, volunteer bias, survivorship bias, attrition bias, and publication bias.

Selection bias is a broad category that includes several related statistical problems. Understanding these subtypes makes it easier to recognize misleading data in the real world.

Sampling Bias

Sampling bias occurs when certain members of a population are more likely to be included in a study than others.

For example, surveying only college students to understand the opinions of all adults would create sampling bias because the sample excludes large portions of the population.

Volunteer Bias

Volunteer bias occurs when participation is optional and the people who choose to participate differ from those who do not.

Online surveys, customer feedback forms, and social media polls are common examples.

Survivorship Bias

Survivorship bias occurs when only successful outcomes are analyzed while failures are ignored.

The World War II aircraft example is one of the best-known cases.

Attrition Bias

Attrition bias can arise when participants drop out of a study over time.

If the people who leave differ significantly from those who remain, the final results may become misleading.

Publication Bias

Publication bias occurs when studies with positive results are more likely to be published than studies with negative or inconclusive findings.

This can create a distorted view of scientific evidence.

How to Spot Selection Bias

Figure: Checklist for identifying selection bias in surveys, studies, and statistical datasets. Asking who is missing from the data is often the fastest way to identify selection bias.

Identifying selection bias is often easier when you ask a few simple questions.

1. Who Is Missing?

The most important question is often the simplest.

Who was excluded from the sample?

If important groups are missing, the results may not represent reality.

2. How Was the Data Collected?

Examine the methodology.

Was the sample randomly selected, or did participants choose themselves?

Convenience samples frequently introduce bias.

3. Are Only Success Stories Being Studied?

Whenever you see examples of successful people, companies, or investments, ask whether failures were excluded from the analysis.

4. Were There Non-Responders?

Many surveys suffer from low response rates.

People who ignore surveys may have very different opinions from those who respond.

5. Is There Evidence of Publication Bias?

In scientific research, check whether unpublished studies may exist.

A published result does not necessarily represent all available evidence.

How to Avoid Selection Bias

While selection bias cannot always be eliminated completely, several strategies can significantly reduce its impact.

Use Random Sampling

Simple random sampling gives every member of the population an equal chance of being selected.

This is one of the most effective ways to improve representativeness.
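
As a minimal sketch, the difference between a convenience sample and a simple random sample can be shown with Python's standard random module. The sampling frame is hypothetical: 1,000 adults, of whom the first 300 entries are students, echoing the earlier example of surveying only college students.

```python
import random

# Hypothetical sampling frame of 1,000 adults; the first 300 are students.
frame = [{"id": i, "student": i < 300} for i in range(1000)]


def student_share(sample):
    """Fraction of the sample who are students."""
    return sum(person["student"] for person in sample) / len(sample)


# Convenience sample: whoever is easiest to reach (here, the first 100 entries).
convenience_sample = frame[:100]

# Simple random sample: every entry has the same chance of selection.
rng = random.Random(42)  # fixed seed so the run is repeatable
random_sample = rng.sample(frame, 100)

print(f"Students in the frame:              {student_share(frame):.0%}")
print(f"Students in the convenience sample: {student_share(convenience_sample):.0%}")
print(f"Students in the random sample:      {student_share(random_sample):.0%}")
```

The convenience sample consists entirely of students, while the random sample should land near the 30% share in the frame, give or take ordinary sampling error. Random selection only helps if the frame itself covers the population of interest.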

Increase Response Rates

Researchers should use follow-up reminders and incentives to encourage participation from a wider range of individuals.

Track Missing Data

Understanding who dropped out or failed to respond can reveal potential sources of bias.
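
One simple way to do this, sketched below with invented data, is to compare responders and non-responders on something that is known for everyone who was invited. Here the assumption is that age is available from the sampling frame; the records and field names are hypothetical.

```python
from statistics import mean

# Hypothetical survey invitations. Age is assumed to be known from the
# sampling frame for everyone invited, whether or not they answered.
invited = [
    {"age": 23, "responded": False},
    {"age": 58, "responded": True},
    {"age": 31, "responded": False},
    {"age": 64, "responded": True},
    {"age": 27, "responded": False},
    {"age": 47, "responded": True},
    {"age": 35, "responded": False},
    {"age": 71, "responded": True},
    {"age": 42, "responded": False},
    {"age": 29, "responded": False},
]

responders = [p for p in invited if p["responded"]]
non_responders = [p for p in invited if not p["responded"]]

response_rate = len(responders) / len(invited)
responder_age = mean([p["age"] for p in responders])
non_responder_age = mean([p["age"] for p in non_responders])

print(f"Response rate:               {response_rate:.0%}")
print(f"Mean age, responders:        {responder_age:.1f}")
print(f"Mean age, non-responders:    {non_responder_age:.1f}")
```

In this made-up dataset the responders are much older on average than the people who did not answer, which is a warning sign that the survey results may not carry over to younger invitees.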

Seek Negative Results

Researchers and decision-makers should actively look for evidence that contradicts their assumptions.

Compare Multiple Sources

No dataset is perfect.

Comparing findings across multiple sources can help identify inconsistencies caused by selection bias.

This section was written with the help of the American Statistical Association.

Why Selection Bias Matters More Than Ever

In today’s world, we are surrounded by data.

Companies collect customer information, governments publish statistics, researchers release studies, and social media platforms generate endless streams of metrics.

Yet more data does not automatically mean better conclusions.

If the underlying sample is biased, even sophisticated analysis can produce misleading results.

This is why selection bias remains one of the most important concepts in statistics. Whether you’re evaluating a scientific study, reading a news article, interpreting a business report, or comparing investment opportunities, understanding how data was selected is often more important than the numbers themselves.

People frequently assume that statistics are objective. In reality, statistics are only as reliable as the data used to produce them.

Frequently Asked Questions

What is a simple example of selection bias?

A classic example is studying only World War II aircraft that returned from missions. Because destroyed planes were excluded from the data, analysts initially reached the wrong conclusion about where armor was needed.

What causes selection bias?

Selection bias occurs whenever the sample being analyzed differs systematically from the population it is supposed to represent. This can happen through poor sampling methods, self-selection, non-response, publication practices, or participant dropouts.

Is survivorship bias a type of selection bias?

Yes. Survivorship bias is one of the most common forms of selection bias. It occurs when only successful outcomes are included in an analysis while failures are ignored.

Why is selection bias important in statistics?

Selection bias can lead to incorrect conclusions even when calculations are performed correctly. If the sample is not representative, the results may not reflect reality.

How does selection bias affect research?

Selection bias can exaggerate treatment effects, distort survey results, and create misleading scientific conclusions. Researchers use techniques such as random sampling and study pre-registration to reduce its impact.

Can selection bias be completely eliminated?

In practice, completely eliminating selection bias is difficult. However, careful study design, representative sampling, and transparency about limitations can greatly reduce the problem.

What is the difference between selection bias and sampling bias?

Sampling bias is a specific type of selection bias. Sampling bias refers to problems with how participants are selected, while selection bias is the broader concept that includes several related forms of systematic distortion.

Conclusion

These selection bias examples demonstrate how easily data can become misleading when the sample does not accurately represent reality.

From World War II aircraft and medical research to social media polls and investment performance, selection bias appears in countless situations where important information is missing from the analysis.

The challenge is that selection bias often remains invisible. The numbers may look accurate. The charts may appear convincing. The conclusions may sound reasonable. Yet if the underlying sample is flawed, the entire analysis can become misleading.

The next time you encounter a statistic, survey, study, or headline, take a moment to ask a simple question:

Who is missing from the data?

That single question can help you identify selection bias, avoid misleading statistics, and make better decisions based on evidence rather than incomplete information.

Many of the most famous examples of misleading statistics are ultimately caused by selection bias.