Sampling bias: What is it and why does it matter?

Sampling bias occurs when some members of a population are systematically more likely to be selected in a sample than others. It is also called ascertainment bias in medical fields.

Sampling bias limits the generalizability of findings because it is a threat to external validity, specifically population validity. In other words, findings from biased samples can only be generalized to populations that share characteristics with the sample.

Causes of sampling bias

Your choice of research design or data collection method can lead to sampling bias. Sampling bias can occur in both probability and non-probability sampling.

Sampling bias in probability samples

In probability sampling, every member of the population has a known, non-zero chance of being selected. For instance, you can use a random number generator to select a simple random sample from your population.

Although this procedure reduces the risk of sampling bias, it may not eliminate it. If your sampling frame – the actual list of individuals that the sample is drawn from – does not match the population, this can result in a biased sample.

Example of sampling bias in a simple random sample
You want to study procrastination and social anxiety levels in undergraduate students at your university using a simple random sample. You assign a number to every student in the research participant database from 1 to 1500 and use a random number generator to select 120 numbers.

Although you used a random sample, not every member of your target population – undergraduate students at your university – had a chance of being selected. Your sample misses anyone who did not sign up to be contacted about participating in research. This may bias your sample towards people who have less social anxiety and are more willing to participate in research.
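
The selection step in this example can be sketched in Python. This is a minimal illustration, assuming the database IDs run from 1 to 1500 as described; the function name, variable names, and seed are hypothetical and chosen only to make the draw reproducible.

```python
import random

# Sampling frame: the 1500 students in the research participant database.
# Students who never signed up have no ID here and can never be drawn.
FRAME_SIZE = 1500
SAMPLE_SIZE = 120


def draw_simple_random_sample(frame_size, sample_size, seed=None):
    """Return sample_size distinct IDs drawn uniformly from 1..frame_size."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, frame_size + 1), sample_size))


selected_ids = draw_simple_random_sample(FRAME_SIZE, SAMPLE_SIZE, seed=42)

# Every ID in the frame has the same chance of being selected.
selection_probability = SAMPLE_SIZE / FRAME_SIZE
print(len(selected_ids), selection_probability)
```

The draw is random within the frame, but it says nothing about students outside the frame. The bias in this example comes from the list itself, not from the random number generator.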

Sampling bias in non-probability samples

A non-probability sample is selected based on non-random criteria. For instance, in a convenience sample, participants are selected based on accessibility and availability.

Non-probability sampling often results in biased samples because some members of the population are more likely to be included than others.

Example of sampling bias in a convenience sample
You want to study the popularity of plant-based foods amongst undergraduate students at your university. For convenience, you send out a survey to everyone enrolled in Introduction to Psychology courses at your university. They all complete it in exchange for course credits.

Because this is a convenience sample, it is unlikely to be representative of your target population. People who take this course may be more liberal and more drawn towards plant-based foods than others at your university.

Types of sampling bias

| Type | Explanation | Example |
| --- | --- | --- |
| Self-selection | People with specific characteristics are more likely to agree to take part in a study than others. | People who are more thrill-seeking are more likely to take part in pain research studies. This may skew the data. |
| Non-response | People who refuse to participate or drop out from a study systematically differ from those who take part. | In a study on stress and workload, employees with high workloads are less likely to participate. The resulting sample may not vary greatly in terms of workload. |
| Undercoverage | Some members of a population are inadequately represented in the sample. | Administering general national surveys online may miss groups with limited internet access, such as the elderly and lower-income households. |
| Survivorship | Successful observations, people and objects are more likely to be represented in the sample than unsuccessful ones. | In scientific journals, there is strong publication bias towards positive results. Successful research outcomes are published far more often than null findings. |
| Pre-screening or advertising | The way participants are pre-screened or where a study is advertised may bias a sample. | When seeking volunteers to test a novel sleep intervention, you may end up with a sample that is more motivated to improve their sleep habits than the rest of the population. As a result, they may have been likely to improve their sleep habits regardless of the effects of your intervention. |
| Healthy user | Volunteers for preventative interventions are more likely to pursue health-boosting behaviors and activities than other members of the population. | A sample in a preventative intervention has a better diet, higher physical activity levels, abstains from alcohol, and avoids smoking more than most of the population. The experimental findings may be a result of the treatment interacting with these characteristics of the sample, rather than just the treatment itself. |

How to avoid or correct sampling bias

Using careful research design and sampling procedures can help you avoid sampling bias.

  • Define a target population and a sampling frame (the list of individuals that the sample will be drawn from). Match the sampling frame to the target population as much as possible to reduce the risk of sampling bias.
  • Make online surveys as short and accessible as possible.
  • Follow up on non-responders.
  • Avoid convenience sampling.
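
One way to check how well a sampling frame matches the target population is to compare each group's share of the frame with its share of the population. The sketch below assumes you have counts for both; the function name, the grouping by year of study, and all of the counts are hypothetical and invented purely for illustration.

```python
def coverage_gaps(population_counts, frame_counts):
    """Return (frame share - population share) for each population group."""
    pop_total = sum(population_counts.values())
    frame_total = sum(frame_counts.values())
    gaps = {}
    for group, pop_count in population_counts.items():
        pop_share = pop_count / pop_total
        frame_share = frame_counts.get(group, 0) / frame_total
        gaps[group] = frame_share - pop_share
    return gaps


# Hypothetical counts, not real data.
population = {"first_year": 3000, "second_year": 2500,
              "third_year": 2500, "fourth_year": 2000}
frame = {"first_year": 900, "second_year": 300,
         "third_year": 200, "fourth_year": 100}

gaps = coverage_gaps(population, frame)
for group, gap in gaps.items():
    # Positive gap: overrepresented in the frame; negative: undercovered.
    print(f"{group}: {gap:+.2f}")
```

Groups with a large negative gap are candidates for targeted recruitment or oversampling.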

Oversampling to avoid bias

Oversampling can be used to avoid sampling bias in situations where members of defined groups are underrepresented (undercoverage). This is a method of selecting respondents from some groups so that they make up a larger share of the sample than they actually do in the population.

After all data is collected, responses from oversampled groups are weighted down to match their actual share of the population, which corrects for the deliberate overrepresentation.

Example of oversampling to avoid sampling bias
A researcher wants to study the political opinions of different ethnic groups in the US and focus in depth on Asian Americans, who make up only 5.6% of the US population. The researcher wants to study each ethnic group separately, but also gather enough data about Asian Americans for precise conclusions.

They gather a nationally representative sample, with 1500 respondents, that oversamples Asian Americans. Random digit dialling is used to contact American households, and disproportionately larger samples are taken from regions with more Asian Americans. Of the 1500 respondents, 336 are Asian American. Based on this sample size, the researcher can be confident in their findings about Asian Americans.

Weighting is applied to ensure that the responses of Asian Americans account for 5.6% of the total. This allows for accurate estimates for the population as a whole.
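
The arithmetic behind this weighting can be sketched using the figures from the example. The source does not say which weighting method the researcher uses, so this assumes a simple ratio weight (population share divided by sample share) and, as a simplification, treats all other respondents as a single group; the function and variable names are hypothetical.

```python
# Figures taken from the example above.
TOTAL_RESPONDENTS = 1500
ASIAN_AMERICAN_RESPONDENTS = 336
ASIAN_AMERICAN_POPULATION_SHARE = 0.056


def ratio_weight(population_share, sample_share):
    """Weight that rescales a group's sample share to its population share."""
    return population_share / sample_share


# Asian Americans make up 336 / 1500 = 22.4% of the sample.
sample_share = ASIAN_AMERICAN_RESPONDENTS / TOTAL_RESPONDENTS

# Assumption: everyone else is treated as one group for simplicity.
weight_oversampled = ratio_weight(ASIAN_AMERICAN_POPULATION_SHARE, sample_share)
weight_others = ratio_weight(1 - ASIAN_AMERICAN_POPULATION_SHARE, 1 - sample_share)

# Check: after weighting, the oversampled group's share matches the population.
weighted_group = ASIAN_AMERICAN_RESPONDENTS * weight_oversampled
weighted_rest = (TOTAL_RESPONDENTS - ASIAN_AMERICAN_RESPONDENTS) * weight_others
weighted_share = weighted_group / (weighted_group + weighted_rest)

print(round(weight_oversampled, 3), round(weight_others, 3), round(weighted_share, 3))
```

Each Asian American response counts for about a quarter of a response in whole-population estimates, while analyses within that group can still use all 336 respondents.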

Frequently asked questions about sampling bias

What is sampling?

A sample is a subset of individuals from a larger population. Sampling means selecting the group that you will actually collect data from in your research. For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students.

In statistics, sampling allows you to test a hypothesis about the characteristics of a population.

What is sampling bias?

Sampling bias occurs when some members of a population are systematically more likely to be selected in a sample than others.

Why is sampling bias important?

Sampling bias is a threat to external validity – it limits the generalizability of your findings to a broader group of people.

What are some types of sampling bias?

Some common types of sampling bias include self-selection, non-response, undercoverage, survivorship, pre-screening or advertising, and healthy user bias.

How do you avoid sampling bias?

Using careful research design and sampling procedures can help you avoid sampling bias. Oversampling can be used to correct undercoverage bias.

Why are samples used in research?

Samples are used to make inferences about populations. Collecting data from a sample is more practical, cost-effective, convenient and manageable than collecting it from an entire population.

Pritha Bhandari

Pritha has an academic background in English, psychology and cognitive neuroscience. As an interdisciplinary researcher, she enjoys writing articles explaining tricky research concepts for students and academics.