What Is Undercoverage Bias? | Definition & Example

Undercoverage bias occurs when a part of the population is excluded from your sample. As a result, the sample is no longer representative of the target population. Non-probability sampling designs are susceptible to this type of research bias.

Example: Undercoverage bias
You are conducting research by randomly calling landline numbers. Because of your sampling method, individuals who only have mobile phones are not sampled. In this case, they are not merely undercovered, but not covered at all.

Undercoverage is a type of selection bias.

What is undercoverage bias?

Undercoverage bias is the systematic distortion of a study’s findings that arises when the sample leaves out, or underrepresents, part of the target population.

Ideally, researchers should draw a sample that, like a snapshot, adequately captures characteristics that are both present in the target population and relevant for the research. In other words, researchers aim to collect a representative sample.

In some cases, researchers may sample too few units from a specific segment of the population. If the segment is small in comparison to others in the population, this may not impact the research findings much. However, if the segment is larger, it can lead to a sample that doesn’t accurately capture the characteristics of the population.

In more extreme cases, researchers may completely fail to include a part of the population, which can distort the findings completely.

Keep in mind that two things must both happen for undercoverage bias to occur:

  • Some segments of the population have not been included in your sample but should have been
  • The included segments are different from the excluded ones in terms of one or more variables of interest

Example: Undercoverage bias
Online surveys exclude people who don’t have internet access. Previous research shows that internet access also relates to demographics like socioeconomic status and age.

Depending on your research objective, this may impact your results. If your research is on online shopping habits, opting for an online survey is a legitimate choice. If, however, you are interested in people’s voting intentions, running an online survey excludes a significant part of the population, leading to undercoverage bias.

If your sampling frame excludes a large part of your target population, you need to step back and consider how the excluded units may systematically differ from those included in your sample.
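The two conditions listed above can be made concrete with a small calculation. The sketch below assumes that the quantity of interest is a population mean and that no other source of error is present. Under those assumptions, the bias of the covered-group mean equals the share of the population that is excluded multiplied by the difference between the covered and excluded means. The function name `coverage_bias` and all numbers are hypothetical and chosen only for illustration.

```python
def coverage_bias(share_excluded: float, mean_covered: float, mean_excluded: float) -> float:
    """Bias of the covered-group mean relative to the full-population mean.

    Assumes the target quantity is a mean and ignores all other error sources.
    """
    if not 0.0 <= share_excluded <= 1.0:
        raise ValueError("share_excluded must be between 0 and 1")
    return share_excluded * (mean_covered - mean_excluded)


# Hypothetical numbers: 20% of the population is missing from the frame.
# The second condition fails (both groups have the same mean), so there is no bias.
no_difference = coverage_bias(0.20, mean_covered=50.0, mean_excluded=50.0)

# Both conditions hold: the excluded group differs, so the estimate is off.
with_difference = coverage_bias(0.20, mean_covered=50.0, mean_excluded=60.0)

print(no_difference)    # 0.0
print(with_difference)  # -2.0
```

If either factor is zero, the product is zero, which is why exclusion alone does not guarantee biased results.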

Undercoverage bias vs. nonresponse bias

Although undercoverage bias and nonresponse bias may seem similar, they are actually quite different.

  • Undercoverage bias occurs when some members of a population are left out of the sampling frame you use for your study, so they can never be selected.
  • Nonresponse bias occurs when some of the respondents you selected to be in your sample don’t respond.

In other words, undercoverage means that some units never make it into the sample or are inadequately represented. Nonresponse means that some units are included in the sample, but their responses are missing.
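One way to keep the two problems apart is to track them with separate rates. The sketch below is a simplified illustration: the function names and the study figures are hypothetical, and real surveys often use more detailed response-rate definitions than the plain ratio shown here.

```python
def coverage_rate(frame_size: int, population_size: int) -> float:
    """Share of the target population that appears in the sampling frame."""
    return frame_size / population_size


def response_rate(respondents: int, sampled: int) -> float:
    """Share of the sampled units that actually answered."""
    return respondents / sampled


# Hypothetical study: 10,000 people in the target population, 8,000 of them
# in the frame, 500 sampled from the frame, 300 of those responding.
print(coverage_rate(8_000, 10_000))  # 0.8 -> 20% can never be selected
print(response_rate(300, 500))       # 0.6 -> 40% were selected but are missing
```

The first rate describes who could have been sampled at all; the second describes who answered after being sampled.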

What causes undercoverage bias?

There are two main sources of undercoverage bias:

Non-probability sampling

Non-probability sampling designs like convenience sampling are especially prone to bias. When researchers recruit study participants based on proximity or ease of access, their results are unlikely to be representative of the population. The reason is that some members of the population of interest have little or no chance of being selected for the survey.

For example, if you stand at a shopping mall and select shoppers as they walk by to fill out a survey, you are neglecting to survey everyone not at the mall that day.

Incomplete sampling frames

Probability samples are not immune to undercoverage bias either. Simple random sampling can also yield biased results if the sampling frame is incomplete.

For example, if you use email lists or phone lists as a sampling frame, error may be introduced due to the makeup of the list. In other words, the individuals included in the frame may differ from those who are not. As a result, a segment of the population is not sampled at all or is underrepresented in the sample.
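The effect of an incomplete frame can be illustrated with a short simulation. The sketch below is built entirely on made-up data: a hypothetical population of 1,000 people, 700 of whom appear on a phone list, with scores that differ between listed and unlisted people by construction. It draws one simple random sample from the whole population and one from the list only, then compares both estimates with the true mean.

```python
import random
from statistics import mean


def build_population() -> list[dict]:
    """Hypothetical population of 1,000 people; 700 are on the phone list."""
    people = []
    for i in range(1000):
        on_list = i < 700
        # Made-up scores: listed people score 45-55, unlisted people 75-85.
        score = (45 if on_list else 75) + i % 11
        people.append({"on_list": on_list, "score": score})
    return people


def estimate_mean(units: list[dict], n: int, rng: random.Random) -> float:
    """Draw a simple random sample of n units and return their mean score."""
    return mean(person["score"] for person in rng.sample(units, n))


rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = build_population()
phone_list = [p for p in population if p["on_list"]]  # the incomplete frame

true_mean = mean(p["score"] for p in population)
from_full_population = estimate_mean(population, 200, rng)
from_incomplete_frame = estimate_mean(phone_list, 200, rng)

print(f"true mean:             {true_mean:.1f}")
print(f"SRS, complete frame:   {from_full_population:.1f}")
print(f"SRS, incomplete frame: {from_incomplete_frame:.1f}")
```

Because every unlisted person scores higher than every listed person in this toy setup, the estimate from the phone list stays below the true mean no matter how large the sample is. Random selection within the frame cannot recover units the frame never contained.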

Undercoverage bias example

Undercoverage varies in severity depending on the population studied.

Example: Incomplete sampling frame 
You are researching the self-reported health status of the employees at a big law firm. The human resources department agrees to give you a list of all employee email addresses. You use this list to draw a sample and ask them to fill in your survey.

However, the list is outdated. More recent recruits to the firm are not in your sampling frame because they weren’t on the email address list. If these new employees are not sampled, your survey results will be affected by undercoverage bias.

If there are only a couple of recent hires, this may not be a problem, and the discrepancy between the target population and the sample will be small. If the law firm recently hired many new employees, the discrepancy will be bigger, and so will the extent of the undercoverage bias.

How to avoid undercoverage bias

There are a few steps you can take to shield your research from undercoverage bias:

  • Familiarize yourself with your target population. Understanding your target population allows you to capture all relevant characteristics and subgroups.
  • Run a pilot survey. Before launching your survey, consider performing a trial run with fewer respondents. In this way, you can spot errors like undercoverage bias before launching the actual survey.
  • Combine multiple sources of data to build your sampling frame. For example, researchers can create housing unit frames by using an already-existing list of addresses. Next, they update them in the field by adding units that are missing and removing those that don’t exist anymore or are not residential.
  • Use probability sampling. If the goal of your research is to draw a representative sample, probability sampling lets you generalize your findings to the wider population more safely than non-probability sampling.
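The housing unit example can be sketched as a pair of set operations: start from the existing list, add the units found in the field, and remove the ones that turn out not to belong. The function name `update_frame` and the addresses are hypothetical, and the sketch assumes addresses are already written in a consistent form so that identical units match exactly.

```python
def update_frame(listed: set[str], found_in_field: set[str], not_eligible: set[str]) -> set[str]:
    """Add units missing from the list, then drop units that should not be there."""
    return (listed | found_in_field) - not_eligible


# Hypothetical addresses, for illustration only.
existing_list = {"1 Elm St", "2 Elm St", "3 Elm St", "4 Elm St"}
added_in_field = {"2A Elm St", "5 Elm St"}  # units missing from the list
removed_in_field = {"3 Elm St"}             # demolished or not residential

updated_frame = update_frame(existing_list, added_in_field, removed_in_field)
print(sorted(updated_frame))
# ['1 Elm St', '2 Elm St', '2A Elm St', '4 Elm St', '5 Elm St']
```

In practice, matching records across sources is usually the hard part, since the same unit may be spelled differently in each list.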

Frequently asked questions

What is the difference between undercoverage and nonresponse bias?

Undercoverage bias happens when segments of the target population are entirely excluded or less represented in the sample than they are in the population. This means that these segments are fully or partly left out of the sampling process.

Nonresponse bias occurs when parts of the sampled population are unable or refuse to respond. In other words, nonrespondents are included in the sampling process, but their answers (responses) are not registered.

What is undercoverage bias in statistics?

Undercoverage bias in statistics is the underrepresentation of a segment of the target population in the sample. If the distribution of characteristics in the sample differs significantly from the distribution in the target population, the dataset is likely to have undercoverage bias.
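A rough way to compare the two distributions is to divide each group's share in the sample by its share in the population. The sketch below assumes the population shares are known from an external source such as a census. The function name, the age groups, the counts, and the 0.8 cutoff are all hypothetical, and this is a descriptive check rather than a formal statistical test.

```python
def representation_ratios(population_shares: dict[str, float],
                          sample_counts: dict[str, int]) -> dict[str, float]:
    """Sample share divided by population share for each group.

    A ratio near 1 means the group is represented proportionally;
    a ratio well below 1 suggests the group is underrepresented.
    """
    total = sum(sample_counts.values())
    return {
        group: (sample_counts.get(group, 0) / total) / share
        for group, share in population_shares.items()
    }


# Hypothetical age distribution in the population and in a sample of 400.
population_shares = {"18-34": 0.30, "35-64": 0.50, "65+": 0.20}
sample_counts = {"18-34": 160, "35-64": 220, "65+": 20}

for group, ratio in representation_ratios(population_shares, sample_counts).items():
    # The 0.8 cutoff is an arbitrary illustration, not an established rule.
    flag = "  <- possibly undercovered" if ratio < 0.8 else ""
    print(f"{group}: {ratio:.2f}{flag}")
```

A low ratio shows that a group is underrepresented but not why: nonresponse can produce the same pattern, so the sampling frame still needs to be checked directly.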

Sources in this article

This Scribbr article

Nikolopoulou, K. (2023, March 24). What Is Undercoverage Bias? | Definition & Example. Scribbr. Retrieved April 22, 2024, from https://www.scribbr.com/research-bias/undercoverage-bias/

Sources

Eckman, S., & Kreuter, F. (2013). Undercoverage Rates and Undercoverage Bias in Traditional Housing Unit Listing. Sociological Methods & Research, 42(3), 264–293. https://doi.org/10.1177/0049124113500477

Kassiani Nikolopoulou

Kassiani has an academic background in Communication, Bioeconomy and Circular Economy. As a former journalist she enjoys turning complex scientific information into easily accessible articles to help students. She specializes in writing about research methods and research bias.