Population vs sample: what’s the difference?

Population vs sample

A population is the entire group that you want to draw conclusions about.

A sample is the specific group that you will collect data from. The size of the sample is always less than the total size of the population.

In research, a population doesn’t always refer to people. It can mean a group containing elements of anything you want to study, such as objects, events, organizations, countries, species, organisms, etc.

Population vs sample
Population | Sample
Advertisements for IT jobs in the Netherlands | The top 50 search results for advertisements for IT jobs in the Netherlands on May 1, 2020
Songs from the Eurovision Song Contest | Winning songs from the Eurovision Song Contest that were performed in English
Undergraduate students in the Netherlands | 300 undergraduate students from three Dutch universities who volunteer for your psychology research study
All countries of the world | Countries with published data available on birth rates and GDP since 2000

Collecting data from a population

Populations are used when your research question requires, or when you have access to, data from every member of the population.

Usually, it is only straightforward to collect data from a whole population when it is small, accessible and cooperative.

Example: Collecting data from a population
A high school administrator wants to analyze the final exam scores of all graduating seniors to see if there is a trend. Since they are only interested in applying their findings to the graduating seniors in this high school, they use the whole population dataset.

For larger and more dispersed populations, it is often difficult or impossible to collect data from every individual. For example, every 10 years, the US federal government aims to count every person living in the country using the US Census. This data is used to distribute funding across the nation.

However, marginalized and low-income groups have historically been difficult to locate, contact and encourage to participate. Because of non-response, the population count is incomplete and biased towards some groups, which results in disproportionate funding across the country.

In cases like this, sampling can be used to make more precise inferences about the population.
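To see how non-response can skew a full count, here is a minimal sketch with two hypothetical groups. The group names, sizes and response rates are invented for illustration and are not real census figures.

```python
# Hypothetical group sizes and response rates (illustrative only, not census data).
true_counts = {"group_a": 800_000, "group_b": 200_000}
response_rates = {"group_a": 0.95, "group_b": 0.70}

# People actually counted: only those who respond.
counted = {g: true_counts[g] * response_rates[g] for g in true_counts}


def shares(counts):
    """Return each group's share of the total."""
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}


true_share = shares(true_counts)
counted_share = shares(counted)

for g in true_counts:
    print(f"{g}: true share {true_share[g]:.1%}, counted share {counted_share[g]:.1%}")
# group_a: true share 80.0%, counted share 84.4%
# group_b: true share 20.0%, counted share 15.6%
```

In this sketch, the group that responds less often appears smaller than it really is, so any funding allocated in proportion to the count would be skewed against it.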

Collecting data from a sample

When your population is large, geographically dispersed, or difficult to contact, it’s necessary to use a sample. You can use sample data to make estimates or test hypotheses about population data.

Example: Collecting data from a sample
You want to study political attitudes in young people. Your population is the 300,000 undergraduate students in the Netherlands. Because it’s not practical to collect data from all of them, you use a sample of 300 undergraduate volunteers from three Dutch universities – this is the group who will complete your online survey.

Ideally, a sample should be randomly selected and representative of the population. Using probability sampling methods (such as simple random sampling or stratified sampling) reduces the risk of sampling bias and strengthens external validity, that is, how well your findings generalize to the population.
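The two probability sampling methods named above can be sketched in a few lines of Python. This is a minimal illustration, not a full sampling procedure: the list of students, the university labels A, B and C, and the group sizes are all hypothetical, and the helper names `simple_random_sample` and `stratified_sample` are made up for this example.

```python
import random
from collections import Counter


def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every member has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def stratified_sample(population, stratum_of, n, seed=None):
    """Draw a random sample within each stratum, sized in proportion to the stratum."""
    rng = random.Random(seed)
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        # Proportional allocation; rounding can make the total differ slightly from n.
        k = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical sampling frame of (university, student_id) pairs; sizes are invented.
students = (
    [("A", i) for i in range(1500)]
    + [("B", i) for i in range(900)]
    + [("C", i) for i in range(600)]
)

srs = simple_random_sample(students, 300, seed=1)
strat = stratified_sample(students, lambda s: s[0], 300, seed=1)

print(Counter(u for u, _ in srs))    # counts per university vary from draw to draw
print(Counter(u for u, _ in strat))  # always 150 from A, 90 from B, 60 from C
```

With simple random sampling, the number of students drawn from each university fluctuates by chance. Stratified sampling fixes those numbers in advance, which guarantees that each subgroup is represented in proportion to its size.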

If your research is less concerned with generalizability, you can also use non-probability sampling methods. Non-probability samples are chosen based on specific criteria; they may be more convenient or cheaper to access. Because selection is not random, you can’t make valid statistical inferences about the broader population.

Reasons for sampling

  • Necessity: Sometimes it’s simply not possible to study the whole population due to its size or inaccessibility.
  • Practicality: It’s easier and more efficient to collect data from a sample.
  • Cost-effectiveness: There are fewer participant, laboratory, equipment, and researcher costs involved.
  • Manageability: Storing and running statistical analyses on smaller datasets is easier and more reliable.


Population parameter vs sample statistic

When you collect data from a population or a sample, there are various measurements and numbers you can calculate from the data. A parameter is a measure that describes the whole population. A statistic is a measure that describes the sample.

You can use estimation or hypothesis testing to judge how far a sample statistic is likely to be from the population parameter.

Research example: Parameters and statistics
In your study of students’ political attitudes, you ask your survey participants to rate themselves on a scale from 1, very liberal, to 7, very conservative. You find that most of your sample identifies as liberal – the mean rating on the political attitudes scale is 3.2.

You can use this statistic, the sample mean of 3.2, to make a scientific guess about the population parameter – that is, to infer the mean political attitude rating of all undergraduate students in the Netherlands.
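As a rough illustration of this kind of estimate, the sketch below builds a synthetic population of 300,000 ratings, draws a simple random sample of 300, and computes an approximate 95% confidence interval for the population mean. The rating weights are invented, and the interval uses a normal approximation that assumes random selection, which a volunteer sample like the one in the example would not strictly satisfy.

```python
import math
import random
import statistics

rng = random.Random(42)

# Synthetic population: 300,000 ratings on the 1-7 scale (weights are invented).
weights = [20, 25, 20, 15, 10, 6, 4]
population = rng.choices(range(1, 8), weights=weights, k=300_000)

# Parameter: describes the whole population (usually unknown in real research).
parameter = statistics.mean(population)

# Statistic: describes the sample you actually collected.
sample = rng.sample(population, 300)
statistic = statistics.mean(sample)

# Approximate 95% confidence interval for the population mean.
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
margin = 1.96 * standard_error
ci_low, ci_high = statistic - margin, statistic + margin

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
print(f"approximate 95% CI:          {ci_low:.2f} to {ci_high:.2f}")
```

In real research you only see the sample, so the interval is how you express uncertainty about the parameter. Here the parameter is known only because the population was simulated.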

Sampling error

A sampling error is the difference between a population parameter and a sample statistic. In your study, the sampling error is the difference between the mean political attitude rating of your sample and the true mean political attitude rating of all undergraduate students in the Netherlands.

Sampling errors happen even when you use a randomly selected sample. This is because random samples are not identical to the population in terms of numerical measures like means and standard deviations.

Because the aim of scientific research is to generalize findings from the sample to the population, you want the sampling error to be low. You can reduce sampling error by increasing the sample size.
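A small simulation can show the effect of sample size on sampling error. The sketch below uses a synthetic population of 100,000 ratings (invented data) and, for several sample sizes, averages the absolute difference between the sample mean and the population mean over repeated random samples. The function name `mean_abs_sampling_error` and the choice of 200 repetitions are assumptions made for this example.

```python
import random

rng = random.Random(0)

# Synthetic population of ratings on a 1-7 scale (invented data).
population = [rng.randint(1, 7) for _ in range(100_000)]
parameter = sum(population) / len(population)


def mean_abs_sampling_error(n, repeats=200):
    """Average |sample mean - population mean| over repeated random samples of size n."""
    errors = []
    for _ in range(repeats):
        sample = rng.sample(population, n)
        errors.append(abs(sum(sample) / n - parameter))
    return sum(errors) / repeats


results = {n: mean_abs_sampling_error(n) for n in (30, 300, 3000)}
for n, err in results.items():
    print(f"n = {n:>4}: average sampling error = {err:.3f}")
```

The average error shrinks as the sample grows, but with diminishing returns: it falls roughly in proportion to the square root of the sample size, so a tenfold increase in sample size cuts the error by about a factor of three.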


Frequently asked questions about samples and populations

Why are samples used in research?

Samples are used to make inferences about populations. Samples are easier to collect data from because they are practical, cost-effective, convenient and manageable.

When are populations used in research?

Populations are used when a research question requires data from every member of the population. This is usually only feasible when the population is small and easily accessible.

What’s the difference between a statistic and a parameter?

A statistic is a measure that describes a sample, while a parameter is a measure that describes the whole population.

What is sampling error?

A sampling error is the difference between a population parameter and a sample statistic.

Pritha Bhandari

Pritha has an academic background in English, psychology and cognitive neuroscience. As an interdisciplinary researcher, she enjoys writing articles explaining tricky research concepts for students and academics.
