Population vs sample: what’s the difference?
A population is the entire group that you want to draw conclusions about.
A sample is the specific group that you will collect data from. The size of the sample is always less than the total size of the population.
In research, a population doesn’t always refer to people. It can mean a group containing elements of anything you want to study, such as objects, events, organizations, countries, species, organisms, etc.
|Population|Sample|
|Advertisements for IT jobs in the Netherlands|The top 50 search results for advertisements for IT jobs in the Netherlands on May 1, 2020|
|Songs from the Eurovision Song Contest|Winning songs from the Eurovision Song Contest that were performed in English|
|Undergraduate students in the Netherlands|300 undergraduate students from three Dutch universities who volunteer for your psychology research study|
|All countries of the world|Countries with published data available on birth rates and GDP since 2000|
Collecting data from a population
Populations are used when your research question requires, or when you have access to, data from every member of the population.
Usually, it is only straightforward to collect data from a whole population when it is small, accessible and cooperative.
For larger and more dispersed populations, it is often difficult or impossible to collect data from every individual. For example, every 10 years, the US federal government aims to count every person living in the country using the US Census. This data is used to distribute funding across the nation.
However, marginalized and low-income groups have historically been difficult to locate, contact and persuade to participate. Because of non-responses, the population count is incomplete and biased towards some groups, which results in disproportionate funding across the country.
In cases like this, sampling can be used to make more precise inferences about the population.
Collecting data from a sample
When your population is large, geographically dispersed, or difficult to contact, it’s necessary to use a sample. You can use sample data to make estimates or test hypotheses about population data.
Ideally, a sample should be randomly selected and representative of the population. Using probability sampling methods (such as simple random sampling or stratified sampling) reduces the risk of sampling bias and enhances both internal and external validity.
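The two probability sampling methods named above can be sketched in a few lines of Python. This is only an illustration under assumed data: the population of 1,000 students at universities "A", "B" and "C", and the function names `simple_random_sample` and `stratified_sample`, are hypothetical and not part of any study described here.

```python
import random
from collections import defaultdict


def simple_random_sample(population, n, seed=0):
    """Draw n members without replacement; every member is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def stratified_sample(population, stratum_of, n, seed=0):
    """Draw from each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    sample = []
    for members in strata.values():
        # Proportional allocation; with rounding, the total can differ slightly from n.
        k = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: (university, student id) pairs.
population = (
    [("A", i) for i in range(500)]
    + [("B", i) for i in range(300)]
    + [("C", i) for i in range(200)]
)

srs = simple_random_sample(population, 100)
strat = stratified_sample(population, lambda student: student[0], 100)
```

In the simple random sample, the share of each university varies by chance from draw to draw. In the stratified sample, the shares match the population by construction (here 50, 30 and 20 students).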
If your research is less concerned with generalizability, you can also use non-probability sampling methods. Non-probability samples are chosen for specific criteria; they may be more convenient or cheaper to access. Because of non-random selection methods, you can’t make valid statistical inferences about the broader population.
Reasons for sampling
- Necessity: Sometimes it’s simply not possible to study the whole population due to its size or inaccessibility.
- Practicality: It’s easier and more efficient to collect data from a sample.
- Cost-effectiveness: There are fewer participant, laboratory, equipment, and researcher costs involved.
- Manageability: Storing and running statistical analyses on smaller datasets is easier and more reliable.
Population parameter vs sample statistic
When you collect data from a population or a sample, there are various measurements and numbers you can calculate from the data. A parameter is a measure that describes the whole population. A statistic is a measure that describes the sample.
You can use estimation or hypothesis testing to estimate how likely it is that a sample statistic differs from the population parameter.
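As a rough illustration of estimation, the sketch below computes a sample mean (a statistic) and an approximate 95% confidence interval for the population mean (the parameter). The data are invented: a hypothetical population of 5,000 ratings on a 1 to 7 scale. The interval uses the common normal approximation (mean ± 1.96 standard errors), which is an assumption that suits reasonably large random samples.

```python
import math
import random
import statistics

rng = random.Random(1)

# Hypothetical population: 5,000 ratings on a 1-7 scale.
population = [rng.randint(1, 7) for _ in range(5000)]
parameter = statistics.mean(population)  # population mean: a parameter

# Simple random sample of 300 members.
sample = rng.sample(population, 300)
statistic = statistics.mean(sample)  # sample mean: a statistic

# Standard error of the mean, estimated from the sample alone.
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))

# Approximate 95% confidence interval (normal approximation).
ci_low = statistic - 1.96 * standard_error
ci_high = statistic + 1.96 * standard_error

print(f"parameter (usually unknown): {parameter:.2f}")
print(f"statistic: {statistic:.2f}, 95% CI: [{ci_low:.2f}, {ci_high:.2f}]")
```

In real research the parameter is usually unknown; it is computed here only because the population is simulated, which makes it possible to compare it with the interval.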
A sampling error is the difference between a population parameter and a sample statistic. For example, if you survey a sample of undergraduate students in the Netherlands about their political attitudes, the sampling error is the difference between the mean political attitude rating of your sample and the true mean political attitude rating of all undergraduate students in the Netherlands.
Sampling errors happen even when you use a randomly selected sample. This is because random samples are not identical to the population in terms of numerical measures like means and standard deviations.
Because the aim of scientific research is to generalize findings from the sample to the population, you want the sampling error to be low. You can reduce sampling error by increasing the sample size.
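The effect of sample size on sampling error can be checked with a small simulation. The sketch below is an illustration only: the population of 10,000 ratings is generated artificially, and the helper `mean_abs_sampling_error` is a hypothetical name. For each sample size, it draws many random samples and averages the absolute difference between the sample mean and the population mean.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population: 10,000 attitude ratings from a bell-shaped distribution.
population = [rng.gauss(5.0, 2.0) for _ in range(10_000)]
parameter = statistics.mean(population)  # true population mean


def mean_abs_sampling_error(n, repeats=200):
    """Average |sample mean - population mean| over repeated random samples of size n."""
    errors = []
    for _ in range(repeats):
        sample = rng.sample(population, n)
        errors.append(abs(statistics.mean(sample) - parameter))
    return statistics.mean(errors)


errors_by_size = {n: mean_abs_sampling_error(n) for n in (10, 100, 1000)}
for n, err in errors_by_size.items():
    print(f"n={n:>4}: average absolute sampling error = {err:.3f}")
```

The average error shrinks as the sample grows, roughly in proportion to one over the square root of the sample size, so each further reduction in error requires a much larger sample.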
Frequently asked questions about samples and populations
- Why are samples used in research?
Samples are used to make inferences about populations. Samples are easier to collect data from because they are practical, cost-effective, convenient and manageable.
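- When are populations used in research?
Populations are used when a research question requires data from every member of the population. This is usually only feasible when the population is small, accessible and cooperative.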
- What’s the difference between a statistic and a parameter?
A statistic refers to measures about the sample, while a parameter refers to measures about the population.
- What is sampling error?
A sampling error is the difference between a population parameter and a sample statistic.