A step-by-step guide to data collection

Data collection is a systematic process of gathering observations or measurements. Whether you are performing research for business, governmental or academic purposes, data collection allows you to gain first-hand knowledge and original insights into your research problem.

While methods and aims may differ between fields, the overall process of data collection remains largely the same. Before you begin collecting data, you need to consider:

  • The aim of the research
  • The type of data that you will collect
  • The methods and procedures you will use to collect, store, and process the data

To collect high-quality data that is relevant to your purposes, follow these four steps.

Step 1: Define the aim of your research

Before you start the process of data collection, you need to identify exactly what you want to achieve. You can start by writing a problem statement: what is the practical or scientific issue that you want to address and why does it matter?

Next, formulate one or more research questions that precisely define what you want to find out. Depending on your research questions, you might need to collect quantitative or qualitative data:

  • Quantitative data is expressed in numbers and graphs and is analyzed through statistical methods.
  • Qualitative data is expressed in words and analyzed through interpretations and categorizations.

If your aim is to test a hypothesis, measure something precisely, or gain large-scale statistical insights, collect quantitative data. If your aim is to explore ideas, understand experiences, or gain detailed insights into a specific context, collect qualitative data. If you have several aims, you can use a mixed-methods approach that collects both types of data.

Examples of quantitative and qualitative research aims
You are researching employee perceptions of their direct managers in a large organization.

  • Your first aim is to assess whether there are significant differences in perceptions of managers across different departments and office locations.
  • Your second aim is to gather meaningful feedback from employees to explore new ideas for how managers can improve.

You decide to use a mixed-methods approach to collect both quantitative and qualitative data.

Step 2: Choose your data collection method

Based on the data you want to collect, decide which method is best suited for your research.

  • Experimental research is primarily a quantitative method.
  • Interviews/focus groups and ethnography are qualitative methods.
  • Surveys, observations, archival research and secondary data collection can be quantitative or qualitative methods.

Carefully consider what method you will use to gather data that helps you directly answer your research questions.

Data collection methods
Method | When to use | How to collect data
Experiment | To test a causal relationship. | Manipulate variables and measure their effects on others.
Survey | To understand the general characteristics or opinions of a group of people. | Distribute a list of questions to a sample online, in person or over-the-phone.
Interview/focus group | To gain an in-depth understanding of perceptions or opinions on a topic. | Verbally ask participants open-ended questions in individual interviews or focus group discussions.
Observation | To understand something in its natural setting. | Measure or survey a sample without trying to affect them.
Ethnography | To study the culture of a community or organization first-hand. | Join and participate in a community and record your observations and reflections.
Archival research | To understand current or historical events, conditions or practices. | Access manuscripts, documents or records from libraries, depositories or the internet.
Secondary data collection | To analyze data from populations that you can’t access first-hand. | Find existing datasets that have already been collected, from sources such as government agencies or research organizations.

Step 3: Plan your data collection procedures

When you know which method(s) you are using, you need to plan exactly how you will implement them. What procedures will you follow to make accurate observations or measurements of the variables you are interested in?

For instance, if you’re conducting surveys or interviews, decide what form the questions will take; if you’re conducting an experiment, make decisions about your experimental design.

Operationalization

Sometimes your variables can be measured directly: for example, you can collect data on the average age of employees simply by asking for dates of birth. However, often you’ll be interested in collecting data on more abstract concepts or variables that can’t be directly observed.
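As a minimal sketch of a directly measurable variable, the snippet below computes an average age from dates of birth. The dates, the reference date, and the function names are hypothetical and chosen only for illustration.

```python
from datetime import date


def age_on(birth: date, on: date) -> int:
    """Return the number of completed years between birth and the reference date."""
    # Subtract one year if the birthday has not yet occurred in the reference year.
    had_birthday = (on.month, on.day) >= (birth.month, birth.day)
    return on.year - birth.year - (0 if had_birthday else 1)


def average_age(births: list[date], on: date) -> float:
    """Mean age in completed years across all dates of birth."""
    return sum(age_on(b, on) for b in births) / len(births)


# Hypothetical dates of birth collected from three employees.
births = [date(1990, 5, 17), date(1985, 11, 2), date(2000, 1, 30)]
reference = date(2024, 6, 1)

print(average_age(births, reference))
```

Asking for dates of birth rather than ages lets you recompute age for any reference date, which helps if data collection runs over a long period.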

Operationalization means turning abstract conceptual ideas into measurable observations. When planning how you will collect data, you need to translate the conceptual definition of what you want to study into the operational definition of what you will actually measure.

Example of operationalization
You have decided to use surveys to collect quantitative data. The concept you want to measure is the leadership of managers. You operationalize this concept in two ways:

  • You ask managers to rate their own leadership skills on 5-point scales assessing the ability to delegate, decisiveness and dependability.
  • You ask their direct employees to provide anonymous feedback on the managers regarding the same topics.

Using multiple ratings of a single concept can help you cross-check your data and assess the validity of your measures.
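The cross-check described in this example could be sketched as follows. Averaging the three items into a single leadership score is an assumption made here for illustration, not something the survey design above prescribes, and all ratings are invented.

```python
from statistics import mean

# The three topics rated on 5-point scales in the example.
ITEMS = ("delegation", "decisiveness", "dependability")


def leadership_score(rating: dict[str, int]) -> float:
    """Average the item ratings into one score (an illustrative assumption)."""
    for item in ITEMS:
        if not 1 <= rating[item] <= 5:
            raise ValueError(f"{item} must be on a 1-5 scale, got {rating[item]}")
    return mean(rating[item] for item in ITEMS)


# Hypothetical data: one manager's self-rating and two anonymous employee ratings.
self_rating = {"delegation": 4, "decisiveness": 5, "dependability": 4}
employee_ratings = [
    {"delegation": 3, "decisiveness": 4, "dependability": 4},
    {"delegation": 2, "decisiveness": 4, "dependability": 3},
]

self_score = leadership_score(self_rating)
employee_score = mean(leadership_score(r) for r in employee_ratings)

# A large gap suggests the two measures do not capture the concept in the same way.
gap = self_score - employee_score
print(f"self={self_score:.2f} employees={employee_score:.2f} gap={gap:.2f}")
```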

Sampling

You may need to develop a sampling plan to obtain data systematically. This involves defining a population, the group you want to draw conclusions about, and a sample, the group you will actually collect data from.

Your sampling method will determine how you recruit participants or obtain measurements for your study. To decide on a sampling method you will need to consider factors like the required sample size, accessibility of the sample, and timeframe of the data collection.
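To make the idea of a sampling plan concrete, here is a small sketch of two common approaches: simple random sampling and proportional stratified sampling. The population, the department names, and the 10% sampling fraction are hypothetical, and other sampling methods may suit your study better.

```python
import random
from collections import defaultdict


def simple_random_sample(population, n, seed=0):
    """Draw n units without replacement; a fixed seed keeps the draw reproducible."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def stratified_sample(population, key, fraction, seed=0):
    """Sample the same fraction from every stratum defined by key(unit)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in population:
        strata[key(unit)].append(unit)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        # Take at least one unit so that small strata are still represented.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population of 100 employees in three departments.
departments = ["sales"] * 60 + ["engineering"] * 30 + ["hr"] * 10
population = [{"id": i, "department": d} for i, d in enumerate(departments)]

sample = stratified_sample(population, key=lambda e: e["department"], fraction=0.1)
print(len(sample))
```

Stratifying by department guarantees that each department appears in the sample in proportion to its size, which a simple random draw does not.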

Standardizing procedures

If multiple researchers are involved, write a detailed manual to standardize data collection procedures in your study.

This means laying out specific step-by-step instructions so that everyone in your research team collects data in a consistent way – for example, by conducting experiments under the same conditions and using objective criteria to record and categorize observations.

This helps ensure the reliability of your data, and you can also use it to replicate the study in the future.
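One way to support a written manual is to encode its recording criteria as a codebook that every recorded observation is checked against. The sketch below assumes a hypothetical codebook with two fields; the field names and allowed values are illustrative only.

```python
# Hypothetical codebook: each field and the values researchers are allowed to record.
CODEBOOK = {
    "condition": {"control", "treatment"},
    "response": {"agree", "neutral", "disagree"},
}


def validate_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record follows the codebook."""
    problems = []
    for field, allowed in CODEBOOK.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif record[field] not in allowed:
            problems.append(f"invalid {field}: {record[field]!r}")
    return problems


print(validate_record({"condition": "control", "response": "agree"}))
print(validate_record({"condition": "placebo"}))
```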

Creating a data management plan

Before beginning data collection, you should also decide how you will organize and store your data.

  • If you are collecting data from people, you will likely need to anonymize and safeguard the data to prevent leaks of sensitive information (e.g. names or identity numbers).
  • If you are collecting data via interviews or pencil-and-paper formats, you will need to perform transcriptions or data entry in systematic ways to minimize distortion.
  • You can prevent loss of data by keeping it in an organized system that is routinely backed up.
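As a rough sketch of the first point, the snippet below drops direct identifiers from a record and replaces the identity number with a keyed code. Strictly speaking this is pseudonymization rather than full anonymization, because anyone holding the key can regenerate the codes. The field names and the key are hypothetical, and a real project should follow its own ethics and data protection requirements.

```python
import hashlib
import hmac

# Fields treated as direct identifiers in this illustration.
DIRECT_IDENTIFIERS = {"name", "id_number"}


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Derive a stable participant code from an identifier using a keyed hash."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def strip_identifiers(record: dict, secret_key: bytes) -> dict:
    """Remove direct identifiers and add a participant code in their place."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["participant_code"] = pseudonymize(record["id_number"], secret_key)
    return cleaned


# Hypothetical key; in practice, store it separately from the dataset.
SECRET_KEY = b"replace-with-a-key-stored-separately"

raw = {"name": "Jane Doe", "id_number": "8001015009087", "rating": 4}
safe = strip_identifiers(raw, SECRET_KEY)
print(sorted(safe))
```

Because the same identifier always maps to the same code, you can still link several responses from one participant without storing their name.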

Step 4: Collect the data

Finally, you can implement your chosen methods to measure or observe the variables you are interested in.

Examples of collecting qualitative and quantitative data
To collect data about perceptions of managers, you administer a survey with closed- and open-ended questions to a sample of 300 company employees across different departments and locations.

The closed-ended questions ask participants to rate their manager’s leadership skills on scales from 1–5. The data produced is numerical and can be statistically analyzed for averages and patterns.

The open-ended questions ask participants for examples of what the manager is doing well now and what they can do better in the future. The data produced is qualitative and can be categorized through content analysis for further insights.
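The two kinds of analysis in this example can be sketched in a few lines. All responses below are invented, and the theme labels stand in for codes that a researcher would assign by hand during content analysis.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical closed-ended responses: one 1-5 rating per employee.
responses = [
    {"department": "sales", "rating": 4},
    {"department": "sales", "rating": 2},
    {"department": "engineering", "rating": 5},
    {"department": "engineering", "rating": 4},
    {"department": "engineering", "rating": 3},
]


def mean_rating_by_group(rows, group_key):
    """Average the ratings within each group, for example per department."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row["rating"])
    return {group: mean(values) for group, values in groups.items()}


# Hypothetical themes assigned to open-ended answers by a human coder.
coded_themes = ["communication", "feedback", "communication", "workload"]
theme_counts = Counter(coded_themes)

print(mean_rating_by_group(responses, "department"))
print(theme_counts.most_common(1))
```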

To ensure that high-quality data is recorded in a systematic way, here are some best practices:

  • Record all relevant information as and when you obtain data. For example, note down whether or how lab equipment is recalibrated during an experimental study.
  • Double-check manual data entry for errors.
  • If you collect quantitative data, you can assess its reliability and validity to get an indication of your data quality.
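One possible way to double-check manual data entry is double entry: two people type in the same forms independently and the two versions are compared. The sketch below assumes both versions list the records in the same order; the field names and values are hypothetical.

```python
def entry_mismatches(first_pass, second_pass):
    """Compare two independent entries of the same records, field by field."""
    if len(first_pass) != len(second_pass):
        raise ValueError("both passes must contain the same number of records")
    mismatches = []
    for row_index, (a, b) in enumerate(zip(first_pass, second_pass)):
        # Sort the field names so the report order is deterministic.
        for field in sorted(a.keys() | b.keys()):
            if a.get(field) != b.get(field):
                mismatches.append((row_index, field, a.get(field), b.get(field)))
    return mismatches


# Hypothetical entries typed in by two different people.
first = [{"id": 1, "rating": 4}, {"id": 2, "rating": 3}]
second = [{"id": 1, "rating": 4}, {"id": 2, "rating": 5}]

for row_index, field, a, b in entry_mismatches(first, second):
    print(f"record {row_index}: {field} entered as {a} and {b}")
```

Each reported mismatch should be resolved by going back to the original paper form rather than by guessing which entry is right.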

Frequently asked questions about data collection

What is data collection?

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organizations.

What are the benefits of collecting data?

When conducting research, collecting original data has a significant advantage: it gives you first-hand knowledge and original insights into your research problem.

However, there are also some drawbacks: data collection can be time-consuming, labor-intensive and expensive. In some cases, it’s more efficient to use secondary data that has already been collected by someone else, but the data might be less reliable.

What’s the difference between quantitative and qualitative methods?

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to test a hypothesis by systematically collecting and analyzing data, while qualitative methods allow you to explore ideas and experiences in depth.

What’s the difference between reliability and validity?

Reliability and validity are both about how well a method measures something:

  • Reliability refers to the consistency of a measure (whether the results can be reproduced under the same conditions).
  • Validity refers to the accuracy of a measure (whether the results really do represent what they are supposed to measure).
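As an illustration of consistency, one common (though not the only) reliability check is test-retest correlation: the same participants are measured twice under the same conditions and the two sets of scores are correlated. The scores below are invented, and this sketch says nothing about validity, which cannot be established from a correlation of this kind alone.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical scores for five participants measured on two occasions.
first_measurement = [3, 4, 2, 5, 4]
second_measurement = [3, 5, 2, 4, 4]

r = pearson_r(first_measurement, second_measurement)
print(f"test-retest correlation: {r:.2f}")
```

A correlation close to 1 indicates that participants kept roughly the same relative position across the two measurements.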

If you are doing experimental research, you also have to consider the internal and external validity of your experiment.

What is operationalization?

Operationalization means turning abstract conceptual ideas into measurable observations.

For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioral avoidance of crowded places, or physical anxiety symptoms in social situations.

Before collecting data, it’s important to consider how you will operationalize the variables that you want to measure.

Pritha Bhandari

Pritha has an academic background in English, psychology and cognitive neuroscience. As an interdisciplinary researcher, she enjoys writing articles explaining tricky research concepts for students and academics.
