How to do thematic analysis
Thematic analysis is a method of analyzing qualitative data. It is usually applied to a set of texts, such as interview transcripts. The researcher closely examines the data to identify common themes – topics, ideas and patterns of meaning that come up repeatedly.
There are various approaches to conducting thematic analysis, but the most common form follows a six-step process: familiarization, coding, generating themes, reviewing themes, defining and naming themes, and writing up.
This process was originally developed for psychology research by Virginia Braun and Victoria Clarke. However, thematic analysis is a flexible method that can be adapted to many different kinds of research.
When to use thematic analysis
Thematic analysis is a good approach to research where you’re trying to find out something about people’s views, opinions, knowledge, experiences or values from a set of qualitative data – for example, interview transcripts, social media profiles, or survey responses.
Some types of research questions you might use thematic analysis to answer:
- How do patients perceive doctors in a hospital setting?
- What are young women’s experiences on dating sites?
- What are non-experts’ ideas and opinions about climate change?
- How is gender constructed in high school history teaching?
To answer any of these questions, you would collect data from a group of relevant participants and then analyze it. Thematic analysis gives you a lot of flexibility in interpreting the data, and it makes large data sets easier to approach by sorting them into broad themes.
However, it also involves the risk of missing nuances in the data. Thematic analysis is often quite subjective and relies on the researcher’s judgement, so you have to reflect carefully on your own choices and interpretations.
Pay close attention to the data to ensure that you’re not picking up on things that are not there – or obscuring things that are.
Different approaches to thematic analysis
Once you’ve decided to use thematic analysis, there are different approaches to consider.
There’s the distinction between inductive and deductive approaches:
- An inductive approach involves allowing the data to determine your themes.
- A deductive approach involves coming to the data with some preconceived themes you expect to find reflected there, based on theory or existing knowledge.
Ask yourself: Does my theoretical framework give me a strong idea of what kind of themes I expect to find in the data (deductive), or am I planning to develop my own framework based on what I find (inductive)?
There’s also the distinction between a semantic and a latent approach:
- A semantic approach involves analyzing the explicit content of the data.
- A latent approach involves reading into the subtext and assumptions underlying the data.
Ask yourself: Am I interested in people’s stated opinions (semantic) or in what their statements reveal about their assumptions and social context (latent)?
After you’ve decided thematic analysis is the right method for analyzing your data, and you’ve thought about the approach you’re going to take, you can follow the six steps developed by Braun and Clarke.
Step 1: Familiarization
The first step is to get to know our data. It’s important to get a thorough overview of all the data we collected before we start analyzing individual items.
This might involve transcribing audio, reading through the text and taking initial notes, and generally looking through the data to get familiar with it.
Step 2: Coding
Next up, we need to code the data. Coding means highlighting sections of our text – usually phrases or sentences – and coming up with shorthand labels or “codes” to describe their content.
Let’s take a short example text. Say we’re researching perceptions of climate change among conservative voters aged 50 and up, and we have collected data through a series of interviews. An extract from one interview looks like this:
Personally, I’m not sure. I think the climate is changing, sure, but I don’t know why or how. People say you should trust the experts, but who’s to say they don’t have their own reasons for pushing this narrative? I’m not saying they’re wrong, I’m just saying there’s reasons not to 100% trust them. The facts keep changing – it used to be called global warming.
In this extract, we’ve highlighted various phrases in different colors corresponding to different codes. Each code describes the idea or feeling expressed in that part of the text.
At this stage, we want to be thorough: we go through the transcript of every interview and highlight everything that jumps out as relevant or potentially interesting. As well as highlighting all the phrases and sentences that match these codes, we can keep adding new codes as we go through the text.
After we’ve been through the text, we collate all the data into groups identified by code. These codes allow us to gain a condensed overview of the main points and common meanings that recur throughout the data.
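If the coded extracts are kept in a spreadsheet or script rather than as highlighted paper transcripts, the collation step can be sketched in a few lines of Python. This is only a minimal illustration: the interview ID, the function name `collate_by_code`, and the code assigned to each phrase are our own assumptions, since the original highlighting is not reproduced here.

```python
from collections import defaultdict

# Each coded segment is (interview id, highlighted text, code).
# The code assigned to each phrase is an illustrative assumption.
coded_segments = [
    ("interview_01", "I'm not sure", "uncertainty"),
    ("interview_01", "I don't know why or how", "uncertainty"),
    ("interview_01", "who's to say they don't have their own reasons",
     "distrust of experts"),
    ("interview_01", "it used to be called global warming",
     "changing terminology"),
]


def collate_by_code(segments):
    """Group highlighted extracts under the code they were labelled with."""
    groups = defaultdict(list)
    for interview_id, text, code in segments:
        groups[code].append((interview_id, text))
    return dict(groups)


collated = collate_by_code(coded_segments)
for code, extracts in collated.items():
    print(f"{code}: {len(extracts)} extract(s)")
```

Keeping the interview ID next to each extract makes it easy to return to the surrounding context later, when the themes are reviewed against the data.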
Step 3: Generating themes
Next, we look over the codes we’ve created, identify patterns among them, and start coming up with themes.
Themes are generally broader than codes. Most of the time, you’ll combine several codes into a single theme. In our example, we might start combining codes into themes like this:
[Table of codes grouped into themes; the only recoverable entry is the theme “Distrust of experts”.]
At this stage, we might decide that some of our codes are too vague or not relevant enough (for example, because they don’t appear very often in the data), so they can be discarded.
Other codes might become themes in their own right. In our example, we decided that the code “uncertainty” made sense as a theme, with some other codes incorporated into it.
Again, what we decide will vary according to what we’re trying to find out. We want to create potential themes that tell us something helpful about the data for our purposes.
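One way to keep track of this grouping is a simple mapping from each theme to the codes it contains. The sketch below is hypothetical: the grouping shown is only one possible starting point, "misc remarks" is an invented placeholder for a code judged too vague to keep, and the helper names are our own.

```python
# Codes produced in the coding step. "misc remarks" is a made-up
# placeholder standing in for a code that is too vague to keep.
codes = [
    "uncertainty",
    "changing terminology",
    "distrust of experts",
    "misc remarks",
]

# A first, provisional grouping of codes into themes (an assumption).
themes = {
    "uncertainty": ["uncertainty"],
    "distrust of experts": ["distrust of experts", "changing terminology"],
}


def build_lookup(themes):
    """Invert the theme mapping so each code points to exactly one theme."""
    lookup = {}
    for theme, members in themes.items():
        for code in members:
            if code in lookup:
                raise ValueError(f"code {code!r} is assigned to two themes")
            lookup[code] = theme
    return lookup


def unassigned_codes(codes, themes):
    """List codes that belong to no theme: candidates for discarding."""
    lookup = build_lookup(themes)
    return [code for code in codes if code not in lookup]


lookup = build_lookup(themes)
discarded = unassigned_codes(codes, themes)
print(lookup)
print(discarded)
```

Raising an error when a code appears under two themes is a design choice for this sketch. Some analyses deliberately allow overlap, in which case that check would be dropped.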
Step 4: Reviewing themes
Now we have to make sure that our themes are useful and accurate representations of the data. Here, we return to the data set and compare our themes against it. Are we missing anything? Are these themes really present in the data? What can we change to make our themes work better?
If we encounter problems with our themes, we might split them up, combine them, discard them or create new ones: whatever makes them more useful and accurate.
For example, we might decide upon looking through the data that “changing terminology” fits better under the “uncertainty” theme than under “distrust of experts,” since the data labelled with this code involves confusion, not necessarily distrust.
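If the themes are stored as a mapping from theme name to a list of codes, a revision like this one amounts to moving a code from one list to another. The following is a small sketch under that assumption. The function name `move_code` and the starting grouping are hypothetical.

```python
def move_code(themes, code, source, target):
    """Return a new theme mapping with `code` moved from `source` to `target`.

    The original mapping is left untouched, so earlier versions of the
    themes remain available for comparison.
    """
    if code not in themes.get(source, []):
        raise ValueError(f"{code!r} is not part of theme {source!r}")
    updated = {theme: list(members) for theme, members in themes.items()}
    updated[source].remove(code)
    updated.setdefault(target, []).append(code)
    return updated


# Provisional grouping before review (an illustrative assumption).
themes = {
    "uncertainty": ["uncertainty"],
    "distrust of experts": ["distrust of experts", "changing terminology"],
}

revised = move_code(
    themes, "changing terminology", "distrust of experts", "uncertainty"
)
print(revised)
```

Returning a copy instead of editing in place keeps a record of each revision, which helps when the methodology section later has to explain how the themes changed.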
Step 5: Defining and naming themes
Now that we have a final list of themes, it’s time to name and define each of them.
Defining themes involves formulating exactly what we mean by each theme and figuring out how it helps us understand the data.
Naming themes involves coming up with a succinct and easily understandable name for each theme.
For example, we might look at “distrust of experts” and determine exactly who we mean by “experts” in this theme. We might decide that a better name for the theme is “distrust of authority” or “conspiracy thinking”.
Step 6: Writing up
Finally, we’ll write up our analysis of the data. Like any academic text, the write-up of a thematic analysis needs an introduction that establishes our research question, aims and approach.
We should also include a methodology section, describing how we collected the data (e.g. through semi-structured interviews or open-ended survey questions) and explaining how we conducted the thematic analysis itself.
The results or findings section usually addresses each theme in turn. We describe how often the themes come up and what they mean, including examples from the data as evidence. Finally, our conclusion explains the main takeaways and shows how the analysis has answered our research question.
In our example, we might argue that conspiracy thinking about climate change is widespread among older conservative voters, point out the uncertainty with which many voters view the issue, and discuss the role of misinformation in respondents’ perceptions.
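To report how often each theme comes up, one option is to count the number of interviews in which it appears at least once. The sketch below assumes the coded data is available as (interview, code) pairs. The data, the code-to-theme lookup, and the function name `theme_prevalence` are all placeholders invented for illustration, not results from a real study.

```python
from collections import Counter

# Final code-to-theme lookup after review (illustrative assumption).
code_to_theme = {
    "uncertainty": "uncertainty",
    "changing terminology": "uncertainty",
    "distrust of experts": "distrust of experts",
}

# Placeholder data: which codes were applied in which interview.
coded_segments = [
    ("interview_01", "uncertainty"),
    ("interview_01", "changing terminology"),
    ("interview_01", "distrust of experts"),
    ("interview_02", "uncertainty"),
    ("interview_03", "distrust of experts"),
    ("interview_03", "distrust of experts"),
    ("interview_04", "uncertainty"),
]


def theme_prevalence(segments, code_to_theme):
    """Count the distinct interviews in which each theme appears."""
    seen = {
        (interview, code_to_theme[code])
        for interview, code in segments
        if code in code_to_theme
    }
    return Counter(theme for _, theme in seen)


prevalence = theme_prevalence(coded_segments, code_to_theme)
for theme, n_interviews in prevalence.most_common():
    print(f"{theme}: appears in {n_interviews} interview(s)")
```

Counting interviews rather than individual extracts stops one talkative participant from inflating a theme. Such counts only support the findings: what the themes mean still has to come from the researcher's interpretation.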