How to do thematic analysis

Thematic analysis is a method of analyzing qualitative data. It is usually applied to a set of texts, such as interview transcripts. The researcher closely examines the data to identify common themes – topics, ideas and patterns of meaning that come up repeatedly.

There are various approaches to conducting thematic analysis, but the most common form follows a six-step process:

  1. Familiarization
  2. Coding
  3. Generating themes
  4. Reviewing themes
  5. Defining and naming themes
  6. Writing up

This process was originally developed for psychology research by Virginia Braun and Victoria Clarke. However, thematic analysis is a flexible method that can be adapted to many different kinds of research.

When to use thematic analysis

Thematic analysis is a good approach to research where you’re trying to find out something about people’s views, opinions, knowledge, experiences or values from a set of qualitative data – for example, interview transcripts, social media profiles, or survey responses.

Some types of research questions you might use thematic analysis to answer:

  • How do patients perceive doctors in a hospital setting?
  • What are young women’s experiences on dating sites?
  • What are non-experts’ ideas and opinions about climate change?
  • How is gender constructed in high school history teaching?

To answer any of these questions, you would collect data from a group of relevant participants and then analyze it. Thematic analysis gives you a lot of flexibility in interpreting the data, and it makes large data sets easier to approach by sorting them into broad themes.

However, it also involves the risk of missing nuances in the data. Thematic analysis is often quite subjective and relies on the researcher’s judgement, so you have to reflect carefully on your own choices and interpretations.

Pay close attention to the data to ensure that you’re not picking up on things that are not there – or obscuring things that are.

Different approaches to thematic analysis

Once you’ve decided to use thematic analysis, there are different approaches to consider.

There’s the distinction between inductive and deductive approaches:

  • An inductive approach involves allowing the data to determine your themes.
  • A deductive approach involves coming to the data with some preconceived themes you expect to find reflected there, based on theory or existing knowledge.

Ask yourself: Does my theoretical framework give me a strong idea of what kind of themes I expect to find in the data (deductive), or am I planning to develop my own framework based on what I find (inductive)?

There’s also the distinction between a semantic and a latent approach:

  • A semantic approach involves analyzing the explicit content of the data.
  • A latent approach involves reading into the subtext and assumptions underlying the data.

Ask yourself: Am I interested in people’s stated opinions (semantic) or in what their statements reveal about their assumptions and social context (latent)?

After you’ve decided thematic analysis is the right method for analyzing your data, and you’ve thought about the approach you’re going to take, you can follow the six steps developed by Braun and Clarke.

Step 1: Familiarization

The first step is to get to know our data. It’s important to get a thorough overview of all the data we collected before we start analyzing individual items.

This might involve transcribing audio, reading through the text and taking initial notes, and generally looking through the data to get familiar with it.

Step 2: Coding

Next up, we need to code the data. Coding means highlighting sections of our text – usually phrases or sentences – and coming up with shorthand labels or “codes” to describe their content.

Let’s take a short example text. Say we’re researching perceptions of climate change among conservative voters aged 50 and up, and we have collected data through a series of interviews. An extract from one interview looks like this:

Coding qualitative data

Interview extract:
Personally, I’m not sure. I think the climate is changing, sure, but I don’t know why or how. People say you should trust the experts, but who’s to say they don’t have their own reasons for pushing this narrative? I’m not saying they’re wrong, I’m just saying there’s reasons not to 100% trust them. The facts keep changing – it used to be called global warming.

Codes:
  • Uncertainty
  • Acknowledgement of climate change
  • Distrust of experts
  • Changing terminology

In this extract, we’ve highlighted various phrases in different colors corresponding to different codes. Each code describes the idea or feeling expressed in that part of the text.

At this stage, we want to be thorough: we go through the transcript of every interview and highlight everything that jumps out as relevant or potentially interesting. As well as highlighting all the phrases and sentences that match these codes, we can keep adding new codes as we go through the text.

After we’ve been through the text, we collate all the data into groups identified by code. These codes allow us to gain a condensed overview of the main points and common meanings that recur throughout the data.
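
If the coded segments are kept in a spreadsheet or script rather than on paper, the collating step can be sketched in a few lines of Python. This is only an illustration: the interview ID, the helper name collate_by_code, and the choice of which phrase from the extract carries which code are assumptions made for the example, not part of Braun and Clarke's method.

```python
from collections import defaultdict

# Coded segments as (interview_id, text, code) tuples.
# The interview ID and the phrase-to-code assignment are assumptions
# made for this sketch; the extract and code names come from the example.
coded_segments = [
    ("interview_01", "Personally, I'm not sure.", "Uncertainty"),
    ("interview_01", "I think the climate is changing, sure",
     "Acknowledgement of climate change"),
    ("interview_01", "but I don't know why or how", "Uncertainty"),
    ("interview_01", "who's to say they don't have their own reasons "
     "for pushing this narrative?", "Distrust of experts"),
    ("interview_01", "there's reasons not to 100% trust them",
     "Distrust of experts"),
    ("interview_01", "The facts keep changing – it used to be called "
     "global warming.", "Changing terminology"),
]


def collate_by_code(segments):
    """Group (interview_id, text) pairs under the code they were labelled with."""
    grouped = defaultdict(list)
    for interview_id, text, code in segments:
        grouped[code].append((interview_id, text))
    return dict(grouped)


by_code = collate_by_code(coded_segments)
for code, items in by_code.items():
    print(f"{code}: {len(items)} segment(s)")
```

Keeping the interview ID next to each segment makes it easy to go back to the original transcript when reviewing themes later.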

Step 3: Generating themes

Next, we look over the codes we’ve created, identify patterns among them, and start coming up with themes.

Themes are generally broader than codes. Most of the time, you’ll combine several codes into a single theme. In our example, we might start combining codes into themes like this:

Turning codes into themes

Theme: Uncertainty
Codes:
  • Uncertainty
  • Leave it to the experts
  • Alternative explanations

Theme: Distrust of experts
Codes:
  • Changing terminology
  • Distrust of scientists
  • Resentment toward experts
  • Fear of government control

Theme: Misinformation
Codes:
  • Incorrect facts
  • Misunderstanding of science
  • Biased media sources

At this stage, we might decide that some of our codes are too vague or not relevant enough (for example, because they don’t appear very often in the data), so they can be discarded.

Other codes might become themes in their own right. In our example, we decided that the code “uncertainty” made sense as a theme, with some other codes incorporated into it.
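
For readers who track their codes digitally, the grouping in this step can be written down as a simple mapping. The sketch below uses the theme and code names from our example; the helper name invert_themes is hypothetical, and treating each code as belonging to exactly one theme is a simplifying assumption of the sketch, not a rule of thematic analysis.

```python
# Theme -> codes, copied from the example in this step.
themes = {
    "Uncertainty": [
        "Uncertainty",
        "Leave it to the experts",
        "Alternative explanations",
    ],
    "Distrust of experts": [
        "Changing terminology",
        "Distrust of scientists",
        "Resentment toward experts",
        "Fear of government control",
    ],
    "Misinformation": [
        "Incorrect facts",
        "Misunderstanding of science",
        "Biased media sources",
    ],
}


def invert_themes(theme_to_codes):
    """Build a code -> theme lookup.

    Assumes (for this sketch only) that a code sits under one theme,
    and raises an error if the same code is listed twice.
    """
    code_to_theme = {}
    for theme, codes in theme_to_codes.items():
        for code in codes:
            if code in code_to_theme:
                raise ValueError(f"{code!r} is listed under more than one theme")
            code_to_theme[code] = theme
    return code_to_theme


code_to_theme = invert_themes(themes)
print(code_to_theme["Changing terminology"])
```

A lookup like this lets us ask, for any coded segment, which candidate theme it currently supports.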

Again, what we decide will vary according to what we’re trying to find out. We want to create potential themes that tell us something helpful about the data for our purposes.

Step 4: Reviewing themes

Now we have to make sure that our themes are useful and accurate representations of the data. Here, we return to the data set and compare our themes against it. Are we missing anything? Are these themes really present in the data? What can we change to make our themes work better?

If we encounter problems with our themes, we might split them up, combine them, discard them or create new ones: whatever makes them more useful and accurate.

For example, we might decide upon looking through the data that “changing terminology” fits better under the “uncertainty” theme than under “distrust of experts,” since the data labelled with this code involves confusion, not necessarily distrust.
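
A revision like this one is easy to record if the themes are stored as a mapping. The following is a minimal sketch, assuming the theme-to-codes grouping from our example; the helper name move_code is hypothetical.

```python
# Two of the candidate themes from the example, before review.
themes = {
    "Uncertainty": [
        "Uncertainty",
        "Leave it to the experts",
        "Alternative explanations",
    ],
    "Distrust of experts": [
        "Changing terminology",
        "Distrust of scientists",
        "Resentment toward experts",
        "Fear of government control",
    ],
}


def move_code(theme_to_codes, code, source, target):
    """Return a revised copy of the mapping with `code` moved to `target`."""
    if code not in theme_to_codes[source]:
        raise ValueError(f"{code!r} is not listed under {source!r}")
    # Copy the lists so the pre-review version stays available for comparison.
    revised = {theme: list(codes) for theme, codes in theme_to_codes.items()}
    revised[source].remove(code)
    revised[target].append(code)
    return revised


revised = move_code(
    themes, "Changing terminology",
    source="Distrust of experts", target="Uncertainty",
)
print(revised["Uncertainty"])
```

Returning a copy rather than editing in place keeps a record of how the themes looked before the review, which can be useful when describing the process in the methodology section.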

Step 5: Defining and naming themes

Now that we have a final list of themes, it’s time to name and define each of them.

Defining themes involves formulating exactly what we mean by each theme and figuring out how it helps us understand the data.

Naming themes involves coming up with a succinct and easily understandable name for each theme.

For example, we might look at “distrust of experts” and determine exactly who we mean by “experts” in this theme. We might decide that a better name for the theme is “distrust of authority” or “conspiracy thinking”.

Step 6: Writing up

Finally, we’ll write up our analysis of the data. Like all academic texts, the write-up of a thematic analysis requires an introduction to establish our research question, aims and approach.

We should also include a methodology section, describing how we collected the data (e.g. through semi-structured interviews or open-ended survey questions) and explaining how we conducted the thematic analysis itself.

The results or findings section usually addresses each theme in turn. We describe how often the themes come up and what they mean, including examples from the data as evidence. Finally, our conclusion explains the main takeaways and shows how the analysis has answered our research question.
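
One simple way to report how often themes come up is to count the number of interviews in which each theme appears at least once. The sketch below assumes that definition of frequency. The interview IDs and the codes attached to them are invented purely for illustration and are not findings; the code-to-theme lookup follows the example grouping from Step 3, and the helper name theme_prevalence is hypothetical.

```python
from collections import Counter

# Code -> theme lookup, following the example grouping from Step 3.
code_to_theme = {
    "Uncertainty": "Uncertainty",
    "Leave it to the experts": "Uncertainty",
    "Alternative explanations": "Uncertainty",
    "Changing terminology": "Distrust of experts",
    "Distrust of scientists": "Distrust of experts",
    "Resentment toward experts": "Distrust of experts",
    "Fear of government control": "Distrust of experts",
    "Incorrect facts": "Misinformation",
    "Misunderstanding of science": "Misinformation",
    "Biased media sources": "Misinformation",
}

# Hypothetical data: which codes were applied in which interview.
# Invented for this sketch only; not real results.
codes_by_interview = {
    "interview_01": ["Uncertainty", "Changing terminology", "Distrust of scientists"],
    "interview_02": ["Incorrect facts", "Uncertainty"],
    "interview_03": ["Fear of government control", "Distrust of scientists"],
}


def theme_prevalence(codes_per_interview, lookup):
    """Count the interviews in which each theme appears at least once."""
    counts = Counter()
    for codes in codes_per_interview.values():
        # A set ensures a theme is counted once per interview,
        # however many of its codes were applied there.
        themes_here = {lookup[code] for code in codes if code in lookup}
        counts.update(themes_here)
    return counts


prevalence = theme_prevalence(codes_by_interview, code_to_theme)
for theme, n in prevalence.most_common():
    print(f"{theme}: {n} of {len(codes_by_interview)} interviews")
```

Counts like these support the write-up but do not replace it: the findings section still needs to explain what each theme means and illustrate it with quotes from the data.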

In our example, we might argue that conspiracy thinking about climate change is widespread among older conservative voters, point out the uncertainty with which many voters view the issue, and discuss the role of misinformation in respondents’ perceptions.

Jack Caulfield

Jack is a Brit based in Amsterdam, with an MA in comparative literature. He writes and edits for Scribbr, and reads a lot of books in his spare time.

24 comments

SAMAN
September 14, 2020 at 10:22 PM

Thank you so much, it's help alot.

Chareen
September 7, 2020 at 7:00 AM

This was really helpful. Thanks

Michael
September 6, 2020 at 11:05 AM

Thanks, very helpful

Tamiru
September 3, 2020 at 1:01 PM

extending my thanks to the writer of this article, I would like him to react on my question I forwarded billow!
among the steps to be followed during thematic analysis you explained, I am a bit confused on how should I summarize my finding resulted from divergent thematized data?

david kelapile
August 20, 2020 at 4:18 PM

This was very helpful, clear and indeed articulate.

Masoud
August 20, 2020 at 11:54 AM

Thank u so much, its quite understandable and helpful.

Evie
July 26, 2020 at 5:39 AM

It is very educational

Daisy
July 17, 2020 at 6:49 PM

Very helpful indeed. Thank you so much.

Bryan
July 17, 2020 at 11:01 AM

Thanks for the information Mr. Caulfield.

I just need to know which type of coding scheme did you use on the example that you showed above? and why did you use it.

The reason why I am asking that is because other scholars wrote that in thematic analysis, the type of coding to use was not clearly specified, therefore some used open coding, others combined deductive and inductive coding, others used their own ways of coding. Braun & Clarke mentioned the approaches such as deductive or inductive, semantic or latent approaches when developing themes & codes, but they did not specify whether it is open coding etc.

Therefore it will be very helpful to me and maybe other people who are reading this article, if you respond on the question I asked, because it will also help me understand clearly the approach that you used in coding from Braun & Clarke (2006) thematic analysis in your example, and it will help me to decide which approach to use in my study when coding.

Thank you.

Shona McCombes (Scribbr-team)
August 6, 2020 at 3:18 PM

Hi Bryan,

Yes, this can be considered an example of open coding. Open coding is quite a broad term – it simply means attaching labels to parts of the data to describe or classify each part, and it can be applied in many different ways. In the terms of Braun & Clarke's scheme, our example fits with an inductive and semantic approach (because we let the data determine our themes, and we focus on the direct content of what is said).

The example in this article is a simplified version of thematic analysis designed to help understand how the process works. The specific approach you take will depend on the decisions you've made earlier in the research process (for example, based on your research questions and theoretical framework). Many approaches to thematic analysis are not consistently defined throughout the literature; therefore, it's important not just to name an approach, but to clearly explain the choices you made and the process you followed when you're describing your methodology.

I hope that helps!

Sandra Lou
July 7, 2020 at 2:26 AM

Very helpful indeed!

Kate H
July 5, 2020 at 12:43 AM

Thank you very much for this. It's very helpful. But, could you please add the sources of all of this information?
That'd be very nice.

Shona McCombes (Scribbr-team)
August 6, 2020 at 1:49 PM

Hi Kate,

The step-by-step process outlined here was developed by Virginia Braun and Victoria Clarke – the article has now been updated with links to the main sources. I hope that helps!

Ntombekhaya
June 28, 2020 at 5:36 PM

Greetings

This assisted me a lot. Thanks very much.

sabrina
June 14, 2020 at 12:10 PM

SUPER HELPFUL THANKYOU

Chris
June 4, 2020 at 12:27 AM

Really helpful, much appreciated!

Ioakeim
May 18, 2020 at 2:28 PM

Very helpful!!

Annie
December 14, 2019 at 4:43 AM

This has been very informative and quite helpful in my grad school research process. Thank you!

Souhail
December 12, 2019 at 11:46 PM

Thank you, this article was helpful, I'd say, if you talk about how the ideas and themes are frequent and to decide frequency, the article would be much more rounded. Thanks

Patience
August 13, 2020 at 3:03 PM

This was helpful. However am still struggling identifying themes for my research questions. Would you mind assist

Sasha
December 9, 2019 at 1:35 PM

hi~ this was super helpful :) thank you so much
