You’re staring at pages of interview transcripts, focus group notes, or open-ended survey responses, and the overwhelming question hits: “How on earth do I make sense of all this?” We’ve all been there—qualitative data can feel like a chaotic mess of words when you first dive in. Unlike quantitative research where you can punch numbers into SPSS and generate neat tables, thematic analysis demands that you actively engage with your data, interpret meaning, and construct themes that genuinely represent what your participants are saying. It’s intellectually demanding work, but here’s the truth: once you understand the systematic process behind thematic analysis, it transforms from an intimidating mountain into a manageable, even rewarding, research journey.
Thematic analysis has become the go-to method for qualitative research across disciplines—from psychology and sociology to health sciences and business studies. Its popularity stems from its flexibility and accessibility, particularly for researchers new to qualitative methods. But flexibility doesn’t mean “anything goes.” Proper thematic analysis follows a rigorous, transparent process that ensures your findings are credible, trustworthy, and defensible when your supervisor inevitably asks, “But how did you actually get to these themes?”
This guide walks you through thematic analysis step-by-step, complete with practical coding examples you can adapt for your own research. We’ll cover everything from understanding what codes and themes actually are (spoiler: they’re not the same thing), to navigating the six phases of analysis, to avoiding the common pitfalls that trip up even experienced researchers. Whether you’re coding your first interview or refining themes for your dissertation, this comprehensive walkthrough provides the practical tools and theoretical grounding you need.
What Exactly is Thematic Analysis and Why Should You Use It?
Thematic analysis is a qualitative research method that identifies, analyses, and reports patterns (themes) within data. At its core, it’s about systematically organising and interpreting qualitative information to uncover meaningful patterns that answer your research questions. Think of it as a way of seeing and making sense of shared meanings and experiences across a dataset.
Here’s what makes thematic analysis particularly valuable for student researchers: it’s theoretically flexible, meaning you can use it within various epistemological frameworks without being locked into a specific theoretical paradigm like grounded theory or phenomenology. It’s also more accessible than many other qualitative approaches, making it ideal when you’re learning qualitative research methods for the first time.
Coding is the foundation of thematic analysis. A code is simply a label that captures the meaning of a piece of data—it’s the smallest meaningful unit of analysis. Codes are systematic categorisations of excerpts in your qualitative data that help you identify themes and patterns. Importantly, codes are NOT the same as themes. Codes are individual labels you apply to fragments of data, whilst themes are larger concepts that capture meaningful patterns and represent something significant about the data in relation to your research questions.
Understanding this distinction matters because it prevents one of the most common mistakes in thematic analysis: presenting a list of codes and calling them “themes.” Themes are interpretive constructs that you actively generate through analysing patterns across codes—they’re not just topic summaries or category labels.
Why bother with systematic coding? Beyond the obvious requirement that your methodology needs to be rigorous, proper coding ensures data validity through transparent analysis that other researchers can review. It increases credibility by demonstrating your analysis was undertaken systematically rather than cherry-picking quotes that support your predetermined ideas. It decreases bias by making you consciously aware of potential biases in your data analysis process. And critically, it helps you accurately represent your participants by ensuring your analysis genuinely reflects your participant base rather than over-representing particularly articulate individuals or opinions you personally agree with.
How Do You Choose Between Inductive, Deductive, and Mixed Coding Approaches?
Before you start slapping codes on data, you need to decide your fundamental approach. This isn’t just a methodological checkbox—it shapes how you engage with your data and what kind of themes you’ll generate.
Deductive coding (top-down approach) means you start with pre-established codes developed BEFORE engaging with your data. These codes come from your research questions, existing literature, or previous research frameworks, and you systematically apply them to your new data. For instance, if you’re evaluating a student wellbeing programme, you might pre-establish codes like “mental health support,” “academic stress,” and “peer relationships” based on your programme evaluation framework.
The advantage? It’s efficient and guarantees your areas of interest get coded. The disadvantage? You risk confirmation bias and might miss valuable insights that fall outside your predetermined focus. It’s like searching for something with a torch—you see what you illuminate but miss everything in the darkness.
Inductive coding (bottom-up approach) means your codes develop FROM the data itself with no pre-existing codebook. You derive codes as you read and analyse, constructing patterns from what you find rather than from prior expectations. This iterative process involves breaking your dataset into samples, reading carefully, creating codes as ideas appear, then rereading and refining those codes through multiple passes.
Inductive coding provides a comprehensive, unbiased look at themes and proves particularly valuable for exploratory research on new topics. However, it’s time-intensive and requires patience through multiple iterations. You might feel like you’re drowning in codes initially, but trust the process—organisation emerges.
The hybrid or abductive approach combines both strategies, and here’s a secret: this is what most researchers actually do in practice, even if they don’t explicitly say so. You start with a priori codes from existing theory (deductive), but remain open to adding new codes as they emerge from your data (inductive). This provides structure whilst preserving flexibility—the best of both worlds.
For student researchers, the hybrid approach often works best. It gives you enough structure to avoid feeling completely lost whilst allowing genuine discovery. Your literature review naturally provides some initial concepts to look for, but you’re not forcing data into predetermined boxes.
What Are the Six Phases of Braun and Clarke’s Thematic Analysis?
Virginia Braun and Victoria Clarke’s six-phase framework has become the gold standard for conducting thematic analysis. These phases aren’t rigid sequential steps—you’ll loop back and forth as your analysis develops. Understanding this non-linear nature prevents frustration when you realise you need to revisit earlier phases.
Phase 1: Familiarising Yourself With the Data
Immerse yourself completely in your data through multiple readings. If you’ve conducted interviews yourself, transcription becomes part of familiarisation (allow roughly 15 minutes transcription time per 5 minutes of audio—yes, it takes that long). Even if someone else transcribed, read and re-read your entire dataset from start to finish. Number the lines in your transcripts for easy reference later.
During familiarisation, take initial notes—both analytical observations and intuitive reactions. What strikes you? What surprises you? What patterns might be emerging? These early impressions matter. Use memos to document your thoughts and reflect on emerging interpretations. The purpose here is understanding the depth and breadth of content before you start coding.
The challenge? It’s tempting to rush this phase and jump straight to coding. Resist. Proper familiarisation makes the actual coding process faster and more insightful because you understand the whole dataset’s context.
Phase 2: Generating Initial Codes
Now you systematically code interesting features across your entire dataset. Break the data into smaller, meaningful pieces with descriptive labels (codes). Your first-pass codes should be broad, loose, and provisional—they’ll be refined later. Code everything that seems even remotely interesting; you can eliminate irrelevant codes later more easily than you can remember what you skipped.
Here are the main coding methods you can use:
In Vivo Coding uses participants’ own words as codes. If someone says “I just felt completely overwhelmed,” your code might be “felt completely overwhelmed.” This stays close to participants’ intent and proves particularly useful in cross-cultural research.
Process Coding uses action-based codes with gerunds (-ing words) like “managing workload,” “seeking support,” or “avoiding confrontation.” This captures actions and behaviours, making it valuable for understanding workflows and activity patterns.
Descriptive Coding summarises content into single words or short phrases. You might code a segment discussing campus libraries simply as “library facilities” or “study spaces.” This proves especially useful for organising large datasets by topic.
Values Coding captures participants’ worldviews, values, attitudes, and beliefs. Look for segments where participants express what they feel, think, or need. Code phrases like “commitment to work-life balance” or “valuing academic achievement over social life.”
The output from this phase? A comprehensive set of initial codes with relevant data extracts collated for each code. Don’t worry if it feels messy—that’s normal.
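Whether you code on paper or in software, the underlying record from this phase is the same: each code collects the data extracts it labels. A minimal Python sketch of that structure (all code labels, participant IDs, line numbers, and extracts here are hypothetical):

```python
# Collating initial codes: each code label maps to the extracts it covers,
# keeping a transcript and line reference for each so you can trace it back.
codes = {}

def apply_code(codes, label, transcript, line, extract):
    """Attach an extract to a code, creating the code on first use."""
    codes.setdefault(label, []).append(
        {"transcript": transcript, "line": line, "extract": extract}
    )

apply_code(codes, "felt completely overwhelmed", "P01", 42,
           "I just felt completely overwhelmed by the deadlines")
apply_code(codes, "seeking support", "P01", 58,
           "so I went to see my tutor about it")
apply_code(codes, "seeking support", "P02", 17,
           "my flatmate talked me through the assignment")
```

This is roughly the record a CAQDAS package maintains for you behind the scenes: a code, its extracts, and a way back to the source line.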
Phase 3: Searching for Themes (Generating Initial Themes)
Now you shift from the micro level (individual codes) to the macro level (overarching themes). Collate your codes into potential themes by looking for similarities, connections, and patterns. Themes should encompass multiple codes, and you might develop sub-themes for complex areas.
Here’s crucial language: you are GENERATING themes, not “discovering” them. Themes don’t passively wait in your data like Easter eggs. You actively construct them through interpretation. Research is a meaning-making process, and themes reflect your analytical decisions about what matters in the data.
Think of themes like planets with codes as moons orbiting them. For instance, individual codes like “communication challenges,” “team tension,” and “productivity changes” might orbit a broader theme of “Shifting Dynamics in Workplace Relationships.”
Phase 4: Reviewing Themes
Quality control time. Conduct a two-level review:
Level 1: Review the coded data extracts within each theme. Do they fit together coherently? Does this theme make internal sense?
Level 2: Review theme validity against your entire dataset. Read through all your data again with your themes in mind. Does each theme make sense holistically? Is it well-supported across the dataset?
Consider splitting themes containing too much variation. Consider merging themes that overlap significantly. Consider discarding themes with insufficient data support. Ask yourself: Does this theme capture something important about the data? Does it relate to my research questions? Does it contain sufficient supporting data? Is it too broad or too specific?
Don’t be discouraged if you need to loop back to Phases 2 or 3. This iterative process strengthens your analysis.
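One small part of this review can be made mechanical: checking that every code you generated has actually been placed under a theme, so nothing falls through the cracks. A sketch of that coverage check (theme and code names are invented for illustration):

```python
# Phase 4 housekeeping: flag any codes not yet assigned to a theme.
themes = {
    "Self-Regulated Learning Challenges": {
        "motivation challenges", "self-discipline struggles"},
    "Enhanced Learning Resource Control": {
        "appreciating lecture accessibility"},
}
all_codes = {"motivation challenges", "self-discipline struggles",
             "appreciating lecture accessibility", "flexibility issues"}

assigned = set().union(*themes.values())
unassigned = all_codes - assigned
print(sorted(unassigned))  # ['flexibility issues']
```

An unassigned code isn't automatically a problem: it might belong in a new theme, or it might be one you decide to discard, but you should make that decision consciously rather than by omission.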
Phase 5: Defining and Naming Themes
Define each theme clearly—you should be able to describe what each theme is about in one to two sentences. What’s the essence it captures? If you struggle to name or describe a theme, it probably needs refinement.
Choose names that reflect meaningful concepts, not just topic summaries. “Blurred Boundaries Between Work and Home Life” captures more analytical depth than simply “Remote Work Flexibility.”
Creating clear definitions often prompts you to reconsider or refine themes. That’s a feature, not a bug. The struggle to articulate means you’re doing proper analytical work.
Phase 6: Writing Up (Producing the Report)
Provide detailed descriptions of each theme with illustrative quotes from your data. But don’t just string quotes together—provide interpretation and analysis between quotes. Explain WHY you’ve interpreted data in particular ways. Discuss broader implications of your findings. Relate your analysis back to your research questions and relevant literature.
Select vivid, compelling examples that genuinely exemplify each theme. Your write-up should present a clear, logical account that demonstrates themes are relevant to your dataset with sufficient evidence.
How Do You Actually Code Qualitative Data? A Practical Example
Let’s work through a concrete coding example so you can see these methods in action. Imagine you’re researching university students’ experiences of online learning during the shift to remote education.
Sample Interview Transcript:
“Right, so when lectures moved online, I initially thought it would be easier because I wouldn’t have to commute. But honestly, I found it really hard to stay motivated. Like, I’d start a recorded lecture and then get distracted by my phone or housemates. The flexibility was great in theory, but I needed more structure than I realised. I did appreciate being able to rewatch lectures though—that helped when I didn’t understand something the first time.”
Applying Different Coding Methods:
In Vivo Coding (using participant’s exact words):
- “really hard to stay motivated”
- “get distracted”
- “needed more structure”
- “appreciate being able to rewatch”
Process Coding (action-oriented with gerunds):
- “adapting to online learning”
- “experiencing distraction”
- “managing motivation”
- “utilising lecture recordings”
Descriptive Coding (summarising topics):
- “motivation challenges”
- “flexibility issues”
- “learning benefits”
- “environmental distractions”
Values Coding (capturing beliefs/values):
- “valuing structure in learning”
- “appreciating lecture accessibility”
You’d continue this process across all your interviews, building up codes systematically. Notice how the same data segment can be coded differently depending on your method and analytical focus—there’s no single “correct” code. What matters is consistency and that your codes genuinely capture something meaningful in relation to your research questions.
As you code more interviews, you’d notice patterns. Perhaps “motivation challenges,” “self-discipline struggles,” and “time management difficulties” all cluster together, suggesting a broader theme around “Self-Regulated Learning Challenges.” Meanwhile, “appreciating lecture accessibility,” “benefiting from flexible scheduling,” and “accessing materials repeatedly” might form a theme around “Enhanced Learning Resource Control.”
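The clustering described above often starts with something as simple as counting how often each code recurs across participants. A sketch using hypothetical codes and participants:

```python
from collections import Counter

# Codes applied across several interviews; counting recurrence helps you
# spot candidate clusters that may suggest a broader theme.
coded = {
    "P01": ["motivation challenges", "self-discipline struggles"],
    "P02": ["time management difficulties", "motivation challenges"],
    "P03": ["appreciating lecture accessibility", "motivation challenges"],
}

frequency = Counter(code for codes in coded.values() for code in codes)
print(frequency.most_common(1))  # [('motivation challenges', 3)]
```

Frequency alone never makes a theme—a pattern mentioned by every participant can still be analytically thin—but it's a useful prompt for where to look more closely.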
| Coding Stage | Example Elements | Analysis Focus |
|---|---|---|
| Initial Codes | “motivation challenges,” “distraction,” “rewatch lectures,” “flexibility issues” | Labelling specific data segments systematically |
| Focused Codes | “self-regulation difficulties,” “learning resource benefits,” “structural needs” | Refining and consolidating initial codes |
| Themes | “Self-Regulated Learning Challenges,” “Enhanced Control Over Learning Resources” | Identifying broader patterns and meaning |
| Sub-themes | Under “Self-Regulated Learning”: “motivation management,” “environmental distraction,” “time structure needs” | Adding nuance and complexity to themes |
What Role Does Reflexivity Play in Quality Thematic Analysis?
Here’s something they don’t always emphasise enough in methodology textbooks: you, the researcher, are a research instrument. Your background, experiences, assumptions, and theoretical perspectives inevitably shape how you interpret data. Reflexivity isn’t about eliminating this influence (impossible) but accounting for it transparently.
Braun and Clarke specifically named their approach “reflexive thematic analysis” to emphasise this component. When you generate themes, you must narrate the assumptions driving your interpretation. Why did you interpret data in that particular way? What theoretical, social, cultural, or political contexts influenced your analytical decisions?
Maintain a reflexive journal documenting your thoughts, reactions, and decision-making throughout your research process. When you notice yourself strongly agreeing or disagreeing with a participant, note it. When you make analytical decisions, document your reasoning. This creates an audit trail demonstrating the transparency of your research process.
For instance, if you’re researching student experiences of academic pressure and you’ve personally experienced anxiety around assessment, acknowledge how this might make you particularly attuned to anxiety-related codes. It doesn’t invalidate your analysis—it makes it transparent. Reflexivity enhances trustworthiness by showing you’re consciously aware of potential biases rather than pretending objective neutrality (which doesn’t exist in interpretive research anyway).
How Can You Ensure Your Thematic Analysis is Trustworthy and Rigorous?
Student researchers often worry: “How do I know my analysis is good enough?” Lincoln and Guba’s four trustworthiness criteria provide clear standards:
Credibility (truth value): Do your findings reflect an accurate picture of the phenomenon? Strategies include triangulation (using multiple data sources or methods), prolonged engagement with your data, persistent observation of patterns, and peer debriefing where colleagues review your analytical decisions.
Transferability (applicability): Can readers judge whether your findings apply to similar contexts? Achieve this through thick description—rich, detailed accounts providing sufficient context about your participants, setting, and research conditions. You’re not claiming generalisability, but enabling readers to assess applicability.
Dependability (consistency): Is your research process well-documented? Maintain an audit trail showing your analytical journey. Document your coding decisions, theme development, and refinements systematically. If someone reviewed your process, could they follow your analytical path?
Confirmability (neutrality): Do findings emerge from your data rather than researcher bias? Ground your themes in direct quotes from participants. Use peer debriefing to check interpretations. Practice reflexivity to acknowledge your influence. The goal isn’t eliminating researcher influence but demonstrating findings are firmly rooted in data.
Practical strategies strengthening trustworthiness include creating a detailed codebook tracking each code’s definition and when to apply it, conducting regular team meetings if you’re working collaboratively, and seeking feedback from supervisors at key analytical stages. When you encounter data contradicting your emerging themes (negative cases), don’t ignore it—discuss it. This demonstrates analytical nuance and honesty.
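A codebook can live in a table or a spreadsheet, but its essential fields are few. One way to sketch an entry in Python (the fields shown are a common convention, not a fixed standard, and the example content is hypothetical):

```python
from dataclasses import dataclass

@dataclass
class CodebookEntry:
    """One code's definition and application rule, for consistency over time."""
    code: str
    definition: str
    when_to_apply: str
    example: str

codebook = [
    CodebookEntry(
        code="seeking support",
        definition="Participant describes actively looking for help with a problem.",
        when_to_apply="Use only when the participant initiates contact; "
                      "unsolicited help is coded separately.",
        example="'so I went to see my tutor about it'",
    ),
]
```

The "when to apply" rule is the part that combats definitional drift: it records the boundary decisions you made early on so you apply the code the same way in interview twenty as in interview one.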
What Common Mistakes Should You Avoid in Thematic Analysis?
Even experienced researchers fall into certain traps. Here’s what to watch for:
Mistaking topic summaries for themes. “Social media use” isn’t a theme—it’s a topic. “Social media as both connection and isolation for international students” captures a theme’s interpretive depth.
Using “themes emerged” language. This implies themes passively appear rather than being actively constructed through your interpretive work. Say “I developed themes” or “I generated themes” instead.
Overcoding with too many specific codes. If you have 200 codes for 10 interviews, you’ve probably gone too specific. Balance granularity with manageability.
Undercoding with too few broad codes. If everything fits under five massive codes, you’re missing nuance and complexity.
Inconsistent coding over time. Your understanding evolves as you code, leading to “definitional drift” where you apply codes differently at the end versus the beginning. Combat this by regularly reviewing earlier coding decisions and maintaining clear code definitions.
Forcing data into predetermined themes. Particularly risky with deductive approaches. If data doesn’t fit your expected themes, don’t force it. Be willing to adapt your analytical framework.
Presenting themes with insufficient evidence. Each theme needs multiple supporting extracts across different participants. One vivid quote doesn’t make a theme.
What Tools and Software Can Help With Coding?
You’ve got options ranging from low-tech to sophisticated software:
Manual coding with printed transcripts, highlighters, and scissors allows tactile engagement with data. It’s intuitive and flexible but unmanageable for large datasets and impossible for remote collaboration.
Microsoft Word using the comment function for codes and separate documents for each code provides more organisation. It’s accessible but still involves tedious copying and pasting.
Excel or Google Sheets with one excerpt per row and code columns enables filtering and searching. This works well if you’re comfortable with spreadsheets and have structured data.
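The spreadsheet layout above (one excerpt per row, one code column) is easy to mirror in a plain CSV file, which also makes filtering scriptable. A sketch with hypothetical rows:

```python
import csv
import io

# One excerpt per row with a code column, mirroring the spreadsheet layout.
rows = [
    {"participant": "P01", "excerpt": "really hard to stay motivated",
     "code": "motivation challenges"},
    {"participant": "P02", "excerpt": "I kept checking my phone",
     "code": "environmental distractions"},
    {"participant": "P03", "excerpt": "couldn't make myself start",
     "code": "motivation challenges"},
]

buffer = io.StringIO()  # stands in for a real .csv file on disk
writer = csv.DictWriter(buffer, fieldnames=["participant", "excerpt", "code"])
writer.writeheader()
writer.writerows(rows)

# Filtering by code = applying a spreadsheet filter on the code column.
buffer.seek(0)
motivation = [r for r in csv.DictReader(buffer)
              if r["code"] == "motivation challenges"]
print([r["participant"] for r in motivation])  # ['P01', 'P03']
```

Adding columns for demographics or interview metadata lets you ask simple comparative questions (do first-years mention this code more than finalists?) without specialist software.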
Qualitative Data Analysis Software (CAQDAS) like NVivo, ATLAS.ti, MAXQDA, or Delve offers purpose-built functionality. These tools handle large datasets efficiently, enable demographic filtering, facilitate team collaboration, and provide sophisticated query capabilities. The learning curve varies—some are more intuitive than others—but for dissertation-scale projects, they’re worth it.
Most universities provide free access to at least one CAQDAS package through institutional licences. Take advantage of training workshops offered by your library or research methods centre. The initial time investment learning the software pays dividends when you’re managing 20+ interviews.
AI-assisted tools using natural language processing are emerging but require careful application. They can accelerate initial coding, but human oversight remains critical. Never let automated coding replace your deep engagement with data—use it to complement, not replace, your analytical thinking.
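To see why human oversight matters, consider a deliberately naive suggestion engine: keyword matching. Real NLP tools are far more sophisticated, but the division of labour is the same—the machine proposes, the analyst decides. All keywords and code labels below are hypothetical:

```python
# A naive keyword matcher illustrating machine-suggested codes.
# A human analyst must still accept, edit, or reject every suggestion:
# "motivated" and "unmotivated" would both match "motivat", for instance.
SUGGESTIONS = {
    "motivat": "motivation challenges",
    "distract": "environmental distractions",
    "rewatch": "learning resource benefits",
}

def suggest_codes(excerpt):
    """Return candidate codes whose keyword stem appears in the excerpt."""
    text = excerpt.lower()
    return [code for stem, code in SUGGESTIONS.items() if stem in text]

print(suggest_codes("I'd get distracted and lose motivation"))
# ['motivation challenges', 'environmental distractions']
```

Notice what this sketch cannot do: it has no sense of context, irony, or meaning, which is exactly the interpretive work that remains yours.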
Building Your Analytical Confidence
Working through thematic analysis step by step, with example codes in hand, isn’t just about following a recipe—it’s about developing your analytical eye and building confidence in your interpretive decisions. The six phases provide structure, but real analytical skill comes from practice, reflexivity, and genuine engagement with what your participants are telling you.
Start small if you’re new to this. Code a single interview thoroughly using different methods to see which feels most natural. Compare your codes with a peer doing the same interview—differences are learning opportunities, not failures. Build your analytical muscles gradually, and don’t expect perfection on your first attempt. Even experienced qualitative researchers continue refining their practice.
Remember that qualitative research celebrates the researcher as an active interpreter, not a neutral recording device. Your unique perspective, informed by reflexivity and systematic methods, is what transforms raw data into meaningful insights. Trust the process, document your decisions, and engage deeply with what participants share.
The journey from overwhelming transcripts to coherent themes isn’t always linear, but it’s immensely rewarding when you construct an analysis that genuinely represents participants’ experiences whilst offering new insights into your research questions. That’s the power of rigorous thematic analysis.