
Generative AI and Qualitative Research: When the Narrative of Efficiency Meets the Boundaries of Interpretation

The contrast between human interpretive work and computational code reflects the tension between qualitative meaning-making and generative AI’s promise of efficiency

Generative AI is increasingly used in qualitative research, yet its promise of efficiency raises fundamental questions about interpretation, meaning-making, and the role of the researcher. This article is based on the academic presentation "Interpreting Qualitative Data with AI: Pitfalls and Potential" by Duc Nguyen and Catherine Welch (November 26, 2025). It systematically examines the transformations, challenges, and reflections that generative artificial intelligence has triggered in the field of qualitative research.


The Technological Landscape: Promises of Revolution and Conceptual Confusion Under the Narrative of Efficiency

We are in the midst of a generative AI-driven revolution, whose core promise is to achieve deep analysis at an unprecedented speed. The current technological ecosystem presents a diverse and dynamic picture: traditional Computer-Assisted Qualitative Data Analysis Software (CAQDAS) providers such as ATLAS.ti, MAXQDA, and NVivo are infusing their tools with "intelligence" by integrating OpenAI's technology. At the same time, a wave of AI-native analysis platforms such as CoLoop, QInsights, and AILYZE has emerged as new entrants. Furthermore, individual researchers exploring the creation of customized GPTs or directly using general-purpose Large Language Models (LLMs) such as ChatGPT, Copilot, and Gemini constitute another vital part of this ecosystem. It is noteworthy that attempts at automated analysis based on machine learning (e.g., Qualrus in 2002) have existed for a long time, but the current wave centered on generative AI far exceeds previous ones in both scale and ambition.


Behind this fervor lies a widespread enthusiasm among researchers, primarily driven by efficiency. This enthusiasm focuses on automated thematic coding and on scaling up qualitative research designs, and it is often spearheaded by computer scientists or quantitative researchers. However, a fundamental conceptual confusion lurks within: equating textual analysis with qualitative analysis. For instance, some research (Garcia Quevedo et al., 2025) proposes using LLM-based algorithms to efficiently explore and analyze large online datasets while preserving deep contact with the data and maintaining the richness of qualitative analysis. This formulation reflects an optimistic vision that directly links large-scale automated processing with the depth and richness cherished by qualitative inquiry. This confusion is also mirrored in academic publishing, where journal editors have proposed ideal-typical roles for generative AI as a research assistant, data analyst, or even co-author (Glaser & Gehman, 2024), and have argued that embracing AI has become an imperative for maintaining disciplinary relevance and advancing knowledge (Andrews et al., 2026).


The Essence of the Technology: Technical Reality and the Inherent Flaws of the Statistical Model

To understand the potential and limitations of generative AI, one must first clarify its technical nature. Generative AI is often conflated with other AI technologies. As Bender and Hanna (2025) point out, "AI" is largely a marketing term. Generative AI tools such as ChatGPT and Gemini are based on Large Language Models, a specific class of statistical language models that predict the next token. Trained on massive datasets, they can generate synthetic content resembling human creativity, but their outputs should not be equated with human-like reasoning or intelligence. These outputs should be understood as the product of a sophisticated statistical pattern-matching exercise that reproduces and recombines encoded patterns. Crucially, the statistical model does not store any word strings or answers, and thus the "outputs" are not "results" but samples drawn from a probability distribution. This means these models do not understand text and cannot perform activities requiring comprehension, such as interpreting, analyzing, extracting meaning, or identifying themes.
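

To make the point about sampling concrete, the following is a minimal, purely illustrative Python sketch, not any vendor's actual implementation: the toy vocabulary, the logit values, and the temperature setting are all invented for illustration. At each step, a language model converts scores over its vocabulary into a probability distribution and draws the next token from it, which is why identical prompts can yield different outputs and why reliability and reproducibility cannot be guaranteed.

```python
# Illustrative sketch only: a toy "next token" step of a statistical language model.
# The vocabulary, logits, and temperature below are invented for demonstration.
import math
import random

vocab = ["themes", "codes", "meaning", "data", "analysis"]   # hypothetical vocabulary
logits = [2.1, 1.7, 0.4, 1.2, 0.9]                           # hypothetical scores from the model

def sample_next_token(logits, temperature=0.8):
    # Softmax with temperature: turn scores into a probability distribution.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # The output is a random draw from this distribution, not a stored answer.
    return random.choices(vocab, weights=probs, k=1)[0], probs

token, probs = sample_next_token(logits)
print("sampled next token:", token)
print("distribution:", {w: round(p, 3) for w, p in zip(vocab, probs)})
```

Running the snippet several times can return different tokens from the same distribution, mirroring why two identical prompts to a chatbot need not produce identical "analyses".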


Separating marketing claims from research reality reveals some inherent issues with this statistical model: insufficient explainability, inherent biases, and an inability to achieve reliability and reproducibility. Crucially, these are features of the model, not bugs. Subsequent technological developments, such as Retrieval-Augmented Generation (RAG), agents, and world models, cannot fundamentally solve these problems because such models are not designed to establish facts. As Thornton (2023) states, an output "does not need to be 'correct', it only needs to 'appear correct'". This explains why OpenAI's developers warned of the need to take great care when using language model outputs, and why Google CEO Sundar Pichai cautioned against blindly trusting AI because it is prone to errors. In essence, the relationship between the researcher, the predictive statistical language model, and the empirical data is this: the researcher obtains synthetic text outputs from the model through prompts, rather than engaging in a direct dialogue with the actual object of analysis, the empirical data.


Practical Pitfalls: The Illusion of Efficiency, the Oracle Effect, and the Verification Loop

In practice, the application of generative AI in qualitative analysis mainly revolves around two use cases: automation and augmentation. However, both come with significant pitfalls. Automated outputs are often presumed to be neutral and objective because of their technological origins, and the academically formatted text they generate is easily endowed with unwarranted authority, leading to the "oracle effect": the tendency to attribute to LLMs an intelligence comparable to, or even surpassing, that of humans. We unconsciously anthropomorphize chatbots, assuming communicative intent behind their outputs, and perceive them as capable research assistants, colleagues, or even friends. Yet each interaction produces an ever-expanding body of synthetic content that requires validation, rendering any promise of efficiency gains an illusion. Faced with this sheer volume of synthetic output, research scalability also becomes unattainable.


The capabilities of existing tools have clear boundaries. Taking NVivo 15's AI Assist as an example, it primarily offers text summarization and automated suggestions for child codes. Crucially, these are descriptive codes, not interpretive or analytical ones. As its trainer Ben Meehan put it, it "hasn't got into the high-level analysis". Other mainstream CAQDAS providers are equally candid: MAXQDA states that the outputs of its AI functions are meant to serve as interpretive assistance, not as a source of truth, and ATLAS.ti warns users outright that AI coding may produce biases, hallucinations, and other undesired outputs.


Testing dedicated "Qual-AI" platforms reveals the problems further. Thematic analysis of a single interview transcript often yields broad, superficial descriptive themes that lack depth and insight. In conversational use, when the model is asked a specific question, it tends to generate another vague summary rather than a precise answer. To judge whether an answer is relevant and accurate, the researcher must already be intimately familiar with the data and must spend additional effort verifying the output. Ironically, interacting with the chatbot does not bring the researcher closer to the data; instead, it inserts a layer of synthetic text that requires continuous interpretation and verification, potentially trapping the researcher in an endless loop of validating model outputs and distancing them from the true object of analysis.


Return and Reflection: Revisiting the Soul of Qualitative Research Amid the Technological Wave

The true potential of this technological moment may lie not in realizing the utopia of automated analysis but in forcing the research community to rediscover and anchor the interpretive nature of qualitative research. Qualitative research is a process of meaning-making, not information processing. It demands that the researcher be relational, embodied, context-sensitive, culturally or temporally embedded, ethical, reflexive, experiential, empathetic, critically introspective, positioned, and engaged. These human qualities constitute the irreducible core of qualitative analysis.


This stance is echoed by methodological scholars. For instance, the founders of Reflexive Thematic Analysis have explicitly stated that their method is more than data labelling, thereby drawing a clear line against approaches that reduce analysis to automated label generation. The challenge posed by generative AI thus becomes a crucial moment for reflection: do we allow a technology that is essentially statistical pattern-matching to redefine our pursuit of understanding the complexity of human experience? The answer lies in making wise distinctions and choices. Technology can serve as an auxiliary tool for certain managerial tasks, but it must never replace the researcher's core role as the interpreting subject. Cultivating critical AI literacy, understanding the technology's foundations and limitations, and adhering to research ethics throughout, especially regarding data confidentiality, matter more than blindly embracing efficiency. Ultimately, the legacy of this technological shockwave may be to help us reaffirm, more firmly than before, that the depth, humanistic quality, and soul of qualitative research are its most valuable assets and the reason for its irreplaceability.




Link to the video of the presentation:



References:


Bender, E. M., & Hanna, A. (2025). The AI Con: How to Fight Big Tech’s Hype and Create the Future We Want.


Garcia Quevedo, D., Glaser, A., & Verzat, C. (2025). Enhancing theorization using artificial intelligence: Leveraging large language models for qualitative analysis of online data. Organizational Research Methods, 29(1), 92–112. https://doi.org/10.1177/10944281251339144


Glaser, V. L., & Gehman, J. (2023). Chatty actors: Generative AI and the reassembly of agency in qualitative research. Journal of Management Inquiry.


Thornton, I. (2023). A special delivery by a fork: Where does artificial intelligence come from? New Directions for Evaluation, 23–32. https://doi.org/10.1002/ev.20560








 
 
 


