Wikipedia Deep Dive

Grounded theory

In the early 1960s, two sociologists were watching people die.

Barney Glaser and Anselm Strauss had embedded themselves in hospitals to study how patients, families, and medical staff navigated the process of dying. What they discovered wasn't just insight into mortality—it was a fundamentally different way of doing science. Instead of arriving with hypotheses to test, they let the data speak first. Patterns emerged. Theories crystallized from the ground up, not from the ivory tower down.

They called their approach "grounded theory," and it would go on to transform fields from psychology to business management. But more importantly for our purposes, it offers a powerful mental model for anyone trying to make sense of messy, human-generated data—including those of us working with large language models.

The Scientific Method, Inverted

Traditional scientific research follows a familiar script. You start with existing theories. You derive hypotheses from those theories. You design experiments to test whether your hypotheses hold up. Then you collect data and see if reality matches your predictions.

Grounded theory flips this sequence entirely.

You begin by collecting data—usually qualitative data like interviews, observations, or documents. As you review this raw material, patterns start to emerge. Concepts bubble up from the text itself. You tag these concepts with codes, short labels that capture recurring ideas. As you gather more data, you group related codes into categories. Those categories eventually become the scaffolding for a new theory.

The contrast is stark. Traditional science is deductive: you start with general principles and work toward specific predictions. Grounded theory is inductive: you start with specific observations and work toward general principles. It's the difference between asking "Does my theory explain what I'm seeing?" and asking "What theory would best explain what I'm seeing?"

Why Dying Patients Changed Everything

The book that emerged from Glaser and Strauss's hospital research, Awareness of Dying (1965), did something remarkable for its time: it helped legitimize qualitative research.

By the 1960s, the social sciences had developed a serious case of physics envy. Quantitative methods—surveys, statistics, controlled experiments—had accumulated so much prestige that qualitative work was increasingly dismissed as soft, unrigorous, even unscientific. Researchers who wanted to sit in hospitals and listen to people talk about death were seen as doing something less than real science.

Glaser and Strauss pushed back hard. Their 1967 follow-up book, The Discovery of Grounded Theory, made three arguments. First, the gap between social science theory and actual empirical evidence had grown too wide; theories needed to be anchored more firmly in what researchers could actually observe. Second, there was a coherent logic to building theory from data rather than testing pre-existing frameworks. Third, and most critically, carefully conducted qualitative research deserved to be taken seriously.

That third point was the real provocation. And it worked. From medical sociology to psychiatry to education to business management, grounded theory spread into dozens of disciplines. Researchers finally had a methodological framework that honored the messiness of human experience while still meeting rigorous standards.

Two Minds, One Method

The collaboration between Glaser and Strauss worked partly because they came from such different intellectual traditions.

Glaser had trained in the positivist tradition, which emphasizes systematic observation, precise measurement, and the careful classification of phenomena. He brought a rigorous approach to coding—the process of labeling chunks of qualitative data with descriptive tags. His instinct was to create structure: codes should be organized into categories, categories into hierarchies, hierarchies into theories. Every step should be methodical and traceable.

Strauss, on the other hand, had steeped himself in symbolic interactionism. This is a branch of sociology focused on how people create meaning through their interactions with each other. For symbolic interactionists, reality isn't simply "out there" waiting to be discovered. Instead, humans actively construct their understanding of the world through the symbols they share—language, gestures, cultural artifacts. Strauss brought an appreciation for complexity, for the richness of lived experience, for the way social processes unfold over time.

Together, these two perspectives created a methodology that was both systematic and sensitive to human meaning-making. Grounded theory aims to understand how individuals interpret their circumstances and how those interpretations shape their behavior. It looks for the interrelationship between meaning and action.

What the Researcher Actually Does

So what does grounded theory look like in practice? The researcher typically begins with a broad question—or sometimes just with data, collected before any specific question has been formulated. The key discipline is to avoid imposing preexisting frameworks on what you're seeing.

Let's say you're interviewing software engineers about how they evaluate whether their machine learning systems are working correctly. You sit down with your first transcript. You read it line by line, marking passages that seem significant. You give each passage a short code—a label that captures its essence. "Comparing outputs to gold standard." "Gut-checking for reasonableness." "Worried about edge cases."

As you work through more interviews, you start noticing that some codes cluster together. Several engineers mention variations of "sanity checking"—quick informal tests they run to catch obvious failures. That cluster might become a category. Other codes point toward "systematic benchmarking"—more formal evaluation against established datasets. Another category.

You keep collecting data and refining your categories. Eventually, you notice something deeper: perhaps there's a fundamental tension between quick informal validation and rigorous systematic testing. Engineers navigate this tension differently depending on their context. That observation might become the core of an emerging theory about how practitioners balance speed against rigor in machine learning evaluation.
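The coding steps above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed grounded theory tool: the participant IDs, the extra codes, and the code-to-category mapping are all hypothetical, and in real work the mapping is something the analyst revises by hand as more interviews come in.

```python
from collections import defaultdict

# Hypothetical coded passages: (participant, code). Some codes echo the
# examples in the text; the participants are invented for illustration.
coded_passages = [
    ("P1", "comparing outputs to gold standard"),
    ("P1", "gut-checking for reasonableness"),
    ("P2", "worried about edge cases"),
    ("P2", "eyeballing a few samples"),
    ("P3", "running a public benchmark"),
    ("P3", "gut-checking for reasonableness"),
]

# The analyst's current (revisable) grouping of codes into categories.
code_to_category = {
    "gut-checking for reasonableness": "sanity checking",
    "eyeballing a few samples": "sanity checking",
    "comparing outputs to gold standard": "systematic benchmarking",
    "running a public benchmark": "systematic benchmarking",
}

def group_by_category(passages, mapping):
    """Collect the participants behind each category; unmapped codes stay visible."""
    grouped = defaultdict(set)
    for participant, code in passages:
        grouped[mapping.get(code, "uncategorized")].add(participant)
    return dict(grouped)

categories = group_by_category(coded_passages, code_to_category)
for name, participants in sorted(categories.items()):
    print(f"{name}: {sorted(participants)}")
```

Keeping an explicit "uncategorized" bucket is a deliberate choice here: codes that do not yet fit anywhere are often where the next category comes from.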

Throughout this process, you're writing memos—running notes about the concepts you're developing, the connections you're seeing, the questions that arise. Memoing is the bridge between raw data and finished theory. It's where you think on paper, working out what your codes mean and how they relate to each other.

Incidents, Not People

Here's a subtle but important distinction. In most behavioral research, the unit of analysis is the individual person or patient. You're studying people and their characteristics.

In grounded theory, the unit of analysis is the incident—a specific moment, event, or passage where something relevant happens. A single interview might contain dozens of incidents. Across your study, you might analyze hundreds of them.

This shift matters because grounded theory is fundamentally about finding patterns across incidents. When you compare incident to incident, concepts emerge that transcend any individual participant. You're not trying to characterize particular people; you're trying to identify the underlying processes and concerns that appear across many different situations.
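One way to picture the incident as the unit of analysis is to store incidents, not participants, as the rows of your dataset. The sketch below assumes a hypothetical Incident record and invented excerpts; it only shows that a code can then be counted across incidents and across interviews independently.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Incident:
    """One coded moment in the data (hypothetical structure)."""
    interview_id: str
    excerpt: str
    code: str

# Invented excerpts for illustration only.
incidents = [
    Incident("int-01", "I just run it on five prompts and look.", "informal spot check"),
    Incident("int-01", "We rerun the suite before each release.", "release gate"),
    Incident("int-02", "I paste a few inputs and skim the output.", "informal spot check"),
    Incident("int-03", "Nothing ships until the benchmark passes.", "release gate"),
    Incident("int-03", "I try a couple of weird inputs by hand.", "informal spot check"),
]

# How many incidents carry each code, regardless of who said it.
incidents_per_code = Counter(i.code for i in incidents)

# How many distinct interviews each code appears in.
interviews_per_code = {
    code: len({i.interview_id for i in incidents if i.code == code})
    for code in incidents_per_code
}

print(incidents_per_code)
print(interviews_per_code)
```

A code that recurs across many interviews is a candidate for a concept that transcends any single participant, which is the pattern the comparison of incident to incident is meant to surface.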

The goal, ultimately, is to generate concepts that explain how people address their central concerns regardless of specific time and place. These concepts become building blocks for hypotheses. The hypotheses become constituents of theory.

The Core Variable

As analysis progresses, grounded theory researchers look for what's called the core variable: the central concept that explains most of what participants are concerned about. A good core accounts for as much of the variation in the data as possible while relying on as few properties as necessary.

Finding the core variable is something like finding the spine of a book. It's the organizing principle that holds everything else together. All the other categories and concepts relate back to it. Once you've identified a tentative core, you shift into selective coding: you focus your data collection and analysis specifically on material relevant to that core, setting aside codes that don't connect to it.
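Choosing a core variable is an interpretive judgment, not a computation, but a rough heuristic can help you see which categories are most connected. The sketch below is an assumption of this example rather than part of the method: it treats the category linked to the most other categories as a tentative core, then keeps only incidents that touch the core or something linked to it. The category names and incidents are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical: the set of categories observed together in each incident.
incident_categories = [
    {"balancing speed and rigor", "sanity checking"},
    {"balancing speed and rigor", "systematic benchmarking"},
    {"balancing speed and rigor", "deadline pressure"},
    {"sanity checking", "deadline pressure"},
    {"tooling complaints"},
]

def linked_categories(incidents):
    """Map each category to the categories it co-occurs with."""
    links = defaultdict(set)
    for cats in incidents:
        for a, b in combinations(sorted(cats), 2):
            links[a].add(b)
            links[b].add(a)
    return links

links = linked_categories(incident_categories)

# Heuristic (an assumption): the most widely linked category is the tentative core.
tentative_core = max(sorted(links), key=lambda c: len(links[c]))

# Selective coding: keep incidents that touch the core or a category linked to it.
relevant = links[tentative_core] | {tentative_core}
selected = [cats for cats in incident_categories if cats & relevant]

print(tentative_core)
print(len(selected), "of", len(incident_categories), "incidents kept")
```

The isolated "tooling complaints" incident drops out, which mirrors the idea of setting aside codes that do not connect to the core. In practice the analyst would confirm or overrule a heuristic like this by rereading the data.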

One common type of core variable describes what researchers call a "basic social process"—a pattern that accounts for how things change over time within the domain you're studying. This might be something like "learning to live with uncertainty" or "negotiating professional boundaries" or "managing stakeholder expectations." The process unfolds in stages, responds to context, and shapes behavior in predictable ways.

Truth, Fit, and Workability

Grounded theorists don't claim to be searching for "truth" in the way that word is sometimes used in scientific discourse. They're not trying to discover laws of nature or prove hypotheses beyond doubt.

Instead, they're trying to conceptualize what has been happening in the lives of their study participants. The standards for evaluating grounded theory are different from traditional research criteria. Rather than asking about statistical significance or internal validity, grounded theorists ask about fit, relevance, workability, and modifiability.

A theory fits when its concepts connect closely to the incidents it's meant to represent. This fit depends on how thoroughly the researcher has compared incidents to concepts throughout the analysis. A theory is relevant when it addresses the genuine concerns of participants—not just concerns that academics find intellectually interesting, but concerns that actually matter to the people being studied.

A theory works when it explains how participants address their problems and related issues. It should illuminate practice, not just describe it. And a theory is modifiable: when new data contradict existing categories, the theory should be revised rather than defended. Grounded theory is meant to evolve as understanding deepens.

All Is Data

One of the more radical claims of grounded theory methodology is the principle that "all is data." This means exactly what it sounds like. Everything the researcher encounters while studying a phenomenon counts as potential data—not just formal interviews or structured observations, but casual conversations, newspaper articles, television shows, email threads, conference presentations, even overheard discussions at coffee shops.

This expansive view of data sources reflects grounded theory's origins in the study of how people make meaning in their everyday lives. Meaning-making happens everywhere, not just in research settings. If you're trying to understand how software engineers think about testing machine learning systems, you might learn as much from a heated thread on a programming forum as from a formal interview.

The principle also underscores that the researcher's own ideas and experiences are data too. The memos you write, the hunches you follow, the connections you notice—all of this feeds into theory development. Grounded theory doesn't pretend that researchers are neutral recording devices. It acknowledges that conceptualization is a creative, interpretive act.

The Split

After their foundational collaboration, Glaser and Strauss went their separate ways methodologically. They developed different emphases, different techniques, and occasionally quite different views about what grounded theory should be.

Strauss, working with Juliet Corbin, developed a more structured approach with specific coding procedures. They introduced "axial coding"—a systematic way of reassembling data after initial open coding by explicitly mapping out conditions, contexts, strategies, and consequences. Their version of grounded theory provided detailed guidance for each step of analysis.
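As a rough picture of what axial coding produces, the four elements named above can be treated as fields on a record for each category. The AxialRecord class and its missing_dimensions helper are hypothetical names invented for this sketch, not terminology from Strauss and Corbin.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class AxialRecord:
    """One category, reassembled around the four elements named in the text."""
    category: str
    conditions: List[str] = field(default_factory=list)
    contexts: List[str] = field(default_factory=list)
    strategies: List[str] = field(default_factory=list)
    consequences: List[str] = field(default_factory=list)

    def missing_dimensions(self):
        """Return the elements that still have no supporting data."""
        dims = ("conditions", "contexts", "strategies", "consequences")
        return [d for d in dims if not getattr(self, d)]

# Hypothetical example built from the engineering interviews used earlier.
record = AxialRecord(
    category="sanity checking",
    conditions=["tight release deadline"],
    strategies=["run five prompts by hand"],
)

print(record.missing_dimensions())
```

Listing the empty fields is one way to see what to look for in the next round of data collection; it also illustrates the kind of structure Glaser worried could be imposed on data too early.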

Glaser pushed back against what he saw as over-proceduralization. He worried that too much structure would force researchers to impose frameworks rather than letting concepts emerge naturally. His approach emphasized theoretical coding—developing integrative concepts that weave fragmented codes into coherent hypotheses—but remained more open about how researchers should get there.

This split has led to various "schools" of grounded theory, with ongoing debates about which approach is more faithful to the original vision or more useful in practice. For newcomers, the disputes can be confusing. But the core insight remains consistent across camps: theory should emerge from sustained engagement with data, not from armchair speculation.

Why This Matters for Evaluating Language Models

If you're working with large language models, you might wonder why a methodology developed by sociologists studying dying patients has any relevance to your work.

Consider what you're dealing with. Language models are usually sampled non-deterministically, so the same prompt can produce different answers from one run to the next, which makes exact-match software tests inadequate on their own. Their outputs are qualitative: generated text that requires interpretation, not just numerical metrics. And understanding whether they're working requires grasping how users actually experience them, not just measuring abstract performance benchmarks.
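To make the testing problem concrete, here is a minimal sketch that samples a stand-in model several times and compares two kinds of check. The fake_model function is a hypothetical placeholder, not a real API; it simply picks among canned phrasings so the example runs without any external service.

```python
import random

def fake_model(prompt, rng):
    """Stand-in for a sampled model call (hypothetical, not a real API)."""
    templates = [
        "Paris is the capital of France.",
        "The capital of France is Paris.",
        "France's capital city is Paris, on the Seine.",
    ]
    return rng.choice(templates)

def pass_rate(prompt, check, n=20, seed=0):
    """Fraction of n sampled outputs that satisfy a check function."""
    rng = random.Random(seed)
    outputs = [fake_model(prompt, rng) for _ in range(n)]
    return sum(check(out) for out in outputs) / n

prompt = "What is the capital of France?"

# Property-style check: does the answer mention the right city?
property_rate = pass_rate(prompt, lambda out: "Paris" in out)

# Exact-match check: brittle when phrasing varies between samples.
exact_rate = pass_rate(prompt, lambda out: out == "Paris is the capital of France.")

print(property_rate, exact_rate)
```

Every sampled answer is correct, yet the exact-match check fails on most of them. Deciding which properties are worth checking is an interpretive question, and that is where the rest of this section comes in.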

These are exactly the conditions where grounded theory shines.

When you're building evaluations for language model systems, you're essentially trying to develop theories about what "good output" looks like for your particular use case. You might start by collecting examples of model responses. You code them: "accurate but verbose," "concise but missed nuance," "hallucinated a fact," "perfect for the user's apparent need." Categories emerge. You develop a theory about the dimensions that matter most for your application.

The grounded theory mindset encourages you to let your evaluation criteria emerge from actual outputs rather than imposing predetermined rubrics. It reminds you that understanding requires iteration—comparing incident to incident, refining concepts, staying open to surprise. It warns against forcing data into existing frameworks when the data might be telling you something new.
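The loop described above, from coded outputs to candidate evaluation dimensions, might look something like this. The response IDs, the code assignments, and the min_responses threshold are assumptions made for illustration; the code labels themselves come from the examples in the text.

```python
from collections import Counter

# Hypothetical hand-coded model responses: response id -> codes applied to it.
coded_responses = {
    "r1": ["accurate but verbose"],
    "r2": ["concise but missed nuance"],
    "r3": ["hallucinated a fact", "accurate but verbose"],
    "r4": ["perfect for the user's apparent need"],
    "r5": ["accurate but verbose", "hallucinated a fact"],
}

def candidate_dimensions(coded, min_responses=2):
    """Return codes seen in at least min_responses responses, most frequent first."""
    counts = Counter(code for codes in coded.values() for code in set(codes))
    return [code for code, n in counts.most_common() if n >= min_responses]

dimensions = candidate_dimensions(coded_responses)
print(dimensions)
```

Frequency is only a starting signal. A code that appears once, such as a single serious hallucination, may still deserve a place in the rubric, which is why the final call stays with the person reading the outputs.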

The Constant Comparative Method

At the heart of grounded theory practice is what Glaser and Strauss called the constant comparative method. It's deceptively simple: you continuously compare new data to existing concepts, existing concepts to each other, and emerging theories to all the data you've collected.

This comparison happens at every stage. When you code a new incident, you ask: Is this the same as incidents I've already coded, or different? If different, does it require a new code or a modification of an existing one? When you develop a category, you ask: How does this category relate to other categories? Are some more fundamental than others? When you formulate a tentative theory, you ask: Does this theory account for the negative cases—the incidents that don't fit the pattern?

Negative case analysis is particularly important. Rather than ignoring or explaining away data that contradicts your emerging theory, you actively seek out such data. If your theory can't account for the negative cases, it needs revision. This discipline protects against confirmation bias—the natural human tendency to notice evidence that supports what we already believe.
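Negative case analysis can be sketched as a check of a tentative claim against every incident, with the mismatches collected instead of discarded. The incidents and the hypothesis below are hypothetical, chosen to match the speed-versus-rigor example used earlier.

```python
# Hypothetical incidents, each reduced to two observed properties.
incidents = [
    {"id": 1, "deadline_pressure": True, "validation": "informal"},
    {"id": 2, "deadline_pressure": True, "validation": "informal"},
    {"id": 3, "deadline_pressure": False, "validation": "systematic"},
    {"id": 4, "deadline_pressure": True, "validation": "systematic"},
]

def hypothesis(incident):
    """Tentative claim: deadline pressure goes with informal validation."""
    expected = "informal" if incident["deadline_pressure"] else "systematic"
    return incident["validation"] == expected

# Collect the incidents the claim fails to explain.
negative_cases = [i["id"] for i in incidents if not hypothesis(i)]
print(negative_cases)
```

Incident 4 is the interesting one: an engineer under deadline pressure who still validated systematically. Going back to that transcript to ask why is the step that revises the theory.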

Emergence Versus Forcing

The deepest principle in grounded theory is the commitment to emergence over forcing. Concepts should arise from data, not be imposed upon it. Theories should be discovered, not invented and then defended.

This is harder than it sounds. Every researcher brings preconceptions, prior reading, professional training, and personal experiences to their work. The temptation to see patterns that match existing frameworks is constant. Grounded theory doesn't pretend these influences can be eliminated, but it provides techniques for managing them—delayed literature review, rigorous memoing, constant comparison, negative case analysis.

The goal isn't to approach data with an empty mind, which is impossible anyway. The goal is to remain genuinely open to being surprised by what the data reveal. The best grounded theories are ones that make the researcher think, "I never would have predicted this, but now that I see it, it makes perfect sense."

Beyond Either/Or

Grounded theory is sometimes positioned as the opposite of quantitative research, but this framing is too simple. Grounded theory can incorporate quantitative data when relevant. More importantly, grounded theory and traditional hypothesis-testing research serve different purposes that can complement each other.

Grounded theory excels at generating hypotheses and building theoretical frameworks. Traditional research excels at testing whether specific hypotheses hold under controlled conditions. A mature research program might use grounded theory to develop theories about a phenomenon and then use quantitative methods to test predictions derived from those theories.

The key is matching method to question. When you don't yet know what the important variables are, grounded theory helps you find them. When you have specific predictions to test, other methods may be more appropriate. The worst mistake is applying a method that doesn't fit your actual situation—forcing deductive testing on questions that require inductive exploration, or vice versa.

Learning to See

Perhaps the deepest lesson from grounded theory is that understanding takes time and patience. You can't rush emergence. You have to sit with data, compare incidents repeatedly, let patterns reveal themselves gradually. The methodology rewards sustained attention over quick conclusions.

For those of us accustomed to the pace of software development—ship fast, iterate quickly, move on to the next thing—this can feel uncomfortable. But there's wisdom in the slowness. Complex phenomena don't yield their secrets to impatient observation. Sometimes you have to watch people die in hospitals for months before you understand what's really happening.

Glaser and Strauss gave us a rigorous framework for this kind of patient inquiry. They showed that careful qualitative analysis can be just as systematic as statistical testing, just as productive of genuine knowledge. And they demonstrated that the most powerful theories often come not from abstract speculation but from sustained, humble attention to what's actually going on.

That's a lesson worth remembering, whether you're studying dying patients, software engineers, or the curious outputs of large language models.