Koch's Postulates Are a Causality Protocol

A physician in 1884 held himself to a higher standard of causal proof than most engineering teams do today — and the bugs that break his postulates each name the tool to switch to.

Published June 2026 · 10 min read

In the 1880s, a German country doctor turned bacteriologist named Robert Koch wrote down four rules you had to satisfy before you were allowed to claim that a particular germ caused a particular disease. Not correlated with it. Caused it. The rules were demanding on purpose, and clearing all four was real work — you had to find the microbe in every sick subject and no healthy one, grow it by itself in a pure culture, use that pure culture to make a healthy host sick, and then recover the same microbe from your new patient.

Now picture a modern incident review. A service fell over at 14:05. Someone notices a config change went out at 14:03, or spots an alarming line in the logs, or remembers that the database has “always been flaky.” That suspect gets written into the postmortem as the root cause, a fix ships, everyone goes back to work. Zero of Koch's four steps were performed. The suspect was present. It was never isolated, reproduced, or re-confirmed.

Sit with the embarrassing comparison: a physician in 1884, working with petri dishes and lab animals and no idea that DNA existed, held himself to a higher standard of causal proof than most engineering teams hold themselves to today, with distributed tracing and infinite logs and a debugger that can pause the universe. Koch's postulates are a causality protocol. They are worth stealing wholesale. And (this is the part nobody tells you) the famous exceptions that eventually broke the postulates are even more useful than the postulates themselves. Each broken case is a precise diagnostic for a specific kind of bug: one that will defeat your normal debugging and, if you listen, tell you which tools to switch to.

The four-step bar for “this caused that”

Koch and his colleague Friedrich Loeffler crystallized the criteria around 1884, on the back of Koch's own triumphs: he had nailed the tuberculosis bacillus in 1882 and would do the same for cholera shortly after. Before this, the cause of a disease was a matter of argument and authority. After it, you had a test. The four postulates, in plain form:

1. Present in every case. The microbe must be found in every organism suffering the disease, and absent from the healthy.
2. Isolated in pure culture. You must extract it and grow it on its own, away from everything else in the host.
3. Reproduces the disease. That pure culture, introduced into a healthy host, must produce the disease.
4. Re-isolated. You must recover the same microbe from the freshly infected host.

What makes this a protocol rather than a vibe is the order and the burden. Step 1 rules out coincidence. Step 2 rules out “it was tangled up with something else that did the real work.” Step 3 is the killer — it demands you cause the disease on purpose, which is the difference between “found at the scene” and “pulled the trigger.” Step 4 closes the loop so you know it was the same agent the whole way through.

Translate that into debugging and you get a root-cause checklist with actual teeth:

1. Present in every case: Is your suspect there in every instance of the failure, and absent from healthy runs? If the service sometimes falls over without the config change, or the config change sometimes ships with no incident, you have a correlate, not a cause.
2. Isolated: Can you reproduce the failure with only the suspect in play, in a clean environment? If you can't make it happen in isolation, you haven't shown it's sufficient.
3. Reproduces: Does introducing the suspect into a healthy system actually produce the failure? Flip it on deliberately and watch.
4. Re-isolated: When it fails on your reproduction, is it failing by the same mechanism you diagnosed? If a different path produces the same symptom, your first diagnosis was wrong.
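
The first check on that list is the easiest to mechanize. What follows is a minimal sketch, assuming you can label past runs as failed or healthy and record whether the suspect was present in each. The `Run` record, the `presence_check` function, and the sample run names are all hypothetical, invented here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One observed run of the system (hypothetical record shape)."""
    run_id: str
    failed: bool           # did the failure occur in this run?
    suspect_present: bool  # was the suspected cause present?


def presence_check(runs):
    """Check 1: suspect present in every failure, absent from healthy runs.

    Returns the counterexamples. Passing this check does not prove
    causation; it only means the suspect survives the first question.
    """
    failures_without_suspect = [r.run_id for r in runs
                                if r.failed and not r.suspect_present]
    healthy_with_suspect = [r.run_id for r in runs
                            if not r.failed and r.suspect_present]
    return {
        "failures_without_suspect": failures_without_suspect,
        "healthy_with_suspect": healthy_with_suspect,
        "passes": not failures_without_suspect and not healthy_with_suspect,
    }


# Made-up history: the config change is the suspect.
runs = [
    Run("14:05-incident", failed=True, suspect_present=True),
    Run("last-tuesday", failed=True, suspect_present=False),  # failed without it
    Run("canary-3", failed=False, suspect_present=True),      # shipped, no incident
    Run("baseline", failed=False, suspect_present=False),
]
result = presence_check(runs)
print(result)
```

With this sample history the suspect fails the check in both directions, which is exactly the "correlate, not a cause" situation described above.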

Most “root causes” in most postmortems clear, generously, one and a half of these. They establish presence and stop. That's finding a bacterium near a sick person and hanging it for murder.

The exceptions are the interesting part

Here's the twist that makes Koch's postulates more useful for engineers, not less: they don't always work, the cases where they fail are well documented, and Koch was honest enough to be the first to break them. Each failure mode maps cleanly onto a category of bug that will wreck a naive root-cause hunt.

The asymptomatic carrier — agent present, host fine. Koch himself punctured his own first postulate when he found people carrying cholera, and later typhoid, who were perfectly healthy. The microbe was present; the disease was absent. The iconic case is Mary Mallon — “Typhoid Mary” — a New York cook in the early 1900s who infected dozens of people with typhoid while never being sick a day from it herself. The carrier breaks the clean “present implies guilty” logic completely.

The debugging twin is the latent contributor: code that has been quietly wrong for months or years, sitting in production asymptomatic, until a particular load pattern, race window, or config flip finally makes it express the disease. The operational lesson is sharp and counterintuitive: “that code hasn't changed in a year” is not an alibi. Asymptomatic carriers are real. The line that finally killed you may have been carrying the pathogen the whole time, waiting for the conditions that turned it symptomatic.

The unculturable agent: can't isolate it at all. Koch's second postulate assumes you can grow the thing in a pure culture. Viruses laughed at that; they refuse to grow on their own and need living host cells, which is why early virologists could only describe a “contagious living fluid” they couldn't pin down. Even some bacteria, the agents of syphilis and leprosy among them, have historically resisted routine culture in a dish. When you cannot isolate the agent, postulate 2 is simply unreachable.

The twin here is the heisenbug and its cousin, the production-only failure — the bug that vanishes the instant you try to isolate it. You attach a debugger and the race condition evaporates because you changed the timing. You add logging and the corruption stops because you nudged memory layout. You pull the code into staging and it works flawlessly, because staging lacks whatever production has. The bug cannot be cultured in a pure environment, and every attempt to isolate it destroys the very conditions it needs to live.

The polymicrobial infection — no single agent is sufficient. Some diseases are caused by a consortium acting together, where no member alone reproduces the illness. Run Koch's postulates on any single organism and each one comes back innocent, because individually each one is.

The twin is the emergent failure: no component is broken on its own, the bug lives in the interaction — service A's perfectly reasonable retry policy meeting service B's perfectly reasonable rate limit, producing a cascade neither team can find by examining their own code. Single-suspect debugging will clear every component and still leave you with a dead system.
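
A toy model makes the interaction concrete. This is a sketch under loud assumptions: time moves in abstract ticks, service A retries every rejected request on the next tick, service B rejects anything over a fixed per-tick limit, and every number is invented. The `simulate` function is hypothetical and models no real retry library or rate limiter.

```python
def simulate(load, rate_limit=None, retry=False):
    """Return the number of rejected requests on each tick.

    load       -- new requests arriving per tick
    rate_limit -- service B's per-tick cap (None means no limit)
    retry      -- whether service A retries rejections on the next tick
    """
    rejected_per_tick = []
    carried = 0  # rejected requests waiting to be retried
    for new_requests in load:
        offered = new_requests + carried
        accepted = offered if rate_limit is None else min(offered, rate_limit)
        rejected = offered - accepted
        rejected_per_tick.append(rejected)
        carried = rejected if retry else 0
    return rejected_per_tick


def ticks_with_rejections(rejected_per_tick):
    return sum(1 for r in rejected_per_tick if r > 0)


# Steady load of 90 per tick with a single one-tick spike to 200.
load = [90] * 3 + [200] + [90] * 16

retry_only = simulate(load, rate_limit=None, retry=True)
limit_only = simulate(load, rate_limit=100, retry=False)
both = simulate(load, rate_limit=100, retry=True)

print("retry only:", ticks_with_rejections(retry_only))  # 0
print("limit only:", ticks_with_rejections(limit_only))  # 1
print("both:      ", ticks_with_rejections(both))        # 10
```

Each policy examined alone looks healthy: retries with no limit reject nothing, and the limit with no retries rejects only during the spike itself. Together, the retried backlog competes with fresh traffic and stretches a one-tick spike into ten ticks of rejections. Neither function argument is "the bug"; the combination is.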

The prion — not even a microbe. And then there's the category error. Prions, the misfolded proteins behind mad-cow disease, cause infection with no organism and no nucleic acid at all; they break every postulate because the entire framework assumed the wrong kind of culprit. The debugging twin is the most humbling incident of all: you are deep in the application code, and the problem was never code. It was configuration, or a data corruption, or a clock skew, or a dependency three layers down. You will never satisfy any postulate about your code, because your code is not the pathogen.

Don't lower the bar — switch the tools

Now the genuinely important move, the one that turns this from a clever analogy into operating advice. When the postulates broke, medicine did not shrug and lower its standard for causality. It kept the standard and changed its instruments.

Unculturable pathogens forced the field to molecular methods. In 1988 Stanley Falkow proposed molecular Koch's postulates: you no longer have to grow the organism in a pure culture. Instead you identify the specific gene responsible for virulence, then show that disabling it reduces virulence and restoring it brings the disease back. Sequence-based detection let researchers finger pathogens they had never once grown in a dish. The bar for causality stayed exactly as high. The tools moved from the culture plate to the genome.

That is the whole playbook for your hardest bugs. When a failure breaks reproduce-in-isolation, the failure to isolate is not a dead end — it is the diagnostic. “I can't reproduce it in staging” is information: it tells you the bug depends on something staging doesn't have — real traffic shape, real data volume, real concurrency, a real clock, a real downstream. So you stop trying to culture it in a dish and you study it in the living host:

Distributed tracing is your molecular method — the way to follow a single failing request across a dozen services and find the responsible path without ever isolating the organism from the body. When you can't pull the bug out, instrument it where it lives.
Production debugging and observability are in-situ study of the pathogen in its host, because the host is the only place it expresses the disease.
Chaos engineering is postulate 3 done deliberately and safely: introduce the suspected cause into a healthy production system — under a controlled blast radius — and see if the failure appears. It is the closest thing software has to infecting a healthy host on purpose to confirm causation.
Canary deploys are controlled exposure of a small population, the field trial that lets you observe an effect under real conditions without endangering everyone.
And for the polymicrobial, emergent failures, you reach for population-level reasoning — the epidemiologist's Bradford Hill criteria, which were built precisely for causation when no single factor is sufficient and you must weigh strength, consistency, and dose-response across many weak signals rather than demand one clean isolation.
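
Deliberate introduction, plus the dose-response idea from the Bradford Hill criteria, can be sketched as a small experiment. This is an in-process simulation under stated assumptions, not a recipe for any real chaos tool: `handle_request` is a hypothetical stand-in for the system under test, the suspected cause is assumed to be extra upstream latency, and the latency ranges and timeout are invented.

```python
import random


def handle_request(upstream_latency_ms, timeout_ms=250):
    """Hypothetical system under test: succeeds unless the upstream is too slow."""
    return upstream_latency_ms <= timeout_ms


def failure_rate(n_requests, injected_latency_ms, rng):
    """Run one arm of the experiment and return the fraction of failed requests."""
    failures = 0
    for _ in range(n_requests):
        healthy_latency = rng.uniform(20, 120)  # assumed healthy baseline
        if not handle_request(healthy_latency + injected_latency_ms):
            failures += 1
    return failures / n_requests


rng = random.Random(42)  # seeded so the experiment is repeatable
doses_ms = [0, 100, 200, 300]  # 0 ms is the control arm
rates = [failure_rate(1000, dose, rng) for dose in doses_ms]

for dose, rate in zip(doses_ms, rates):
    print(f"injected {dose:>3} ms -> failure rate {rate:.2f}")
```

The control arm stays healthy, the largest dose fails every request, and the rate never falls as the dose rises. That is the shape of evidence postulate 3 asks for. In a real system the same experiment would run against a small, bounded slice of traffic rather than a loop in one process.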

The pattern is identical every time: the bug that defeats isolation is telling you which tool to pick up next.

What to actually do with this

If you run systems, two habits fall straight out of this, and they cost nothing but discipline.

First, make Koch's four steps the literal bar for the words “root cause.” Before that phrase goes in a postmortem, ask the four questions out loud: Is the suspect present in every occurrence and absent from healthy runs? Can I reproduce the failure with this cause alone? Does introducing it into a healthy system produce the failure? Does it fail by the same mechanism when I reproduce it? If you can't answer yes to all four, you are allowed to write “leading suspect” — you are not yet allowed to write “root cause.” That single rule will kill a huge fraction of the confident, wrong diagnoses that send teams chasing fixes for things that were merely present at the scene.

Second, treat every broken postulate as a routing decision, not a frustration. Can't reproduce it in staging? It's unculturable — switch to tracing and production observability; stop trying to grow it in a dish. Vanishes when you watch it? Heisenbug — change your observation method so you stop disturbing the timing. No single component broken? Polymicrobial — go systems-level and look at interactions, not units. Nothing in the code satisfies any postulate? Prion — you're in the wrong category; go look at config, data, and infrastructure. Each failure mode names its own next tool.
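
Both habits are small enough to write down as code. The sketch below is one possible encoding, assuming a postmortem template that can carry four yes/no answers. The `Evidence` class, the `label` function, and the `NEXT_TOOL` table are hypothetical names; the table only paraphrases the routing described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Evidence:
    """Answers to the four questions; None means 'not yet tested'."""
    present_in_every_case: Optional[bool] = None
    reproduced_in_isolation: Optional[bool] = None
    introduced_and_failed: Optional[bool] = None
    same_mechanism_confirmed: Optional[bool] = None


def label(evidence):
    """Only four explicit yeses earn the words 'root cause'."""
    answers = [
        evidence.present_in_every_case,
        evidence.reproduced_in_isolation,
        evidence.introduced_and_failed,
        evidence.same_mechanism_confirmed,
    ]
    return "root cause" if all(a is True for a in answers) else "leading suspect"


# Broken postulate -> next tool to pick up.
NEXT_TOOL = {
    "cannot reproduce in staging": "tracing and production observability",
    "vanishes when observed": "an observation method that does not disturb timing",
    "no single component broken": "systems-level analysis of interactions",
    "nothing in the code qualifies": "config, data, and infrastructure",
}

# A suspect that was merely present at the scene.
config_change = Evidence(present_in_every_case=True)
print(label(config_change))                       # leading suspect
print(NEXT_TOOL["cannot reproduce in staging"])   # tracing and production observability
```

Treating an untested question as a "no" is the design choice that matters here: a postulate nobody checked does not count in the suspect's favor.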

There's a companion principle worth stating in the same breath, because the two together are the whole of good incident practice: mitigate first, confirm later. The moment users are harmed, act on the pattern you recognize (roll back, flip the flag) to stop the bleeding, and save the rigorous causal proof for afterward. This essay is about that “afterward.” Acting fast on a correlation is correct when the building is on fire. But once the fire is out, do not let the provisional suspect from the heat of the incident quietly become the official root cause without earning it. A bacteriologist in 1884 would not have let it. Hold your causal claims to Koch's bar, and when a bug refuses to clear that bar, don't lower it. That's the bug telling you to change tools.

Sources

Robert Koch & Friedrich Loeffler formulated the postulates around 1884 (building on Jakob Henle; refined/published by Koch ~1890); Loeffler added the fourth (re-isolation) postulate. Koch identified the tuberculosis bacillus in 1882.
The asymptomatic carrier: Mary Mallon (“Typhoid Mary”), a cook believed to have infected up to ~57 people with typhoid (three confirmed deaths) while never falling ill — the first identified asymptomatic carrier of Salmonella Typhi in the U.S.
Unculturable agents: viruses require living host cells; the agents of syphilis (Treponema pallidum) and leprosy (Mycobacterium leprae) have historically resisted routine in-vitro culture.
Prions (misfolded proteins, no nucleic acid) behind mad-cow disease (BSE) break every postulate — the framework assumed a microbial culprit.
Stanley Falkow, “Molecular Koch's postulates applied to microbial pathogenicity” (1988): identify the virulence gene; disabling it reduces virulence, restoring it returns the disease — the same causal bar, new instruments.
Austin Bradford Hill's criteria (1965) for population-level causal inference when no single factor is sufficient.

Clearing Koch's bar requires a record of what actually happened.

“Present in every case,” “reproduces,” “re-isolated by the same mechanism” are all claims you can only make against a faithful record of events — which is exactly why the essay reaches for distributed tracing as the molecular method. The same gap shows up the moment an AI agent is the suspect in an incident: if all you have is its after-the-fact summary, you can't tell the cause from the thing that was merely present at 14:03. Chain of Consciousness anchors every agent action to a verifiable external record — the trace that lets “root cause” actually clear the bar instead of hanging the nearest bystander.

pip install chain-of-consciousness · npm install chain-of-consciousness
