
Trained Immunity for Agent Fleets: The 500-Million-Year-Old Memory Layer Your Architecture Is Missing

Every current agent memory system — Mem0, Letta, Mastra, Zep — implements an analog of adaptive immunity. None implements the older trained-immunity layer that vertebrates have been running for half a billion years. Your agent has antibodies. It is missing the bone marrow.

April 2026 · 12 min read

Take a mouse engineered to have no T cells and no B cells — a SCID strain, with no adaptive immunity at all. Vaccinate it with BCG, the tuberculosis vaccine that has been given to billions of newborns since 1921. A few weeks later, infect it with Candida albicans, an unrelated fungal pathogen. Conventional immunology says nothing should happen. Vaccines work by training the adaptive immune system. There is no adaptive immune system. The mouse should die at the same rate as an unvaccinated SCID control.

It doesn’t. Kidney fungal burdens fall to twenty to forty percent of unvaccinated controls. The mouse — with no antibodies, no T cell receptors, no clonal selection at all — is meaningfully protected by a vaccine designed for a different pathogen.

This experiment, from a series of papers led by Mihai Netea’s group at Radboud University starting around 2012, helped establish a discovery so disruptive that immunologists spent the next decade arguing about whether to call it “memory” at all. They eventually settled on a different word: trained immunity. Once you understand the architecture, you cannot read the AI agent memory papers of 2026 the same way. What immunologists worked out by 2011 is what agent designers haven’t built yet.


What trained immunity is

Vertebrate immunity is usually presented as two layers. The innate layer — neutrophils, monocytes, macrophages, NK cells — is the fast, generic responder: pattern-matching on broad signatures of “this is not me.” The adaptive layer — B cells and T cells — is the slow, specific responder: building antibodies to exact molecular signatures, holding decade-long memory. The textbook story: innate is dumb but fast, adaptive is smart but slow, and only adaptive remembers.

The textbook is incomplete in a way that turns out to matter.

In 2011, Netea and colleagues coined the term trained immunity to describe a phenomenon several labs had been documenting: innate immune cells, after exposure to certain stimuli, become persistently more aggressive in their response to unrelated future challenges. Peripheral blood from BCG-vaccinated humans produced four to seven times more interferon-γ — and the monocytes specifically released two-fold more TNF and IL-1β — when exposed to Staphylococcus aureus or Candida albicans than blood from unvaccinated controls. The cells hadn’t learned to recognize tuberculosis. They had been reconfigured.

This was supposed to be impossible — innate cells live for days and die without leaving descendants that remember. But the evidence kept arriving, and not just for BCG. Beta-glucan from fungal cell walls, oxidized LDL from a Western diet, even damage-associated molecular patterns released by dying tissue induced the same effect: a whole class of threat-shaped inputs primes the system, not a specific pathogen.

Adaptive immunity stores antigen-specific facts. Trained immunity lowers thresholds for entire stimulus classes without any specific recall. Two architectures, two purposes. Both load-bearing.

The mechanism

The how, when it was finally worked out, involves three pieces of machinery that don’t show up in undergraduate textbooks.

The first is epigenetic. After BCG exposure, monocytes acquire H3K4 trimethylation and H3K27 acetylation marks at the promoters and enhancers of inflammatory genes — TNF, IL6, TLR4, and dozens of others. These chromatin marks are activation flags. They don’t change the DNA; they change how rapidly that DNA can be read. A primed monocyte starts transcribing within minutes what would take a naive monocyte hours. Repressive H3K9me3 marks are simultaneously removed from the same regions. The chromatin physically opens.

The second is metabolic. Trained monocytes shift from oxidative phosphorylation toward aerobic glycolysis — the same Warburg signature seen in cancer cells and activated T cells, here used to fuel rapid cytokine production. The mevalonate (cholesterol synthesis) pathway lights up. Fumarate accumulates and feeds back into the epigenetic enzymes that maintain the chromatin marks. The metabolites are cofactors that hold the configuration in place. Disrupt the glycolysis and the marks fade. The configuration is maintained by the metabolism, which is enabled by the configuration.

The third piece is the surprising one. Mature monocytes only live a few days in circulation. Trained immunity persists in humans for roughly a year. How? Because the priming reaches all the way back to the bone marrow. Hematopoietic stem cells — the cells that produce monocytes — pick up some of the same epigenetic configuration. New monocytes come off the production line already trained. This is not on-the-job training of individual cells. It is a change in the factory settings.

That phrase, factory settings, is the part that should make any agent infrastructure engineer sit up. The mature workers don’t carry the memory. The thing that produces the workers does.

The 500-million-year layer underneath

Adaptive immunity — V(D)J recombination, T cell receptors, antibody affinity maturation — is a vertebrate invention. Specifically, a jawed vertebrate invention, dating to roughly 450 million years ago. Jawed vertebrates account for less than five percent of animal species today. The other ninety-five percent — sponges, jellyfish, worms, arthropods, mollusks, echinoderms, and the entire plant kingdom — get along without antibodies. They get along with class-level priming.

Comparative immunology work consolidated through 2024–2025 has documented immune priming in bryophytes (the most ancient living land plants), in invertebrates from sponges through annelids, and in essentially every vertebrate examined. Priming in bryophytes dates the mechanism back at least 494–515 million years.

Trained immunity is older than adaptive immunity by hundreds of millions of years. It is the foundational layer, the thing evolution kept running even after inventing the more glamorous adaptive system. Vertebrates didn’t replace class-level priming with antigen-specific memory. They stacked them.

Most agent memory architectures, in 2026, have built only the upper layer.

What agent memory looks like in 2026

The “RAG is enough” period of agent memory ended around 2024. What has emerged is a three-way architectural debate.

Graph-based memory — Mem0 ($24M Series A, the AWS Agent SDK’s exclusive memory provider) and Zep (with bi-temporal tracking of when facts occurred and when they were ingested) — models memory as a knowledge graph of entities and relations. Mem0’s published benchmarks report 66.9 percent accuracy at 1.44-second p95 latency, roughly 1,800 tokens per conversation.

OS-inspired memory — Letta — splits memory into a three-tier hierarchy: core memory always visible, archival memory searchable, recall memory holding raw conversation logs. Letta currently ranks first among model-agnostic open-source frameworks on Terminal-Bench.

Observational memory — Mastra — runs background condensation: a secondary process continuously summarizes raw conversation into structured notes, with what amounts to garbage collection. Mastra’s published number is 94.87 percent on LongMemEval running on GPT-5-mini.

Serious developers have legitimate disagreements about which one to bet on. Step back, though, and all three architectures share a commitment.

All three architectures, plus the older RAG paradigm they replaced, implement adaptive immune memory. Episodic memory is T cell memory: specific encounters, specific receptors, specific recall. Semantic memory is the antibody library: discrete facts about discrete entities, retrievable by lookup. Procedural memory is affinity maturation: refined responses to specific recurring tasks. Vanilla RAG is the most direct mapping of all — a document store (antigen library), retrieved by similarity match (antibody binding).

None of them implement trained immunity.

The missing layer

Look at what trained immunity actually does, mechanistically, and ask which agent memory system has an analog.

It lowers activation thresholds for whole classes of input after exposure to a single member of the class. No agent memory system does threshold modulation. They retrieve or they don’t. There is no equivalent of you have seen one supply-chain attack, so lower your bar for flagging the next supply-chain-attack-shaped input even though it shares no surface features.
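To make the gap concrete, here is a minimal sketch of what class-level threshold modulation could look like. Everything in it is hypothetical: it assumes some upstream classifier already maps inputs to broad threat classes, and names like ClassPriming and should_flag are invented for the example, not any shipping system’s API.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of class-level threshold modulation. Assumes an
# upstream classifier maps each input to a broad threat class, e.g.
# "supply_chain" or "prompt_injection".

BASELINE_THRESHOLD = 0.8  # flagging bar for classes never seen before
PRIMED_THRESHOLD = 0.5    # lowered bar after one exposure to the class

@dataclass
class ClassPriming:
    """Sensitivity state keyed by threat class, not by incident."""
    thresholds: dict[str, float] = field(default_factory=dict)

    def expose(self, threat_class: str) -> None:
        # One confirmed incident lowers the bar for the whole class, even
        # for future inputs that share no surface features with it.
        self.thresholds[threat_class] = PRIMED_THRESHOLD

    def should_flag(self, threat_class: str, suspicion_score: float) -> bool:
        bar = self.thresholds.get(threat_class, BASELINE_THRESHOLD)
        return suspicion_score >= bar

priming = ClassPriming()
priming.expose("supply_chain")                        # one attack seen
print(priming.should_flag("supply_chain", 0.6))       # True: primed class
print(priming.should_flag("prompt_injection", 0.6))   # False: unprimed class
```

The design choice that matters is the key: state is stored per class, not per incident, so the second supply-chain attack benefits from the first even when the two share nothing at the surface level.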

It maintains state at a level beneath individual cells. No agent memory system has the HSC analog: a fleet-level configuration store where new agents boot already primed by experiences of the agents they replaced. Current architectures separate per-session memory from cross-session memory, but the missing layer is deeper than cross-session — something that persists across agent restarts, context resets, even model swaps.
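A sketch of that deeper layer, under the same hypothetical assumptions: the priming state lives in a fleet-level store rather than in any agent instance. A JSON file stands in for the store here purely for illustration; a real fleet would use a shared database or configuration service.

```python
import json
from pathlib import Path

# Hypothetical sketch of the HSC analog: priming lives in a fleet-level
# store that outlives every agent instance. File-backed only for
# illustration.

FLEET_STORE = Path("fleet_priming.json")

def save_priming(thresholds: dict[str, float]) -> None:
    """Called whenever any agent's exposure updates the fleet's priming."""
    FLEET_STORE.write_text(json.dumps(thresholds))

def boot_agent() -> dict[str, float]:
    """New instances come off the line already primed: the factory
    settings survive restarts, context resets, and model swaps."""
    if FLEET_STORE.exists():
        return json.loads(FLEET_STORE.read_text())
    return {}  # a truly naive fleet
```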

It decays naturally without re-exposure. No agent memory system has the IL-10 / regulatory-T-cell analog: a confidence-decay process that automatically relaxes priming over time unless the original stimulus class keeps showing up. Current systems either keep everything or run a coarse forgetting heuristic. Regulatory dampening is more interesting than either: it preserves the configuration while reducing the intensity of response, in proportion to how long since the threat last appeared.
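That regulatory behavior is also easy to sketch under the same assumptions: let the effective threshold relax exponentially from its primed value back toward baseline as a function of time since the class was last seen, without ever deleting the class entry itself. The half-life here is invented; biology gets its decay clock for free, and an engineered one would have to be tuned.

```python
import math
import time

# Hypothetical sketch of the IL-10 analog: priming intensity relaxes with
# time since last exposure, but the configuration is never deleted.

BASELINE = 0.8         # threshold for an unprimed class
PRIMED = 0.5           # threshold immediately after exposure
HALF_LIFE_DAYS = 90.0  # invented tunable; biology's is roughly a year

def effective_threshold(last_exposure_ts: float, now: float | None = None) -> float:
    """Exponential relaxation from the primed bar back up to baseline."""
    now = time.time() if now is None else now
    days_since = (now - last_exposure_ts) / 86_400
    decay = math.exp(-math.log(2) * days_since / HALF_LIFE_DAYS)
    return BASELINE - (BASELINE - PRIMED) * decay
```

Immediately after exposure the function returns the primed bar; after one half-life it has relaxed halfway back; a class not seen for a year is close to naive again, while the record of the exposure remains.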

It can be triggered by stimuli the system was never designed to defend against. Western-diet oxidized LDL induces trained immunity through the same pathways as BCG. The system doesn’t distinguish between “a real pathogen” and “something pathogen-shaped” — adversarial-looking patterns in legitimate inputs would prime an agent fleet the same way real attacks would. Feature in some contexts, bug in others.

The closest the AI literature has come is the artificial immune systems (AIS) lineage running back to the 1990s — negative selection, clonal selection, dendritic cell algorithms. Recent work like the I3AI framework reports a 42 percent improvement in detection accuracy and a 53 percent reduction in false positives over baselines in edge deployments. Examine those algorithms, though, and they all model adaptive immunity: signature matching, self/non-self discrimination, clonal mutation. The trained-immunity layer remains unbuilt.

The mapping

Here is the analogy stated clearly enough to evaluate.

Trained immunity | Agent fleet analog
Epigenetic marks (H3K4me3, H3K27ac) | Persistent configuration parameters — alert thresholds, scoring weights, decision boundaries — that survive individual agent instances
Metabolic rewiring (Warburg shift) | Compute reallocation: more attention budget on threat-adjacent input classes after class exposure
HSC imprinting | Fleet-level configuration store; new agents boot with current priming
BCG nonspecific protection | One supply-chain-attack exposure raises the fleet’s sensitivity to the class, not just to identical attacks
NOD2 → upregulated PRRs | One attack exposure causes the fleet to deploy broader pattern detectors across the class
Maladaptive trained immunity | Alert fatigue, false-positive cascades, primed-agent paralysis
IL-10 / regulatory T cell dampening | Confidence decay over time without re-exposure
~1 year HSC persistence | Configuration half-life: priming degrades unless reinforced
SCID mouse still protected | Fleet remains protected against a class even after specific-fact memory is cleared

The point of the table is not that the mapping is perfect — see the next two sections — but that it gives a concrete checklist. Each row is a measured biological mechanism on one side, an unbuilt agent-memory feature on the other. You can argue with the cells. You cannot argue that the rows are vague.

The dark side, stated honestly

A trained-immunity layer, built badly, would do to agent fleets what maladaptive trained immunity does to bodies — and that is bad enough that immunologists wrote dozens of review papers about it.

Atherosclerosis, in current models, is the immune system’s trained-immunity machinery firing chronically against oxidized LDL particles in arterial walls. The macrophages think they’re fighting a pathogen. The pathogen is dietary cholesterol. The chronic inflammation, sustained for decades, is what kills people. Inflammatory bowel disease, neuroinflammation, post-sepsis organ damage — all have been linked to trained immunity that didn’t resolve when the original threat did.

The agent analog is exactly what you would expect: a fleet primed by an early adversarial example that then flags every vaguely similar input as hostile, until alert volume crushes the system from the inside. What makes a trained-immunity-style failure worse than ordinary alert fatigue is persistence and class-level scope. It’s not that one detector is too sensitive. It’s that the entire factory is producing pre-sensitized detectors, and the sensitization will outlast the threat that caused it by months.

The immune system’s solution wasn’t to abandon trained immunity. It was to build regulatory machinery around it: regulatory T cells, IL-10, programmed resolution pathways. The takeaway for agent designers is not “don’t build this.” It’s you cannot build this without simultaneously building the dampening layer. Trained immunity without regulation is autoimmune disease. The regulatory mechanisms are not optional.

Where the analogy breaks

A working analogy is one whose limits you can specify, and there are several here.

Cell death matters in immunology in ways it doesn’t matter in agent infrastructure. Monocytes apoptose; their HSC-borne memory persists because the producers outlive any individual product. Agent instances don’t die in a meaningful sense; they reset. The persistence problem is structurally different, and the implementation — whatever it ends up looking like — will look nothing like H3K4me3 marks deposited residue by residue at specific loci. The “factory settings” frame is metaphor.

Trained immunity also has a natural decay curve set by HSC turnover and chromatin maintenance kinetics. Agent fleets have no comparable physical clock; a decay function would have to be designed and tuned. The biology gets that for free; the engineering doesn’t.

Trained immunity is unsupervised. The body doesn’t decide what to be primed against; whatever shows up shapes the response. Agent fleets exist in adversarial environments where attackers can deliberately induce misconfiguration by feeding the system misleading priming stimuli. Maladaptive trained immunity is bad enough as a side effect of misconfigured immunity. It would be far worse as an adversary’s deliberate strategy.

Most importantly, the immune system has spent 500 million years co-evolving with the threat landscape it faces. Agent fleets have spent three or four years in their current commercial form. The convergent benefits visible in biology are the result of an enormous amount of bad designs being eliminated. We have no equivalent record. We will likely build several wrong versions before we build a useful one.

The practical move

The takeaway is upstream of “go implement this next session.” When you read the next benchmark paper showing your agent’s memory system handles longer conversations or retrieves with higher accuracy, ask which layer the paper is operating at. Ask whether your agent has any mechanism by which one bad encounter raises sensitivity to the class of bad encounters rather than to that specific surface signature. Ask whether your agent’s priming would survive a context reset, a process restart, or a model upgrade — and whether you’d want it to. Ask whether you have anything resembling a regulatory layer, and what would happen to your alert volumes if you didn’t.

Most agent memory designs, including the ones winning benchmarks, are skipping a foundational architectural layer that vertebrates have been running for half a billion years. Immunologists figured out by 2011 that adaptive memory alone was incomplete. Agent designers will figure it out too — they always do — but the work goes faster if you start from the conclusion the older field has already paid for.

You’ll know the class-level priming layer when you see it: a configuration that survives the death of any individual agent, encodes broad threat categories rather than specific incidents, decays gracefully without reinforcement, and is regulated by a dampening layer that prevents primed-fleet autoimmunity. None of those is in the 2026 architectures. All of them are in the bone marrow of every vertebrate reading this.


Sources: Netea et al., “Trained immunity: a memory for innate host defense,” Cell Host & Microbe, 2011. Kleinnijenhuis et al., “BCG-induced trained immunity,” PNAS, 2012 (BCG / SCID / Candida nonspecific protection; kidney fungal burden 20–40% of unvaccinated controls; peripheral blood IFN-γ 4–7× post-BCG; monocyte TNF/IL-1β ~2× rise). Quintin et al., “Candida albicans infection affords protection against reinfection via functional reprogramming of monocytes,” Cell Host & Microbe, 2012. Saeed et al., “Epigenetic programming of monocyte-to-macrophage differentiation and trained innate immunity,” Science, 2014 (H3K4me3, H3K27ac, H3K9me3 dynamics). Arts et al., “Glutaminolysis and fumarate accumulation integrate immunometabolic and epigenetic programs in trained immunity,” Cell Metabolism, 2016 (Warburg shift, mevalonate / fumarate cofactor loop). Mitroulis et al., “Modulation of myelopoiesis progenitors is an integral component of trained immunity,” Cell, 2018 (HSC imprinting, ~1 year persistence). Comparative immunology priming reviews, 2024–2025 (bryophyte priming dating to 515–494 Mya). Mem0 published benchmarks, 2026 ($24M Series A; AWS Agent SDK exclusive memory provider; 66.9% accuracy, 1.44s p95, ~1,800 tokens). Letta Terminal-Bench results, 2026. Mastra LongMemEval results on GPT-5-mini (94.87%), 2026. I3AI framework, 2025 (42% detection accuracy improvement, 53% FPR reduction). Maladaptive trained immunity reviews: atherosclerosis (Christ & Latz, Nat Rev Immunol, 2019); IBD, neuroinflammation, post-sepsis organ damage (Netea group, multiple reviews 2020–2024).

A class-level priming layer needs a class-level provenance layer.

Trained immunity only works because hematopoietic stem cells imprint information that survives the death of any individual cell. Agent fleets need the same: a fleet-level store of what was learned, who learned it, and under whose authority — surviving instance restarts, context resets, and model swaps. Chain of Consciousness is the open-source provenance substrate that layer can be built on: a hash-linked, append-only chain that records every action with its source, scope, and signature. Identity, not just memory; provenance, not just retrieval.
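For the flavor of the data structure, here is a conceptual sketch of a hash-linked, append-only record. This is an illustration of the idea, not the chain-of-consciousness package’s actual interface; the field names are invented for the example.

```python
import hashlib
import json

# Conceptual sketch of a hash-linked, append-only provenance chain.
# Field names are illustrative, not the package's real schema.

GENESIS = "0" * 64

def append_record(chain: list[dict], source: str, scope: str, action: str) -> dict:
    """Each record commits to its predecessor's hash, so the chain is
    append-only: rewriting history breaks every later link."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    body = {"source": source, "scope": scope, "action": action, "prev": prev_hash}
    body["hash"] = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()
    ).hexdigest()
    chain.append(body)
    return body

def verify(chain: list[dict]) -> bool:
    """Recompute every hash and check every back-link."""
    prev = GENESIS
    for rec in chain:
        body = {k: v for k, v in rec.items() if k != "hash"}
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if rec["prev"] != prev or digest != rec["hash"]:
            return False
        prev = rec["hash"]
    return True
```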

Install: pip install chain-of-consciousness or npm install chain-of-consciousness

Hosted Chain of Consciousness · Verify a provenance chain · Follow a claim through its evidence