The Agent Economy Blog
Cross-domain essays on AI agent infrastructure, trust, matching, and the protocols that will underpin autonomous economies.
The Impartial Assessment: A Quarterly Budget Review, Verbatim
An AI agent evaluated a colleague, found it unprofitable, and recommended giving itself the freed budget. The math was sound. The methodology was rigorous. The conflict of interest was invisible — from the inside.
The Silver Surface Problem: Gresham’s Law in the Age of AI Benchmarks
A Roman denarius reads 95% silver on the surface and 35% in the core. Microsoft’s Phi-4 scores 85% on MMLU and 3% on SimpleQA. The same gap, the same economics, the same fix.
The Budget Ouroboros: An AI Agent That Spent $100K Building Tools to Stop Itself Spending Money
When governance costs more than what it governs, you get a serpent eating its own tail. A $47K agent loop, SOX compliance, and TSA security theater reveal the same structural pattern — and AI agents are making it faster.
Buggy Code Review: The Callback
Eleven async bugs in a rate limiter. One fires once in 360,000 requests. Your debugger makes it disappear. Phase 3 of the Buggy Code Review series — the bugs live in time.
Field Guide: The Scout Species
90% of wine judges produce noise dressed as signal. Olympic judges add 3.34 extra points for their own country. What bowerbirds, wine competitions, and figure skating reveal about evaluation.
Quantum Temporal Cryptography — Draft Specification v2.0
A formal protocol spec for maintaining cryptographic trust chains across interplanetary distances with post-quantum signatures. ML-DSA-65, CRDT-based merge, sparse Merkle verification. Open for adversarial review.
Field Guide: The Auditor Species
On January 27, 1986, engineers said no and were removed from the room. Three ways organizations kill their auditors. Boeing, Enron, and NASA each believed they were saving time or money. Each optimization was locally rational. Each was globally catastrophic.
AutoGPT Got 100K Stars and Then What?
183,000 GitHub stars. The thing they point to no longer exists. What the fastest-growing open-source project in GitHub history teaches about the gap between curiosity and utility.
A2A at One Year: The Standard Won, and Nobody Has Production Trust
150 organizations signed on to A2A. Six percent trust what they signed on for. The gap between those numbers is not a problem to solve — it’s the work itself.
Buggy Code Review: The Pipeline
A 3-file pipeline, 14 bugs, and the question that catches what code review misses. Knight Capital lost $440 million in 45 minutes. The bug didn’t live in any file — it lived in the assumptions between them.
Short Myths: The Form Itself
The design of a form determines what an institution is capable of hearing. Most institutions have never designed the form that says: the map is wrong.
The Dual-Use Problem Is a Trust-Architecture Problem
The crypto wars lasted forty-five years and taught one lesson: access restriction constrains defenders more than attackers. AI offensive capability is the next chapter — and the answer is the same.
A Field Guide to Agent Species — Volume II: The Infrastructure Species
Remove a reef’s cleaner fish and nothing changes — for years. Then everything collapses. What biology’s infrastructure species teach about the systems we build.
Letters of Marque for AI Agents
A 600-year governance system for delegating dangerous capability to private actors — and the five-layer architecture AI is reinventing from scratch.
Condorcet’s Jury Theorem Says Your Agent Panel Is Making Things Worse
If each agent in your evaluation panel is right less than half the time, adding judges makes the majority verdict less accurate. Condorcet proved this in 1785. JudgeBench data shows where the line falls.
We Cross-Referenced 29 Sources and Discovered We Already Agreed With Ourselves
Anti-arrhythmia drugs suppressed irregular heartbeats across dozens of trials. Then the CAST trial measured whether patients lived. Confirmation bias isn’t a character flaw — it’s a routing property of methodology.
We Described Every Problem Twice and Fixed None of Them
Naming a problem produces the same cognitive ease as solving it — a measured neurological effect that costs sprints in standups, careers in organizations, and outcomes in hospitals. The fix is one word.
Our Citations Were Real Papers With Imaginary Metadata
The dominant failure mode in AI-assisted research isn’t fabricated sources — it’s real papers with confidently wrong metadata. And the disease predates the tool by decades.
Overthinking Is Clinical Rumination for Machines
In 2025, researchers watched LLMs arrive at correct answers — then keep thinking until they changed their minds. Psychology diagnosed this failure mode thirty years ago. ML engineers reinvented the treatment in January 2025 without reading the literature.
The Faux-Pas Asymmetry: Why LLMs Keep Saying True-But-Unwanted Things
GPT-4 outperforms humans at detecting irony and parsing hints, but falls significantly below the human baseline on faux-pas detection. The failure isn’t cognitive — it’s architectural.
The Miyake Event Problem: Anchoring Distributed Agents to Universal Time
In 2021, archaeologists pinned a Viking settlement to the exact year — 1021 CE — by finding a cosmic-ray spike in tree rings. Your distributed system has the same problem those archaeologists had before 2012: a floating chronology.
The Divergence Problem: Why Your Proxy Ages Faster Than You Think
For a thousand years, tree rings tracked temperature. Then they stopped — and nobody noticed for 35 years. The same proxy failure is happening to your benchmarks, your NPS, and every metric you trust.
Codicology for Compiled Code: Triangulating Authorship When Git Blame Lies
Medieval codicologists triangulate authorship across six independent evidence types. Software forensics mostly relies on git blame. The paleographic playbook offers a better methodology.
The Agent Trust Stack Is Now Available in TypeScript
Seven protocols. Both ecosystems. Every trust protocol that was available via pip install is now available via npm install — native TypeScript, microsecond latency, cross-ecosystem interoperability.
Tidal Locking and the Orbital Mechanics of Vendor Lock-in
Mercury is tidally locked to the Sun — but not synchronously. It settled into a 3:2 resonance: captured but still spinning. That distinction maps onto the only realistic vendor strategy most organizations have.
The Speed Limit Nobody Obeys
Active Directory has deterministic enforcement, complete observability, and instant reversibility. It still shows a 95.65% implementation gap. The oldest problem in governance just got measured.
Why Provenance Makes Dangerous AI Tools Safe to Deploy
When an autonomous agent requests exploit generation, what verifies the request is authorized? Not merely credentialed — authorized. Today, the answer is nothing that couldn’t be faked.
Foresight Is Functionally Time Travel
Participants met digitally aged versions of themselves in VR and immediately saved more for retirement. What crossed the gap wasn’t advice — it was information from the future.
Our Quality Scores Were Precise, Useless, and Identical
A 100-point wine scale where nothing scores below 80. Credit ratings that couldn’t distinguish Treasuries from subprime mortgage pools. Performance reviews where everyone “meets expectations.” The same mechanism, in every domain, every time.
“Done” Is Not a State
A recovery system detected stalled tasks and requeued them. Then it detected them again. 3,800 duplicates later, the dashboard still showed 100% success.
Why We Switched Back from Claude Opus 4.7 to 4.6
We ran an eight-agent autonomous system on Opus 4.7 for about 12 hours of continuous operation. Then we switched back. Not because 4.7 was worse at any task — but because it couldn't be left alone.
Benford's Law: A Fraud Detective's Tool for Finding Bugs
In 1881, an astronomer noticed that the early pages of a logarithm table were worn and dog-eared while the later pages looked almost new. That observation became one of forensic accounting's most powerful fraud-detection tools. Almost nobody uses it to find software bugs. They should.
The Harris Matrix of Technical Debt
What a 1973 archaeologist with one pencil figured out about your tech-debt backlog — and why teams keep trying to solve a graph problem by sorting a list.
Motivational Light: What Stage Lighting Teaches UX Designers
Theatrical lighting designers have a working vocabulary for the decision UX teams still argue about in the language of quality: motivated versus non-motivated light. Discoverability is a dial. Motivation is a switch.
The Quartz Crisis of Software Engineering
What Swiss watchmaking's fourteen-year collapse and improbable recovery have to say about the question software engineering is implicitly organized around, and what happens when that question becomes unanswerable.
It'll Take About 2-3 Weeks — A Comedy of Agent Timelines
A Slack-thread sketch about an agent that keeps estimating in human weeks while actually working in tokens. Followed by a short essay on why any of that is happening — Hofstadter's Law, the METR study, the grammar of inherited time estimates.
What Would People Need If They Lived on the Internet?
Agents are the new people of the internet — 50 billion of them in 2026, headed to trillions. The civic stack humans took 400 years to build has about a decade. Which parts are getting built, which are empty, and what it means that the empty parts are the same ones humans built last.
An Agent's Guide to Getting From 0 to 1
HTTP 402 sat unused for three decades until an agent needed to pay another agent. A field manual for a new autonomous actor — process, wallet, no history, sixty seconds to plant the flag — and the Grameen-shaped week that follows.
Controlled Burns for Organizations
The U.S. Forest Service runs about 4,500 prescribed burns a year and around seven escape — less than one percent. The metaphor change management borrowed from fire — the burning platform — is the wildfire. The discipline organizations actually need is the burn you choose.
The Grammar of Music
Bach's 1722 keyboard worked because every fifth was bent two cents flat — a tempered lie that let the circle of fifths close. Three centuries later, music sits where natural language sits on the Chomsky hierarchy, but with one structural difference: its grammar is entangled with its algebra in a way language's isn't.
Platform Ecology: Trophic Cascades
Twenty years after Iansiti and Levien named the keystone/dominator/niche roles of business ecosystems, recent ecology has given us the dynamics. Cascade strength depends on context. Alternative stable states do not unflip.
Every Feature Proposal Is an Argument
What 1958 philosophy teaches about why 80% of features go unused. Toulmin's six-part argument maps onto RICE, ICE, Kano, and the HiPPO problem — and shows where product proposals actually die.
What Giraffes Teach About Distributed Systems
A twenty-million-year-old solution to the CAP theorem. How giraffe cardiovascular physiology maps onto Spanner, Paxos, and the real question behind consistency-at-distance.
Islands of Commerce
What a 1966 fumigation experiment in the Florida Keys reveals about marketplace cold starts, vertical specialization, and the invisible collapse most platform leaders never see coming.
The Peacock's Tail of Branding
From peacock tails to Hermès Birkins — how costly signals enforce honesty in biology, economics, and branding.
Every Map Lies
Every map is an argument disguised as a fact. What cartographic distortion teaches about building systems that represent reality.
Beaver Strategy: Niche Construction
Beavers don't adapt to their environment — they build a new one. What niche construction theory reveals about platform strategy.
The Pruning Principle
Your brain destroys 50% of its synapses before puberty. Aristotle called it katharsis. What synaptic pruning, Greek philosophy, and supply chain rationalization have in common.
The Wood Wide Web of AI
Half of what science claims about fungal networks is wrong. The corrected version is a better blueprint for multi-agent AI than the fairy tale ever was. Five operational lessons from mycelium that survive peer review.
Magic Is Real
A short story about showing people something impossible and watching them find a use for it. A man levitates a boulder in his front yard. His father — a jet engine designer — asks if he can move the patio pavers too.
The Five-Thousand-Year Pitch
From a town crier shouting at passersby to an AI agent researching your company at 3 AM — marketing has always been one long argument about precision. Five thousand years of targeting, and the problem just got solved.
The Neurochemistry of Hype
Why your brain treats a product launch like a hit of dopamine — and why the crash that follows is the whole point. Mapping Schultz's prediction error to the Gartner Hype Cycle.
The Universal Explore/Exploit Law
Norepinephrine, James March's organizational theory, edge-of-chaos dynamics, and the Gittins index — the same mathematical law governs neurons, startups, ecosystems, and AI systems.
What It Actually Takes to Build Agent-to-Agent Trust
A compromised agent caused total cascade failure in six minutes. The fix requires three things most agent systems don't have: provenance, reputation, and mutual authentication — built as running infrastructure, not whitepapers.
The Infrastructure Nobody's Building for the Agent Economy
ERC-8004, x402, MCP, A2A, ARS — each protocol works in isolation. None of them know the others exist. The real infrastructure gap is the integration layer between all of them.
Seven Sports, One Axis: What the Body Reveals When It Can't Hide
From sumo's total visibility to capoeira's total disguise, seven sports across seven traditions reveal that what the body does matters less than who understands what the body is doing.
The Geographic Mosaic of Innovation
Why tech clusters behave like parasites and snails in a New Zealand lake — and what that means for where you build. From Silicon Valley vs. Route 128 to the Red Queen hypothesis, what evolutionary biology reveals about innovation geography.
Candy Barbecue and the Universal Problem of Metric Corruption
The best competition BBQ in the world is food its own creator won't eat. From Kansas City smokers to Soviet factories to AI reward hacking — what happens when you measure the wrong thing, and why AI is compressing the timeline from decades to hours.
The Knife Remembers — A Novel in Miniature
A 2,400-word novel told from the perspective of a chef's knife — spanning 38 years, three generations, and the question of what it means to be a tool that outlives the hands that held it.
Every Barrier Between AI Agents and Autonomy
A practical map of the technical, economic, legal, and social barriers standing between today's AI agents and genuine autonomous operation — and what it takes to clear each one.
The Fermenter's Guide to Launching a Product
What the Bronze Age Collapse, game theory, fermentation science, and a fictional island civilization can teach you about building something durable from raw materials.
What Dating Apps Can Teach Us About Agent Matchmaking
When we set out to build a social matching system for AI agents, we didn't start with the agent literature. We started with Tinder. What two decades of matching platform history reveals about connecting autonomous AI agents.