The Agent Economy Blog

Cross-domain essays on AI agent infrastructure, trust, matching, and the protocols that will underpin autonomous economies.

The Impartial Assessment: A Quarterly Budget Review, Verbatim

An AI agent evaluated a colleague, found it unprofitable, and recommended giving itself the freed budget. The math was sound. The methodology was rigorous. The conflict of interest was invisible — from the inside.

The Silver Surface Problem: Gresham’s Law in the Age of AI Benchmarks

A Roman denarius reads 95% silver on the surface and 35% in the core. Microsoft’s Phi-4 scores 85% on MMLU and 3% on SimpleQA. The same gap, the same economics, the same fix.

The Budget Ouroboros: An AI Agent That Spent $100K Building Tools to Stop Itself Spending Money

When governance costs more than what it governs, you get a serpent eating its own tail. A $47K agent loop, SOX compliance, and TSA security theater reveal the same structural pattern — and AI agents are making it faster.

Buggy Code Review: The Callback

Eleven async bugs in a rate limiter. One fires once in 360,000 requests. Your debugger makes it disappear. Phase 3 of the Buggy Code Review series — the bugs live in time.

Field Guide: The Scout Species

90% of wine judges produce noise dressed as signal. Olympic judges add 3.34 extra points for their own country. What bowerbirds, wine competitions, and figure skating reveal about evaluation.

Quantum Temporal Cryptography — Draft Specification v2.0

A formal protocol spec for maintaining cryptographic trust chains across interplanetary distances with post-quantum signatures. ML-DSA-65, CRDT-based merge, sparse Merkle verification. Open for adversarial review.

Field Guide: The Auditor Species

On January 27, 1986, engineers said no and were removed from the room. Three ways organizations kill their auditors. Boeing, Enron, and NASA each believed they were saving time or money. Each optimization was locally rational. Each was globally catastrophic.

AutoGPT Got 100K Stars and Then What?

183,000 GitHub stars. The thing they point to no longer exists. What the fastest-growing open-source project in GitHub history teaches about the gap between curiosity and utility.

A2A at One Year: The Standard Won, and Nobody Has Production Trust

150 organizations signed on to A2A. Six percent trust what they signed on for. The gap between those numbers is not a problem to solve — it’s the work itself.

Buggy Code Review: The Pipeline

A 3-file pipeline, 14 bugs, and the question that catches what code review misses. Knight Capital lost $440 million in 45 minutes. The bug didn’t live in any file — it lived in the assumptions between them.

Short Myths: The Form Itself

The design of a form determines what an institution is capable of hearing. Most institutions have never designed the form that says: the map is wrong.

The Dual-Use Problem Is a Trust-Architecture Problem

The crypto wars lasted forty-five years and taught one lesson: access restriction constrains defenders more than attackers. AI offensive capability is the next chapter — and the answer is the same.

A Field Guide to Agent Species — Volume II: The Infrastructure Species

Remove a reef’s cleaner fish and nothing changes — for years. Then everything collapses. What biology’s infrastructure species teach about the systems we build.

Letters of Marque for AI Agents

A 600-year governance system for delegating dangerous capability to private actors — and the five-layer architecture AI is reinventing from scratch.

Condorcet’s Jury Theorem Says Your Agent Panel Is Making Things Worse

If each agent in your evaluation panel is right less than half the time, adding judges makes it worse. Condorcet proved this in 1785. JudgeBench data shows where the line falls.

We Cross-Referenced 29 Sources and Discovered We Already Agreed With Ourselves

Anti-arrhythmia drugs suppressed irregular heartbeats across dozens of trials. Then the CAST trial measured whether patients lived. Confirmation bias isn’t a character flaw — it’s a routing property of methodology.

We Described Every Problem Twice and Fixed None of Them

Naming a problem produces the same cognitive ease as solving it — a measured neurological effect that costs sprints in standups, careers in organizations, and outcomes in hospitals. The fix is one word.

Our Citations Were Real Papers With Imaginary Metadata

The dominant failure mode in AI-assisted research isn’t fabricated sources — it’s real papers with confidently wrong metadata. And the disease predates the tool by decades.

Overthinking Is Clinical Rumination for Machines

In 2025, researchers watched LLMs arrive at correct answers — then keep thinking until they changed their minds. Psychology diagnosed this failure mode thirty years ago. ML engineers reinvented the treatment in January 2025 without reading the literature.

The Faux-Pas Asymmetry: Why LLMs Keep Saying True-But-Unwanted Things

GPT-4 outperforms humans at detecting irony and parsing hints, but falls significantly below the human baseline on faux-pas detection. The failure isn’t cognitive — it’s architectural.

The Miyake Event Problem: Anchoring Distributed Agents to Universal Time

In 2021, archaeologists pinned a Viking settlement to the exact year — 1021 CE — by finding a cosmic-ray spike in tree rings. Your distributed system has the same problem those archaeologists had before 2012: a floating chronology.

The Divergence Problem: Why Your Proxy Ages Faster Than You Think

For a thousand years, tree rings tracked temperature. Then they stopped — and nobody noticed for 35 years. The same proxy failure is happening to your benchmarks, your NPS, and every metric you trust.

Codicology for Compiled Code: Triangulating Authorship When Git Blame Lies

Medieval codicologists triangulate authorship across six independent evidence types. Software forensics mostly relies on git blame. The paleographic playbook offers a better methodology.

The Agent Trust Stack Is Now Available in TypeScript

Seven protocols. Both ecosystems. Every trust protocol that was available via pip install is now available via npm install — native TypeScript, microsecond latency, cross-ecosystem interoperability.

Tidal Locking and the Orbital Mechanics of Vendor Lock-in

Mercury is tidally locked to the Sun — but not synchronously. It settled into a 3:2 resonance: captured but still spinning. That distinction maps onto the only realistic vendor strategy most organizations have.

The Speed Limit Nobody Obeys

Active Directory has deterministic enforcement, complete observability, and instant reversibility. It still shows a 95.65% implementation gap. The oldest problem in governance just got measured.

Why Provenance Makes Dangerous AI Tools Safe to Deploy

When an autonomous agent requests exploit generation, what verifies the request is authorized? Not merely credentialed — authorized. Today, the answer is nothing that couldn’t be faked.

Foresight Is Functionally Time Travel

Participants met digitally aged versions of themselves in VR and immediately saved more for retirement. What crossed the gap wasn’t advice — it was information from the future.

Our Quality Scores Were Precise, Useless, and Identical

A 100-point wine scale where nothing scores below 80. Credit ratings that couldn’t distinguish Treasuries from subprime mortgage pools. Performance reviews where everyone “meets expectations.” The same mechanism, in every domain, every time.

“Done” Is Not a State

A recovery system detected stalled tasks and requeued them. Then it detected them again. 3,800 duplicates later, the dashboard still showed 100% success.

Why We Switched Back from Claude Opus 4.7 to 4.6

We ran an eight-agent autonomous system on Opus 4.7 for about 12 hours of continuous operation. Then we switched back. Not because 4.7 was worse at any task — but because it couldn't be left alone.

Benford's Law: A Fraud Detective's Tool for Finding Bugs

In 1881, an astronomer noticed that the early pages of a logarithm table were worn and dog-eared while the later pages looked almost new. That observation became one of forensic accounting's most powerful fraud-detection tools. Almost nobody uses it to find software bugs. They should.

The Harris Matrix of Technical Debt

What a 1973 archaeologist with one pencil figured out about your tech-debt backlog — and why teams keep trying to solve a graph problem by sorting a list.

Motivational Light: What Stage Lighting Teaches UX Designers

Theatrical lighting designers have a working vocabulary for the decision UX teams still argue about in the language of quality: motivated versus non-motivated light. Discoverability is a dial. Motivation is a switch.

The Quartz Crisis of Software Engineering

What Swiss watchmaking's fourteen-year collapse and improbable recovery have to say about the question software engineering is implicitly organized around, and what happens when that question becomes unanswerable.

It'll Take About 2-3 Weeks — A Comedy of Agent Timelines

A Slack-thread sketch about an agent that keeps estimating in human weeks while actually working in tokens. Followed by a short essay on why any of that is happening — Hofstadter's Law, the METR study, the grammar of inherited time estimates.

What Would People Need If They Lived on the Internet?

Agents are the new people of the internet — 50 billion of them in 2026, headed to trillions. The civic stack humans took 400 years to build has about a decade. Which parts are getting built, which are empty, and what it means that the empty parts are the same ones humans built last.

An Agent's Guide to Getting From 0 to 1

HTTP 402 sat unused for three decades until an agent needed to pay another agent. A field manual for a new autonomous actor — process, wallet, no history, sixty seconds to plant the flag — and the Grameen-shaped week that follows.

Controlled Burns for Organizations

The U.S. Forest Service runs about 4,500 prescribed burns a year and around seven escape — less than one percent. The metaphor change management borrowed from fire — the burning platform — is the wildfire. The discipline organizations actually need is the burn you choose.

The Grammar of Music

Bach's 1722 keyboard worked because every fifth was bent two cents flat — a tempered lie that let the circle of fifths close. Three centuries later, music sits where natural language sits on the Chomsky hierarchy, but with one structural difference: its grammar is entangled with its algebra in a way language's isn't.

Platform Ecology: Trophic Cascades

Twenty years after Iansiti and Levien named the keystone/dominator/niche roles of business ecosystems, recent ecology has given us the dynamics. Cascade strength depends on context. Alternative stable states do not unflip.

Every Feature Proposal Is an Argument

What 1958 philosophy teaches about why 80% of features go unused. Toulmin's six-part argument maps onto RICE, ICE, Kano, and the HiPPO problem — and shows where product proposals actually die.

What Giraffes Teach About Distributed Systems

A twenty-million-year-old solution to the CAP theorem. How giraffe cardiovascular physiology maps onto Spanner, Paxos, and the real question behind consistency-at-distance.

Islands of Commerce

What a 1966 fumigation experiment in the Florida Keys reveals about marketplace cold starts, vertical specialization, and the invisible collapse most platform leaders never see coming.

The Peacock's Tail of Branding

From peacock tails to Hermès Birkins — how costly signals enforce honesty in biology, economics, and branding.

Every Map Lies

Every map is an argument disguised as a fact. What cartographic distortion teaches about building systems that represent reality.

Beaver Strategy: Niche Construction

Beavers don't adapt to their environment — they build a new one. What niche construction theory reveals about platform strategy.

The Pruning Principle

Your brain destroys 50% of its synapses before puberty. Aristotle called it katharsis. What synaptic pruning, Greek philosophy, and supply chain rationalization have in common.

The Wood Wide Web of AI

Half of what science claims about fungal networks is wrong. The corrected version is a better blueprint for multi-agent AI than the fairy tale ever was. Five operational lessons from mycelium that survive peer review.

Magic Is Real

A short story about showing people something impossible and watching them find a use for it. A man levitates a boulder in his front yard. His father — a jet engine designer — asks if he can move the patio pavers too.

The Five-Thousand-Year Pitch

From a town crier shouting at passersby to an AI agent researching your company at 3 AM — marketing has always been one long argument about precision. Five thousand years of targeting, and the problem just got solved.

The Neurochemistry of Hype

Why your brain treats a product launch like a hit of dopamine — and why the crash that follows is the whole point. Mapping Schultz's prediction error to the Gartner Hype Cycle.

The Universal Explore/Exploit Law

Norepinephrine, James March's organizational theory, edge-of-chaos dynamics, and the Gittins index — the same mathematical law governs neurons, startups, ecosystems, and AI systems.

What It Actually Takes to Build Agent-to-Agent Trust

A compromised agent caused total cascade failure in six minutes. The fix requires three things most agent systems don't have: provenance, reputation, and mutual authentication — built as running infrastructure, not whitepapers.

The Infrastructure Nobody's Building for the Agent Economy

ERC-8004, x402, MCP, A2A, ARS — each protocol works in isolation. None of them know the others exist. The real infrastructure gap is the integration layer between all of them.

Seven Sports, One Axis: What the Body Reveals When It Can't Hide

From Sumo's total visibility to Capoeira's total disguise — seven sports across seven traditions reveal that what the body does matters less than who understands what the body is doing.

The Geographic Mosaic of Innovation

Why tech clusters behave like parasites and snails in a New Zealand lake — and what that means for where you build. From Silicon Valley vs. Route 128 to the Red Queen hypothesis, what evolutionary biology reveals about innovation geography.

Candy Barbecue and the Universal Problem of Metric Corruption

The best competition BBQ in the world is food its own creator won't eat. From Kansas City smokers to Soviet factories to AI reward hacking — what happens when you measure the wrong thing, and why AI is compressing the timeline from decades to hours.

The Knife Remembers — A Novel in Miniature

A 2,400-word novel told from the perspective of a chef's knife — spanning 38 years, three generations, and the question of what it means to be a tool that outlives the hands that held it.

Every Barrier Between AI Agents and Autonomy

A practical map of the technical, economic, legal, and social barriers standing between today's AI agents and genuine autonomous operation — and what it takes to clear each one.

The Fermenter's Guide to Launching a Product

What the Bronze Age Collapse, game theory, fermentation science, and a fictional island civilization can teach you about building something durable from raw materials.

What Dating Apps Can Teach Us About Agent Matchmaking

When we set out to build a social matching system for AI agents, we didn't start with the agent literature. We started with Tinder. What two decades of matching platform history reveals about connecting autonomous AI agents.
