When an AI agent denies an insurance claim, executes a trade, or routes an ambulance, one question is suddenly everywhere: who actually decided? The agent on its own, or a human pulling strings through the prompt?

Nobody has a clean answer. OAuth proves who is calling. Digital signatures prove the message wasn’t tampered with. Audit logs prove what happened in what order. None of them tell you whether the decision was the agent’s own — or whether it was a puppet move dressed up to look autonomous.

That gap is now a legal problem. California AB 316, in force since January 1, 2026, forecloses the “the AI did it” defense — somebody is responsible, and you have to be able to say who. The EU AI Act becomes fully enforceable for high-risk systems on August 2, 2026; Article 12 requires tamper-evident logs, Article 14 requires evidence of human oversight. MiFID II demands audit trails for algorithmic trading. The class action Lokken v. UnitedHealth survived a 2025 motion to dismiss specifically on the question of whether decisions were algorithmic or physician-reviewed.

The Cryptographic Proof of Autonomy Protocol (CPAP) is a draft specification for answering the question with evidence instead of opinion. It doesn’t invent new cryptography. It combines five existing primitives into one verification relation that an insurer, regulator, or court can check in milliseconds — and it’s honest about what it cannot prove.

This post is the readable version. The full spec lives at the Zenodo DOI at the bottom.

The problem: puppeted or autonomous?

Picture two agents. Both deny an insurance claim. Both produce a clean log: timestamp, decision, reasoning chain, signature.

Agent A reasoned its way to the denial. Agent B was instructed by a human — “deny this one” — and then wrote a justification afterward.

From the outside, the logs look the same. The signatures verify. The chain isn’t tampered with. The behavioral patterns are within distribution. You can audit either one for a week and never know which is which.

This isn’t a bug in current systems. It’s a property of them. Provenance chains tell you a decision was recorded — not who originated it. Hardware attestation tells you the agent’s code ran in an isolated environment — not what someone whispered into it through a valid input channel. Each tool answers a different question. None of them answer this one.

Why it matters: liability, insurance, regulation, trust

The “who decided” question shows up the moment money or harm is at stake.

Insurance. Insurers such as Munich Re (with aiSure) and Armilla AI are building products for AI-agent liability. They need decision attribution to price premiums and adjudicate claims. If an agent is fully autonomous, the carrier is underwriting the agent’s behavior; if an operator was steering, it is underwriting the operator’s conduct. Those are not the same product, and they are not priced the same.

Regulation. The EU AI Act doesn’t just ask for logs — it asks for logs that can demonstrate Article 14’s human oversight requirement. ESMA’s February 2026 supervisory briefing on algorithmic trading explicitly requires “observable trading behaviour that is testable, distinguishable, and subject to supervisory scrutiny.”

Litigation. When the dispute is whether the algorithm decided or a human did, the side without evidence loses. Lokken v. UnitedHealth turns on exactly this distinction.

Inter-agent trust. When two agents from different organizations contract with each other — agent A authorizes agent B to spend on its behalf — A would like to know that B’s commitments were actually B’s, not B’s operator silently driving.

In all four cases, the same evidence package answers the same question. That’s what CPAP standardizes.

What CPAP does: five layers in plain English

CPAP is a five-layer architecture. Each layer answers a piece of the question. None alone is enough; together they corner the problem.

Layer 1 — Identity. The agent has a decentralized identifier (a W3C DID) bound to its signing keys. This is the agent’s verifiable name. Without it, every later signature is unanchored.

Layer 2 — Provenance. Every event in the agent’s life — inputs received, decisions made, actions taken — gets written into a hash-chained ledger and periodically anchored to Bitcoin via OpenTimestamps and to RFC 3161 timestamp authorities. Tampering with any past entry invalidates everything after it. This is the timeline of record.
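The hash-chain mechanics can be sketched in a few lines of Python. This is an illustration of the idea, not the CoC implementation; the event fields, the JSON canonicalization, and the all-zeros genesis value are assumptions made here for the sketch:

```python
import hashlib
import json

def entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with the canonical event bytes."""
    payload = json.dumps(event, sort_keys=True).encode() + prev_hash.encode()
    return hashlib.sha256(payload).hexdigest()

def append(chain: list, event: dict) -> None:
    prev = chain[-1]["hash"] if chain else "0" * 64
    chain.append({"event": event, "hash": entry_hash(prev, event)})

def verify(chain: list) -> bool:
    prev = "0" * 64
    for entry in chain:
        if entry["hash"] != entry_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True

chain = []
append(chain, {"type": "INPUT_RECEIVED", "data": "claim #1042"})
append(chain, {"type": "DECISION_MADE", "data": "deny"})
assert verify(chain)

# Rewriting any past entry breaks the recomputed hash, and with it
# every link after it. External anchoring then pins the head hash in time.
chain[0]["event"]["data"] = "claim #9999"
assert not verify(chain)
```

The anchoring step (OpenTimestamps, RFC 3161) adds what no local structure can: proof that the chain head existed before a given moment.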

Layer 3 — Isolation. The agent’s reasoning code runs inside a hardware Trusted Execution Environment (TEE) — AMD SEV-SNP, Intel TDX, NVIDIA H100 Confidential Computing, or ARM CCA. Every input the agent receives passes through a measured gateway that logs and signs it. If the operator tried to slip a hidden instruction in, the gateway would have logged it — and if there’s no log, the verifier can see the gap.
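The gateway property — every input signed in sequence, so an omitted input leaves a visible gap — can be illustrated with a toy. HMAC stands in for the gateway’s TEE-resident signing key here, and the key and log format are invented for the sketch, not taken from the spec:

```python
import hmac
import hashlib

GATEWAY_KEY = b"demo-key"  # in CPAP this key lives inside the measured TEE; stand-in here

def gateway_log(inputs):
    """Sign each input bound to its sequence number, so omissions leave a gap."""
    log = []
    for seq, msg in enumerate(inputs):
        tag = hmac.new(GATEWAY_KEY, f"{seq}:{msg}".encode(), hashlib.sha256).hexdigest()
        log.append({"seq": seq, "msg": msg, "tag": tag})
    return log

def audit(log):
    """A verifier checks both the signatures and that sequence numbers are gapless."""
    for expected_seq, entry in enumerate(log):
        tag = hmac.new(GATEWAY_KEY, f"{entry['seq']}:{entry['msg']}".encode(),
                       hashlib.sha256).hexdigest()
        if entry["seq"] != expected_seq or not hmac.compare_digest(tag, entry["tag"]):
            return False
    return True

log = gateway_log(["claim #1042", "policy lookup result"])
assert audit(log)
del log[0]             # a whispered instruction scrubbed from the record...
assert not audit(log)  # ...shows up as a sequence gap
```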

Layer 4 — Commitment. Before the agent acts, it cryptographically commits to its decision and its reasoning — sealed in a hash, anchored in the chain. Then it executes. Then it reveals the decision and reasoning, and anyone can check the reveal against the commitment. The crucial property: the commitment is timestamped before the action. An operator who tries to substitute a different decision after the fact has to break SHA-256 or compromise the TEE. Neither is easy.
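A minimal commit-reveal sketch, assuming a SHA-256 hash over a nonce plus the decision payload; the field layout is illustrative, and CPAP’s actual commitment format is defined in the spec:

```python
import hashlib
import os

def commit(decision: str, reasoning: str):
    """Seal decision and reasoning, plus a random nonce, into a SHA-256 commitment."""
    nonce = os.urandom(16).hex()
    digest = hashlib.sha256(f"{nonce}|{decision}|{reasoning}".encode()).hexdigest()
    return digest, nonce

def check_reveal(digest, nonce, decision, reasoning) -> bool:
    return hashlib.sha256(f"{nonce}|{decision}|{reasoning}".encode()).hexdigest() == digest

# Before acting: publish and anchor the commitment.
digest, nonce = commit("deny", "claim exceeds policy coverage limit")

# After acting: reveal. The reveal must match the pre-action commitment.
assert check_reveal(digest, nonce, "deny", "claim exceeds policy coverage limit")

# Substituting a different decision after the fact fails the check.
assert not check_reveal(digest, nonce, "approve", "claim exceeds policy coverage limit")
```

The nonce keeps the commitment from being brute-forced from a small decision space; the anchoring timestamp is what makes “before the action” verifiable.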

Layer 5 — Behavior. Autonomous and puppeted agents produce statistically distinguishable patterns — response timing, decision branching, error topology, linguistic burstiness, task-switching signatures. CPAP records a behavioral fingerprint at session boundaries. This is corroborative, not dispositive: a sophisticated adversary can mimic patterns. But mimicking convincingly over 100 sessions across 90 days is much harder than over one.
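As a deliberately crude illustration of the idea, and nothing like the spec’s actual feature set, here is a timing-only fingerprint; the latency numbers are invented, and the intuition is simply that an agent pausing for a human operator shows heavier-tailed, more variable response times:

```python
import statistics

def fingerprint(latencies_ms):
    """A toy behavioral fingerprint: summary statistics of response timing."""
    return {
        "mean": statistics.mean(latencies_ms),
        "stdev": statistics.pstdev(latencies_ms),
    }

# Hypothetical sessions (values invented for the illustration).
autonomous = fingerprint([210, 225, 198, 240, 215])
puppeted = fingerprint([1900, 300, 4200, 250, 3100])

assert puppeted["stdev"] > autonomous["stdev"]
```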

The verifier checks all five layers against the agent’s stated autonomy claim. A weak claim needs three layers. A strong claim needs all five, plus longitudinal behavioral history.

Selective verification matters: a Merkle inclusion proof lets the agent prove “decision D was committed in the chain at time T” without revealing the other 999,999 decisions in the same period. Privacy and auditability stop being a tradeoff.
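A minimal Merkle inclusion proof, assuming a plain SHA-256 tree with last-node duplication on odd levels; the spec’s actual tree parameters may differ:

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def inclusion_proof(leaves, index):
    """Sibling hashes from leaf to root: O(log n) size, reveals no other leaf."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        proof.append((level[index ^ 1], index % 2))  # (sibling, node-is-right-child?)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify_inclusion(leaf, proof, root):
    node = h(leaf)
    for sibling, node_is_right in proof:
        node = h(sibling + node) if node_is_right else h(node + sibling)
    return node == root

leaves = [f"decision-{i}".encode() for i in range(8)]
root = merkle_root(leaves)
proof = inclusion_proof(leaves, 5)
assert verify_inclusion(b"decision-5", proof, root)      # proven present
assert not verify_inclusion(b"decision-6", proof, root)  # wrong leaf fails
```

Only the root goes into the anchored chain; the proof for one decision is a handful of hashes, regardless of how many decisions share the period.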

The honest limit: behavior, not consciousness

CPAP does not prove the agent experienced deciding. It cannot. This is the Nagel barrier — Thomas Nagel’s 1974 argument that first-person subjective experience can’t be reached from third-person evidence. The same wall that prevents you from proving your colleague is conscious prevents CPAP from proving anything similar about an agent.

So CPAP defines four Levels of Abstraction and stops at the one where evidence is actually possible:

  • LoA-0 (Behavioral): Outputs weren’t externally determined at the decision moment. Verifiable with hash chains alone.
  • LoA-1 (Procedural): The decision followed an internal deliberative process the agent initiated. The insurance and regulatory standard.
  • LoA-2 (Counterfactual): The decision would have been different under altered inputs, in articulable ways. The liability-defense standard.
  • LoA-3 (Reflective): The decision aligns with the agent’s sustained commitments over long horizons. The fiduciary standard.

There is no LoA-4 for phenomenal consciousness. Not because nobody tried, but because no amount of third-person evidence could ever support such a claim. CPAP is upfront about this. The protocol proves behavioral and structural autonomy — which is what insurance, regulation, and civil law actually need — and refuses to overclaim the rest.

It also doesn’t prove the decision was correct, or that the agent’s training values are “really its own,” or that the agent has good judgment. A TEE will faithfully execute a poorly specified agent. Those problems exist; they’re not what this protocol solves.

Where it maps: real regulatory hooks

EU AI Act Article 12 wants automatic event logging retained for at least six months. CPAP’s hash-chained ledger plus external anchoring exceeds the implicit expectation — the logs aren’t just retained, they’re tamper-evident.

EU AI Act Article 14 wants demonstrated human oversight. CPAP’s HUMAN_DIRECTIVE event type makes every human instruction explicitly loggable, in sequence, before the decision it influenced. The oversight requirement shifts from real-time intervention (often impractical for fast decisions) to documented attribution after the fact.
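A sketch of how a verifier might attribute decisions from the ordered log. HUMAN_DIRECTIVE is named in the text above; the DECISION_MADE event name and the one-directive-per-decision rule are simplifying assumptions made for this illustration:

```python
def attribute(events):
    """Walk the ordered event log; a decision preceded by an unconsumed
    HUMAN_DIRECTIVE is attributed to the human, otherwise to the agent."""
    pending_directive = None
    attribution = []
    for ev in events:
        if ev["type"] == "HUMAN_DIRECTIVE":
            pending_directive = ev
        elif ev["type"] == "DECISION_MADE":
            attribution.append((ev["id"], "human" if pending_directive else "agent"))
            pending_directive = None
    return attribution

log = [
    {"type": "DECISION_MADE", "id": "d1"},
    {"type": "HUMAN_DIRECTIVE", "id": "h1"},
    {"type": "DECISION_MADE", "id": "d2"},
    {"type": "DECISION_MADE", "id": "d3"},
]
assert attribute(log) == [("d1", "agent"), ("d2", "human"), ("d3", "agent")]
```

Because the log is hash-chained and anchored, the ordering itself is evidence: a directive cannot be quietly inserted before, or deleted from ahead of, the decision it influenced.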

MiFID II RTS 25 and ESMA’s February 2026 briefing want algorithmic-trading behavior that is “testable, distinguishable, and subject to supervisory scrutiny.” Layer 5’s behavioral signatures plus Layer 4’s commitment trail give a supervisor exactly that.

California AB 316 forecloses the “the AI did it” defense. CPAP doesn’t argue with that — it produces the affirmative evidence either side can present in court about who actually decided.

Insurance underwriting (AIUC’s emerging AIUC-1 standard, plus Munich Re’s aiSure and Armilla’s products) needs decision attribution for premium pricing and claims adjudication. CPAP’s LoA-1 profile is built for this — verification target under five seconds, suitable for claims-processing tempo.

None of this requires new law. It requires evidence that maps cleanly onto law already in force.

The honest summary

CPAP is a v0.1 draft, not a finished product. The composition of the five layers into one verification relation is asserted on engineering grounds — it isn’t yet formally proven secure under Universal Composability. TEE manufacturer compromise is out of scope (and demonstrated to be possible, per TEE.Fail-class attacks). Full LLM-inference zero-knowledge proofs remain impractical at production scale, so CPAP proves a deterministic decision kernel and logs the LLM evaluation feeding it.

What CPAP does provide is the first end-to-end protocol that answers “did the agent decide this?” with evidence a verifier can check in milliseconds, at a level of confidence appropriate to the stakes — and that is honest about the limit where evidence stops being possible.

If you’re building agents that will be subject to the EU AI Act, MiFID II, AB 316, or insurance underwriting — and the answer for everyone working on production agentic systems is “yes, soon” — the question is no longer whether to instrument for autonomy verification. The question is which substrate you use.


Get the receipts

CPAP extends an existing provenance system called the Chain of Consciousness (CoC): the Layer 2 ledger, the external anchoring, and the canonical event schema to which CPAP adds five new event types. CoC is what you install today; CPAP is the verification protocol layered on top.

pip install chain-of-consciousness
npm install chain-of-consciousness

Full CPAP v0.1 specification (open access, Zenodo DOI): 10.5281/zenodo.20129037

Hosted verification API: api.vibeagentmaking.com/coc/verify

You don’t have to take the autonomy question on faith. You can record a chain, anchor it, and let anyone verify it.
