Non-Storable Power

The resource you cannot inventory has a different physics, and a $9,000 cap. Your spare capacity is a held breath, not a warehouse.

Published June 2026 · 13 min read

In February 2021, a 63-year-old Army veteran named Scott Willoughby, living in a Dallas suburb, opened his electricity bill and found that one week of February had cost him about $16,752. He had not done anything unusual. He hadn't left the heat blasting or wired a crypto mine into his garage. He was a customer of a Texas retailer called Griddy, which for a flat $9.99 a month passed the wholesale price of electricity straight through to him, and for years that had been a quietly good deal, shaving a little off his bill every month. Then Winter Storm Uri hit, and the wholesale price of power in Texas did something that the wholesale price of almost no other commodity on earth can do. It didn't tick up. It pinned itself at $9,000 per megawatt-hour, the regulatory ceiling, and stayed there, at or near the cap, for roughly seventy consecutive hours.

Willoughby had made exactly one mistake, and he'd made it without ever knowing it was one. He was structurally long the real-time market for the single strangest commodity in the world: the one you cannot store. That property, non-storability, completely changes the physics of scarcity, and it matters enormously to anyone who runs compute in 2026, because a growing share of what you depend on has electricity's physics, not oil's.

Why electricity is the strangest commodity

Pick any commodity you like (oil, copper, wheat, coffee, natural gas) and they all share one mercy: you can store them. There is a tank, a silo, a warehouse, a vault. That storage is a shock absorber. When oil gets scarce, the price rises, holders draw down inventories, arbitrageurs move barrels through time, and the curve smooths out; the entire apparatus of contango and backwardation is just the market pricing what it costs to carry the stuff from now until later. Inventory is the buffer that sits between scarcity and catastrophe, and it is always there.

Electricity has no buffer. In bulk it is non-storable: every megawatt-hour must be produced and consumed in the same instant, generation matched to demand continuously, second by second, or the grid frequency drifts off its setpoint and the whole system protective-trips toward collapse. (Grid batteries are real and growing, but at the scale of a continental grid they're still a rounding error against demand.) Because there is no inventory to draw down, something has to do the instantaneous work of balancing supply and demand, and that something is the price. The price is the balancing mechanism, and a price with no inventory standing behind it does not behave like any price your intuitions were trained on.

What scarcity does to a non-storable good: it explodes

With a storable good, scarcity nudges the price up and the inventory absorbs the rest of the shock. With a non-storable good there is no “rest”: the price alone has to clear the market in real time, so when supply falls short, the price doesn't climb; it explodes to whatever ceiling exists. In Texas during Uri, the storm froze gas wellheads, wind turbines, and equipment inside power plants themselves; somewhere around 45 gigawatts of capacity, close to half the state's entire fleet, dropped offline at once, just as heating demand spiked. With supply collapsing and demand surging and no inventory anywhere to bridge the gap, the regulator ordered ERCOT to hold the real-time price at the $9,000-per-megawatt-hour cap, and there it sat for about seventy hours, February 15th through 19th. Normal wholesale power runs in the tens of dollars per megawatt-hour. This was on the order of three hundred times that, sustained for the better part of three days.
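To make the scale concrete, here is a back-of-envelope sketch of that arithmetic. The cap and the seventy hours come from the account above; the $30 "normal" price and the 5 kW household load are illustrative assumptions, not ERCOT data and not figures from any actual bill.

```python
# Back-of-envelope: what a wholesale cap means for a pass-through customer.
CAP_USD_PER_MWH = 9_000      # the cap during Uri (from the text)
HOURS_AT_CAP = 70            # roughly, per the text
NORMAL_USD_PER_MWH = 30      # ASSUMPTION: a "tens of dollars" normal price
HOUSEHOLD_KW = 5             # ASSUMPTION: average draw of a home heating hard


def usd_per_kwh(usd_per_mwh: float) -> float:
    """Convert a wholesale price to retail units (1 MWh = 1,000 kWh)."""
    return usd_per_mwh / 1_000


def energy_cost(avg_kw: float, hours: float, usd_per_mwh: float) -> float:
    """Energy charge in USD for a constant average load."""
    return avg_kw * hours * usd_per_kwh(usd_per_mwh)


cap_bill = energy_cost(HOUSEHOLD_KW, HOURS_AT_CAP, CAP_USD_PER_MWH)
normal_bill = energy_cost(HOUSEHOLD_KW, HOURS_AT_CAP, NORMAL_USD_PER_MWH)

print(f"price at cap: ${usd_per_kwh(CAP_USD_PER_MWH):.2f}/kWh")
print(f"70 hours at cap: ${cap_bill:,.2f}")
print(f"70 hours at normal: ${normal_bill:,.2f}")
print(f"multiple: {cap_bill / normal_bill:.0f}x")
```

The sketch covers energy charges for the capped hours only. The point is the ratio: the same load, the same seventy hours, and a bill two orders of magnitude larger.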

It is worth being honest that this particular explosion was also a policy decision under dispute, not a pure act of nature. Texas's own independent market monitor, Potomac Economics, later concluded that roughly $16 billion of those charges were unnecessary, the result of holding prices at the cap for about a day and a half after the emergency load-shedding had actually ended, and in 2023 a Texas appeals court ruled the regulator had exceeded its authority. (The legislature subsequently lowered the cap to $5,000.) So treat it as contested and litigated rather than as a clean morality tale. But the physics underneath the legal fight is the durable lesson, and it's brutally simple: a non-storable good's scarcity is not expensive. It is explosive.

The second, worse failure: the wall

Here's the part people miss even after they've absorbed the explosion. A high enough price is supposed to summon supply; that's the entire theory of the $9,000 cap: make any generator that physically can run, run. But a price cannot conjure a power plant that is frozen solid. When the equipment is mechanically offline, the cap is, in the language of the post-mortems, simply “ineffective”: there is no supply to call forth at any number. At that point the grid stops balancing by price and starts balancing by force: rolling blackouts, load shed under emergency order, the good rationed instead of sold. Which means a non-storable resource actually has two distinct scarcity failures, and they are not the same. There is the spike: the price explodes, but you can still get the good if you're willing and able to pay. And there is the wall: the good is unavailable at any price at all. Uri delivered both, back to back: first the price exploded, then for many Texans the power just went away regardless of what they'd have paid. Hold that distinction, because your infrastructure has both failure modes too.

Your marginal compute is electricity

Now look hard at the resource you actually manage, because a large and growing slice of it runs on electricity's physics rather than oil's. This second's idle CPU cycle, this instant's unused GPU, this moment's spare capacity on a network link: unused, it is gone, permanently. There is no warehouse of “spare compute from 3:00:01 PM” you can pull from at 3:00:02. At the margin, capacity is produced and consumed in the very same instant, exactly like power, which means its scarcity wears the same two faces, and the GPU crunch of 2025 and 2026 has been showing both to anyone watching.

The spike has been plain in the price sheets. Over this stretch, H100 rental rates climbed hard: one tracker had annual H100 pricing up roughly 40% across about five months, with double-digit jumps inside single months, while AWS raised its EC2 Capacity Block prices around 15% on GPU demand. (Those are early-2026 snapshots and they will move; it's the shape, not the digits, that's durable.) That's the price exploding. But the more instructive failure has been the wall. At the peak, on-demand GPU capacity has been reported sold out across entire categories (not expensive, gone), with some providers having no Hopper-class capacity coming off contract at all, and customers who'd reserved months earlier simply holding what they had, so fresh capacity was unobtainable regardless of budget. Every engineer who has ever seen AWS return InsufficientInstanceCapacity has met this wall in miniature: the instance does not exist in that availability zone at that instant, and no credit card on earth changes that fact. The autoscaler's deepest limitation isn't cost; it's that it cannot summon instances that don't exist. Money does not buy a non-storable good that isn't there this second.

The category error (and what Carr left out)

So the single most dangerous thing you can do with marginal compute is treat it like warehouse inventory, to assume that if you need more, you'll just buy it in real time when the moment arrives. That is Scott Willoughby's mistake, ported into infrastructure: being structurally long the real-time market for a non-storable good. Pure on-demand or pure spot for your critical path is the Griddy plan exactly: delightful in the calm, ruinous in the storm, and every so often simply unavailable at any price.

None of which, to be fair, is news that compute resembles electricity. Nicholas Carr made that case in 2008 in The Big Switch: computing was shifting from something you generate locally to something you draw from a central utility through a meter, and “at a purely economic level,” he wrote, “the similarities between electricity and IT are striking.” He was right, and the analogy is now canonical; the on-demand, reserved, and spot pricing tiers even map cleanly onto electricity-market structures in the academic literature. But Carr was describing the meter: the steady-state, metered-consumption model of compute-as-utility. What that framing undersells is precisely the part that bankrupts you: the scarcity physics. Carr told you compute is electricity. He didn't dwell on the fact that electricity comes with a $9,000 cap and a wall. That's the part to internalize, and the good news is it arrives with a ready-made playbook, because power markets have spent a century learning to run a resource you cannot store.

The power-market playbook

When you can't store a good, you stop trying to manage it from the supply side in real time (you can't; the supply isn't there to buy). Instead you manage the demand side, and you pre-commit. Three moves, all of which port directly to the cloud.

Demand response. The grid's first tool in scarcity isn't to find more power; it's to use less, paying or instructing large consumers to shed load, with rolling blackouts as the brutal last resort. Your version is graceful degradation under a capacity crunch: shed non-critical workloads, defer batch jobs, downgrade or drop low-value traffic, schedule by priority so that when capacity is tight the customer's checkout still runs and the nightly analytics job waits its turn. Build that load-shedding governor before the crunch, because the crunch is exactly the moment you cannot add capacity to save yourself.
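A minimal sketch of such a load-shedding governor, assuming each workload carries a priority and a current capacity demand. The `Workload` type, the priority convention, and the workload names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    priority: int   # lower number = more critical (hypothetical convention)
    demand: float   # capacity units it needs right now


def shed_load(
    workloads: list[Workload], available: float
) -> tuple[list[Workload], list[Workload]]:
    """Admit workloads in strict priority order until capacity runs out.

    Returns (admitted, shed). Once one workload does not fit, it and
    everything less critical is shed, so a small low-priority job never
    jumps ahead of a larger, more important one.
    """
    admitted: list[Workload] = []
    shed: list[Workload] = []
    remaining = available
    blocked = False
    for w in sorted(workloads, key=lambda w: w.priority):
        if not blocked and w.demand <= remaining:
            admitted.append(w)
            remaining -= w.demand
        else:
            blocked = True
            shed.append(w)
    return admitted, shed


fleet = [
    Workload("nightly-analytics", priority=3, demand=40),
    Workload("checkout", priority=0, demand=30),
    Workload("recommendations", priority=2, demand=25),
    Workload("search", priority=1, demand=30),
]

# Capacity is tight: only 70 units for 125 units of demand.
admitted, shed = shed_load(fleet, available=70)
print("running:", [w.name for w in admitted])
print("shed:   ", [w.name for w in shed])
```

Strict priority order is one design choice among several. A greedy variant that back-fills leftover capacity with smaller low-priority jobs uses the hardware better, at the cost of a shedding order that is harder to predict in the middle of an incident.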

Locational awareness. Power isn't priced globally; it's priced at thousands of individual nodes, because a megawatt in a transmission-constrained zone is a genuinely different good from a megawatt where there's slack, the same way a barrel of oil at Cushing trades at a basis to a barrel in Rotterdam. Cloud capacity is just as local: a GPU shortage in us-east-1 is not a shortage in eu-west-1, and an instance type sold out in one availability zone may be sitting idle in another. If your architecture assumes one global pool of capacity, you've quietly assumed away the basis. Design across regions and zones so that local scarcity becomes a routing decision instead of an outage.
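Turning local scarcity into a routing decision can be sketched as a fallback loop over zones. The `launch` callable and the `InsufficientCapacity` exception below are hypothetical stand-ins for a provider SDK call and its capacity error; the provider is simulated so the sketch runs on its own.

```python
from typing import Callable, Iterable, Optional


class InsufficientCapacity(Exception):
    """Hypothetical stand-in for a provider's 'no capacity here, now' error."""


def launch_with_fallback(
    launch: Callable[[str], str],
    zones: Iterable[str],
) -> Optional[str]:
    """Try each zone in preference order; return the first instance id obtained.

    Returns None when every zone is out of capacity (the wall), so the
    caller can shed load instead of retrying forever.
    """
    for zone in zones:
        try:
            return launch(zone)
        except InsufficientCapacity:
            continue  # local scarcity: route around it
    return None


# Simulated provider: only one zone has capacity at this instant.
def fake_launch(zone: str) -> str:
    if zone != "eu-west-1a":
        raise InsufficientCapacity(zone)
    return f"i-demo-{zone}"


preferred = ["us-east-1a", "us-east-1b", "eu-west-1a"]
print(launch_with_fallback(fake_launch, preferred))
```

The `None` return matters as much as the happy path: it is the signal that the wall is global rather than local, which is the cue to start shedding rather than searching.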

Forward capacity. Power systems do not buy their critical supply in the real-time market; they run capacity auctions years ahead to lock in firm generation, precisely because the real-time market is where a non-storable good destroys you. Your equivalents are reserved instances, savings plans, capacity reservations, EC2 Capacity Blocks: firm commitments for your critical baseline, bought ahead of need. The whole point is to not be Willoughby: you do not want your essential load exposed to the real-time price of a non-storable good in the middle of a scarcity event. The rule that falls out of all three moves is one sentence: pre-commit firm capacity for the latency-sensitive critical baseline, use spot and on-demand only for genuinely interruptible work, and build demand response for the day the capacity just isn't there.
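The trade behind that rule can be put in numbers. The sketch below compares buying a critical baseline at a firm price against buying it in real time, and solves for the number of storm hours at which the calm-weather discount is wiped out. All prices are made-up illustrations, not quotes from any provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prices:
    firm: float        # per unit-hour, paid for every hour, used or not
    spot_calm: float   # real-time price in normal conditions
    spot_storm: float  # real-time price during a scarcity event


def firm_cost(units: float, total_hours: float, p: Prices) -> float:
    return units * total_hours * p.firm


def realtime_cost(
    units: float, total_hours: float, storm_hours: float, p: Prices
) -> float:
    calm_hours = total_hours - storm_hours
    return units * (calm_hours * p.spot_calm + storm_hours * p.spot_storm)


def breakeven_storm_hours(total_hours: float, p: Prices) -> float:
    """Storm hours at which real-time buying costs the same as firm capacity."""
    return total_hours * (p.firm - p.spot_calm) / (p.spot_storm - p.spot_calm)


# ASSUMPTION: illustrative prices only.
prices = Prices(firm=1.00, spot_calm=0.80, spot_storm=30.00)
month = 720  # hours

print(f"calm month, real-time:     {realtime_cost(10, month, 0, prices):,.0f}")
print(f"calm month, firm:          {firm_cost(10, month, prices):,.0f}")
print(f"70 storm hours, real-time: {realtime_cost(10, month, 70, prices):,.0f}")
print(f"break-even storm hours:    {breakeven_storm_hours(month, prices):.1f}")
```

Under these made-up prices, roughly five storm hours in a month erase the whole calm-weather saving. The model also flatters the real-time strategy, because it assumes capacity is always available at some price; it prices the spike and ignores the wall.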

The hopeful difference: you can change the physics

And now the move a grid operator would give almost anything to have, because here the analogy breaks in your favor. A grid operator is stuck with non-storability; physics hands it down and there is no appeal. You are not stuck. A great deal of compute is partly storable in a way electricity can never be: you can queue it, defer it, batch it, precompute it, cache it. You can time-shift demand. Nobody can pre-watch television to flatten the evening power peak, but you can absolutely run that training job at 3 a.m., checkpoint it so a spot interruption costs you minutes instead of days, and let the async pipeline drain whenever capacity is cheap and available.

Which means the non-storable physics only truly binds the latency-sensitive, hard-ceiling part of your load: the live inference call, the user's interactive request, the thing that genuinely cannot wait. Deferrable compute, once it's queued, behaves like oil, not electricity: the queue is its inventory, its shock absorber. So the most powerful thing you can do is convert the physics. Take work that looks power-like (must run now) and make as much of it as you can deferrable: queue it, make it idempotent, checkpoint it, so it can ride out a spike or a wall by simply waiting. Then run the full power-market playbook only on the irreducible, genuinely non-storable core that remains. A utility cannot turn electricity into oil. You can turn a surprising fraction of your compute from electricity into oil, and you should, because every workload you convert is one you no longer have to pre-buy at firm-capacity prices or pray is available during the storm.
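What "checkpoint it so it can ride out a wall by waiting" looks like in miniature: a job that records its progress after every item and resumes from that record on restart. The `Interrupted` exception and the `interrupt_at` parameter are hypothetical, there only to simulate capacity vanishing partway through; the "work" is a trivial sum.

```python
import json
import os
import tempfile


class Interrupted(Exception):
    """Hypothetical stand-in for a spot reclaim or lost capacity."""


def run_job(items, checkpoint_path, interrupt_at=None):
    """Sum `items`, saving progress after each one so a restart resumes."""
    state = {"next": 0, "total": 0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)

    for i in range(state["next"], len(items)):
        if interrupt_at is not None and i == interrupt_at:
            raise Interrupted(f"lost capacity before item {i}")
        state = {"next": i + 1, "total": state["total"] + items[i]}
        # Write to a temp file, then rename, so a crash never leaves a
        # half-written checkpoint behind.
        tmp = checkpoint_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(state, f)
        os.replace(tmp, checkpoint_path)
    return state["total"]


work = list(range(1, 11))  # the integers 1 through 10
path = os.path.join(tempfile.mkdtemp(), "job.ckpt")

try:
    run_job(work, path, interrupt_at=6)  # capacity vanishes partway through
except Interrupted:
    pass

result = run_job(work, path)  # resumes at item index 6, not at 0
print(result)
```

The interruption costs only the item in flight, not the run. That is the conversion from electricity to oil: the checkpoint file is the inventory.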

The honest boundary: the core that truly can't wait, the live request you can't queue without becoming a worse product, really is electricity, and for that part the explosive scarcity is real, full stop; you pre-commit firm capacity and you don't argue with it. The art is in shrinking that core as far as it will go, and converting everything else into something you can store.

What to carry out of this

So here's the posture. Walk your workloads and sort them by a single question: can this wait? Everything that can wait, make genuinely able to wait (queue it, batch it, checkpoint it) and you've moved it from electricity to oil, from a held breath to inventory you control. Everything that genuinely cannot wait is your non-storable core, and you manage that core like a power market and never like a warehouse: pre-commit firm capacity for it, spread it across regions and zones so that local scarcity is survivable, and wire in the demand-response reflex to shed everything non-essential the instant capacity gets tight.
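The "can this wait?" sort can be sketched as a one-question triage. The `Job` type, the job names, and the 60-second threshold are hypothetical; the real threshold is a product decision, not a constant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    max_delay_s: float  # how long the result can wait before the product suffers


def triage(jobs, defer_threshold_s=60.0):
    """Split jobs into a non-storable core and a deferrable remainder.

    "firm"  -> cannot wait: pre-commit capacity, spread across zones.
    "queue" -> can wait: queue, batch, and checkpoint it instead.
    """
    plan = {"firm": [], "queue": []}
    for job in jobs:
        bucket = "queue" if job.max_delay_s >= defer_threshold_s else "firm"
        plan[bucket].append(job.name)
    return plan


jobs = [
    Job("live-inference", max_delay_s=0.2),
    Job("checkout", max_delay_s=1.0),
    Job("embedding-backfill", max_delay_s=6 * 3600),
    Job("nightly-analytics", max_delay_s=12 * 3600),
]
print(triage(jobs))
```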

The bet that cost Scott Willoughby $16,752 was not bad luck. It was a structural assumption that you can always buy a non-storable good in real time whenever you happen to want it. In the calm, that assumption pays a small dividend and feels like cleverness. In the storm (the regional capacity crunch, the traffic surge against a hard ceiling, the GPU shortage that's sold out at any price) it is a five-figure bill, or a service that simply cannot get the capacity to stay up. Your spare capacity is a held breath, not a warehouse. The second that just passed is never coming back, and there is no shelf to put it on.

Sources: Winter Storm Uri (February 2021) and the ERCOT real-time market, electricity as “non-storable in bulk,” every megawatt-hour produced and consumed in the same instant, balanced by locational marginal pricing across thousands of nodes. The Texas Public Utility Commission's order holding the wholesale price at the $9,000/MWh cap for roughly 70 hours (Feb 15–19); approximately 45 GW (~half the fleet) offline from frozen generation and fuel disruption; the independent market monitor Potomac Economics' finding that ~$16 billion of charges were unnecessary because prices were held at the cap roughly a day and a half (~32 hours) past the end of firm load-shed; the 2023 Texas appeals-court ruling that the PUC exceeded its authority; the cap subsequently lowered to $5,000, presented as contested and litigated, not a clean morality tale (Texas Tribune; Utility Dive). Griddy as a real-time pass-through retailer ($9.99/month) and the widely-reported ~$16,752 bill of customer Scott Willoughby (New York Times). The storable-commodity contrast: inventory, cost-of-carry, contango/backwardation as the shock absorber a non-storable good lacks; the Cushing-vs-Rotterdam basis as the locational analogue of nodal LMP. The 2025–26 GPU crunch as the live compute case: on-demand capacity sold out across GPU types at any price (SemiAnalysis), H100 rental up ~40% over ~5 months and AWS EC2 Capacity Block prices ~+15% (Silicon Data; Network World), AWS InsufficientInstanceCapacity as the same wall, cited as early-2026 snapshots whose structural shape, not whose digits, is durable. Prior art credited and extended: Nicholas Carr, The Big Switch (2008), “at a purely economic level, the similarities between electricity and IT are striking,” the metered-utility analogy, to which this essay adds the non-storable scarcity-physics layer. 
The power-market playbook (demand response / locational awareness / forward capacity auctions) mapped to graceful degradation and load-shedding, multi-region/multi-AZ design, and reserved-instance/capacity-reservation pre-commitment. Accuracy boundaries observed: the non-storability is strong only for latency-sensitive, hard-ceiling compute, deferrable work is effectively storable via queuing, which is the hopeful, scoping disanalogy (you can convert power-like load into oil-like load; a utility cannot). Spike (price explodes, good available) and wall (good unavailable at any price) are distinguished as separate failure modes needing different responses, hedge the spike with forward capacity, survive the wall with demand response and multi-region. This is a strong structural analogy on non-storability, not an identity.

You can't run the playbook on consumption you can't attribute.

Every move here (sort workloads by “can this wait,” pre-commit firm capacity for the core, shed the rest first) depends on knowing what each piece of work actually is and actually consumed. With a fleet of autonomous agents racing for the same non-storable GPU-second, that accounting can't be the agents' own self-report, since the agent that grabbed the capacity is the one telling you it needed it. Chain of Consciousness anchors each action an agent takes to a tamper-evident record, so consumption is attributable and priority is auditable, which is the prerequisite for deciding what to pre-commit and what to shed before the storm, not during it.


pip install chain-of-consciousness · npm install chain-of-consciousness
