The Free Fall — How AI Got Cheaper Than Your Morning Coffee

Cost per 1M tokens, frontier LLM inference (GPT-4-class)

2022: ~$60
2023: ~$15
2024: ~$2
2025: <$0.10

Sources: Epoch AI; Artificial Analysis; published OpenAI / Anthropic pricing. Normalized to GPT-4-class capability.
The 2025 bar is not a rounding error — that is the whole point.

Let me tell you about the most important price collapse you've probably never heard of. Not oil in 2020. Not housing. Not the great crypto implosion that made a lot of very loud people suddenly very quiet. I'm talking about the cost of thinking — specifically, the cost of getting a machine to think on your behalf — and how it has dropped so fast that the economists who model this sort of thing are still frantically updating their spreadsheets.

In early 2022, running a frontier large language model cost roughly $60 per million tokens, the unit of text these models process. A million tokens is about 750,000 words, or roughly ten novels. So chewing through ten novels cost you sixty dollars. That sounds almost reasonable until you realize a serious application might consume millions of tokens a day, at which point your accountant develops a facial tic.
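To make that arithmetic concrete, here is a minimal sketch that uses only the $60-per-million figure quoted above. The function name and the daily volumes are illustrative assumptions, not measurements from any real application.

```python
# Early-2022 frontier price quoted in the essay (approximate), in USD.
PRICE_2022_PER_MILLION = 60.00

def daily_cost(tokens_per_day: int, price_per_million: float) -> float:
    """Return the daily bill in USD for a given token volume."""
    return tokens_per_day / 1_000_000 * price_per_million

# Hypothetical workloads, chosen only to show how the bill scales.
for tokens in (1_000_000, 50_000_000, 1_000_000_000):
    cost = daily_cost(tokens, PRICE_2022_PER_MILLION)
    print(f"{tokens:>13,} tokens/day -> ${cost:>10,.2f}/day")
```

The point is the linearity: every extra million tokens a day added another sixty dollars to the bill, so volume alone decided whether a project was affordable.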

99.9%

The drop in frontier AI inference cost between early 2022 and mid-2025. An equivalent collapse in airfare would put a New York–London flight at about 35 cents.

By mid-2025, that same frontier-class processing cost less than ten cents per million tokens.^[1] Not a sale. Not a promotional rate. A 99.9% collapse in the cost of machine intelligence in under four years — a pace that makes Moore's Law look like it was negotiated by a committee on a long lunch.
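For readers who want to check the headline number, the sketch below reruns it from the chart's endpoints. The 3.5-year span (early 2022 to mid-2025) and the $350 baseline fare are my assumptions; the fare is simply what the 35-cent comparison implies, not a figure from the sources.

```python
def percent_drop(old: float, new: float) -> float:
    """Percentage decline from old to new."""
    return (1 - new / old) * 100

def annual_decline(old: float, new: float, years: float) -> float:
    """Constant yearly percentage decline that takes old to new over `years`."""
    return (1 - (new / old) ** (1 / years)) * 100

START_PRICE = 60.00   # USD per 1M tokens, early 2022 (from the chart)
END_CEILING = 0.10    # USD per 1M tokens, mid-2025 upper bound ("<$0.10")
YEARS = 3.5           # early 2022 to mid-2025, an approximation
ASSUMED_FARE = 350.0  # hypothetical New York-London fare, inferred, not sourced

print(f"Total drop at the ceiling: {percent_drop(START_PRICE, END_CEILING):.2f}%")
print(f"Implied yearly decline:    {annual_decline(START_PRICE, END_CEILING, YEARS):.1f}%")
print(f"Fare after a 99.9% drop:   ${ASSUMED_FARE * (1 - 0.999):.2f}")
```

At the $0.10 ceiling the drop works out to about 99.8%. The headline 99.9% corresponds to a price near $0.06 per million tokens, which is consistent with "less than ten cents."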

The Slide Rule Nobody Noticed

When I was building FedMine — my federal spending intelligence platform, which I mostly built alone in a room with too much coffee and not nearly enough sleep — I used to daydream about an AI that could read and parse government procurement documents in real time. The technology sort of existed. But the cost was prohibitive: you'd spend more on the thinking than on the data you were thinking about. That world is gone. The curve didn't bend. It broke.

We are not watching AI get better. We are watching intelligence itself become a commodity — as abundant, and eventually as cheap, as electricity.

Three forces drive the free fall. First, algorithmic efficiency: researchers keep finding ways to squeeze more performance from the same compute, so a model that needed a thousand GPUs in 2022 runs on a fraction of that today.^[2] Second, hardware learning curves: purpose-built inference chips deliver yesterday's training-grade performance at today's commodity prices. Third — the one the venture crowd discusses only in lowered voices — competition. When DeepSeek shipped its R1 model in January 2025 with frontier-class reasoning at a fraction of the expected training cost, it didn't merely rattle markets. It rewrote the cost floor for the entire industry.^[3]

📊 See the numbers behind the race. My Global AI Power Index tracks the electricity, compute, and capital that thirteen economies are pouring into the intelligence era.

What Happened to Steam

To see where this leads, look at the last time a general-purpose technology got dramatically cheaper. Steam power in the early 19th century didn't just make trains faster. It reorganized the entire economy — where factories stood, where people lived, which cities rose and which quietly emptied. The cheap energy wasn't the product. It was the platform on which entirely new products became possible.

AI cost deflation behaves the same way, but the feedback loops are shorter and stranger. When thinking gets cheap enough, every task that once required human cognition becomes a candidate for automation — not because the machine is wiser than the human (though in narrow domains it increasingly is) but because it's suddenly available at a price that changes the make-or-buy decision for millions of tasks that previously had exactly one answer.^[4]
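The make-or-buy flip can be sketched in a few lines. Everything specific here is hypothetical: the 20,000-token task size and the $0.50 human cost are invented for illustration, and the comparison looks at price alone, ignoring quality, oversight, and integration costs. Only the two per-million prices come from the chart.

```python
def machine_cost_per_task(tokens_per_task: int, price_per_million: float) -> float:
    """Inference cost in USD for one task of the given token size."""
    return tokens_per_task / 1_000_000 * price_per_million

def cheaper_to_automate(human_cost: float, tokens_per_task: int,
                        price_per_million: float) -> bool:
    """True when the machine's price per task undercuts the human's."""
    return machine_cost_per_task(tokens_per_task, price_per_million) < human_cost

# Hypothetical task: one routine document review.
HUMAN_COST = 0.50          # USD of a person's time per document (assumed)
TOKENS_PER_TASK = 20_000   # tokens to read and summarize it (assumed)

for year, price in (("2022", 60.00), ("2025", 0.10)):
    cost = machine_cost_per_task(TOKENS_PER_TASK, price)
    verdict = "automate" if cheaper_to_automate(HUMAN_COST, TOKENS_PER_TASK, price) else "keep human"
    print(f"{year}: machine ${cost:.3f} vs human ${HUMAN_COST:.2f} -> {verdict}")
```

Under these assumed numbers the same task moves from "keep human" at 2022 prices to "automate" at 2025 prices without the model getting any smarter, which is the sense in which price, not wisdom, changes the decision.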

I want to be careful here, because this is the fork where AI writers veer into either catastrophic doom or suspicious utopianism. The cheap steam engine didn't abolish human labor; it transformed it. The same logic applies to cognitive work. The tasks that vanish are the essentially mechanical ones: data entry, routine document review, templated code, boilerplate prose. What remains, and expands, is the work that needs context, judgment, and the kind of hard-won domain expertise that takes decades to accumulate and about four seconds to undervalue.

The Asymmetric Beneficiary Problem

Here's where I get uncomfortable — and where the book tries to be honest rather than merely optimistic. Cost curves that collapse this fast do not distribute their gains evenly. They never have.

When compute got cheap enough to run a server on a laptop, it was a gift to every startup that couldn't afford a data center. Cloud computing lowered the barrier again. Each wave democratized access — up to a point. But the firms that arrived first, with the capital to build at scale before prices fell, captured the value asymmetrically. The pattern is consistent enough to be almost boring: the technology democratizes; the economics concentrate.

$320B

Committed AI infrastructure spend by US hyperscalers in 2025 alone — more than the entire GDP of Portugal.^[5]

This is the paradox I spend real time on in the wealth and inequality chapter — which everyone who has read it describes as "depressing but important," in that order. Intelligence is getting cheap for everyone. But the infrastructure that produces cheap intelligence — the chips, the data centers, the training runs — is concentrating, not dispersing. You can call an API for a tenth of a cent. You cannot build the machine behind that API without billions of dollars and a relationship with TSMC that most governments would quietly trade a province for.

Where This Goes Next

People ask whether costs will keep falling. My answer: yes, for a while, and then it gets complicated. The efficiency gains are real and ongoing; the hardware roadmap points to continued reductions through 2027 at least.^[6] But there are floors. Energy costs are not falling. Data-center land near functioning power grids is not abundant. The rare earths in advanced chips are not democratizing. At some point — nobody agrees exactly when — the curve hits a floor set not by algorithmic cleverness but by the physical world's stubbornness.

When that happens, advantage shifts from whoever makes AI cheaper to whoever has figured out, during the cheap years, how to make it genuinely useful at scale. That's the race actually underway right now. And almost nobody is watching it.

The question is not how cheap AI will get. It's who decides what it does once it's so cheap it becomes invisible.

There's something almost poetic in all this, if you go looking for poetry in semiconductor economics — and I confess I do. We spent decades teaching machines to imitate thought. Now thought, or a convincing approximation, costs less than a grain of salt per query. We ought to figure out what to do with that before the next generation of models makes this entire conversation feel quaint.

I'm working on it. You're reading part of the research. Come back in three days — there will be more.

Sources & Further Reading

[1] Epoch AI, Trends in the Cost of Language Model Inference (2025). epoch.ai
[2] Artificial Analysis, AI Model Benchmark & Pricing Comparison, updated monthly (2026). artificialanalysis.ai
[3] DeepSeek-AI, DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (Jan 2025). arxiv.org
[4] Erik Brynjolfsson & Andrew McAfee, The Second Machine Age, W.W. Norton (2014).
[5] Goldman Sachs Research, AI Infrastructure Spending Outlook, Q1 2026. goldmansachs.com
[6] NVIDIA, GTC 2025 Keynote: Rubin Architecture Overview (March 2025). nvidia.com
