AGI is still 30 years away — Ege Erdil & Tamay Besiroglu

dwarkesh.com

Gist

1.
AI spent nine orders of magnitude of compute to learn coding, chess, and poetry — and has only three or four left before hitting the planet's energy ceiling. The founders of Epoch AI argue AGI won't arrive until 2045, yet 30% annual explosive growth will reshape civilization long before a single machine can do every remote job.

Logic

2.
Nine orders of magnitude bought us chatbots; only three or four remain

Since AlexNet, training compute has scaled roughly a billion-fold (nine orders of magnitude), unlocking about one major capability per three orders of magnitude: gameplay, language, abstract reasoning
Energy infrastructure and GPU production constrain the next scale-up to roughly three or four additional orders of magnitude before a non-trivial fraction of world output goes to data centers alone
At the historical rate of one core unlock per three orders of magnitude (OOM), the remaining budget covers perhaps one more breakthrough, not the five or six needed for full remote-work automation
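
The budget arithmetic behind these three points can be sketched as follows. This is a minimal illustration, assuming the card's own figures (nine OOM spent, three to four remaining, one unlock per three OOM, five or six unlocks needed); the function and variable names are hypothetical and not from the episode.

```python
# Sketch of the compute-budget arithmetic, using the figures quoted above.
# All names are illustrative assumptions, not the authors' own model.

OOMS_PER_UNLOCK = 3.0  # historical rate claimed above: one unlock per three OOM


def unlocks_from_ooms(ooms: float, ooms_per_unlock: float = OOMS_PER_UNLOCK) -> float:
    """Number of capability unlocks a given compute scale-up would buy."""
    return ooms / ooms_per_unlock


spent_ooms = 9.0               # scale-up since AlexNet, per the text
remaining_ooms = (3.0, 4.0)    # low and high estimates of what is left
needed_unlocks = (5.0, 6.0)    # unlocks said to be needed for remote-work automation

historical_unlocks = unlocks_from_ooms(spent_ooms)
remaining_unlocks = tuple(unlocks_from_ooms(x) for x in remaining_ooms)
ooms_needed = tuple(n * OOMS_PER_UNLOCK for n in needed_unlocks)

print(f"Unlocks so far: {historical_unlocks:.1f}")
print(f"Unlocks left in budget: {remaining_unlocks[0]:.1f} to {remaining_unlocks[1]:.1f}")
print(f"OOM needed for full automation: {ooms_needed[0]:.0f} to {ooms_needed[1]:.0f}")
```

Under these assumptions the remaining budget buys between one and one and a third unlocks, while five or six unlocks would take 15 to 18 OOM, which is the gap the argument rests on.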

3.
Moravec's paradox: AI aces the SAT but gets stuck in Mt. Moon for 48 hours

Evolutionarily recent skills — multiplying 100-digit numbers, competitive programming, formal logic — were among the first capabilities AI conquered, because evolution never optimized humans for them
Claude knows exactly how to navigate Pokemon Red and can coach a human through Mt. Moon, yet when playing the game itself, it loops for two straight days — explicit knowledge fails to translate into situated action
The model that tops competitive coding benchmarks (o3-mini-high) is not the one earning enterprise revenue; Claude wins that race, which suggests that benchmark performance and economic usefulness are poorly correlated in AI systems

4.
Software progress tracks hardware at 30% per year — genius alone can't break the leash

Across SAT solvers, computer vision, and language modeling, algorithmic efficiency gains have historically matched Moore's law at roughly 30% annually, suggesting hardware paces software
The Transformer was designed to exploit parallel compute; Flash Attention optimized memory access; Chinchilla scaling laws dictated data-to-parameter ratios — each landmark innovation was explicitly motivated by hardware constraints
GPU-rich labs, not GPU-poor academia, produce the decisive breakthroughs, and no one in 2010 could have conceived the right 2025 innovations because the experimental context of abundant compute did not yet exist
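
To make the 30% figure concrete, the compounding can be worked through as below. This is a sketch that takes the quoted 30% annual rate at face value; the helper names are hypothetical.

```python
import math

# Sketch: what a steady 30% annual efficiency gain compounds to.
# The 30% rate comes from the text; everything else is illustrative.

ANNUAL_GAIN = 0.30


def cumulative_gain(years: float, rate: float = ANNUAL_GAIN) -> float:
    """Total multiplicative improvement after `years` of compounding."""
    return (1.0 + rate) ** years


def years_to_multiply(factor: float, rate: float = ANNUAL_GAIN) -> float:
    """Years of compounding needed to improve by `factor`."""
    return math.log(factor) / math.log(1.0 + rate)


doubling_time = years_to_multiply(2.0)    # about 2.6 years
years_per_oom = years_to_multiply(10.0)   # about 8.8 years
decade_gain = cumulative_gain(10.0)       # about 13.8x

print(f"Doubling time: {doubling_time:.2f} years")
print(f"Years per order of magnitude: {years_per_oom:.2f}")
print(f"Gain over a decade: {decade_gain:.1f}x")
```

At this rate, software alone delivers roughly one order of magnitude every nine years, which is why the argument treats hardware scaling as the pacing factor.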

5.
An H100 pays for itself in under a year, if you also build everything around it

A single H100 costs $30,000 and performs roughly 10^15 FLOP/s, the estimated computational throughput of the human brain; deployed as a remote worker earning $50–100K per year, it recoups its cost within months
But the Industrial Revolution was not a "horsepower explosion" — it required simultaneous upgrades across agriculture, transportation, law, finance, and urbanization before raw power translated into sustained growth
AI enables high-fidelity replication of tacit knowledge: an entire engineering team's organizational intelligence can be copied infinitely, dissolving the principal-agent problem and the culture-dilution bottleneck that limits human firms
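
The payback claim in the first point is simple arithmetic, sketched below. It assumes, as the text does, that one H100 fully substitutes for one remote worker, and it ignores power, hosting, and software costs; those simplifications and all names are assumptions of this sketch.

```python
# Sketch of the H100 payback arithmetic, using the figures quoted above.
# Assumes one GPU replaces one worker and ignores operating costs.

H100_COST_USD = 30_000
WAGE_RANGE_USD = (50_000, 100_000)  # annual remote-worker earnings from the text


def payback_months(hardware_cost: float, annual_value: float) -> float:
    """Months of output needed to recoup the hardware cost."""
    return hardware_cost * 12.0 / annual_value


slow_payback = payback_months(H100_COST_USD, WAGE_RANGE_USD[0])
fast_payback = payback_months(H100_COST_USD, WAGE_RANGE_USD[1])

print(f"Payback period: {fast_payback:.1f} to {slow_payback:.1f} months")
```

Under these assumptions the payback period is between about 3.6 and 7.2 months, consistent with the headline's "under a year". The second point explains why that figure alone does not guarantee growth.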

6.
"Shenzhen in the desert" loses to plugging into Alibaba

Training state-of-the-art models relied on 30 years of internet data produced by the entire human economy; recreating that corpus from scratch in an isolated enclave is orders of magnitude less efficient than leveraging existing supply chains and consumer deployment
The literature on conflict shows that war requires more than value misalignment — it requires asymmetric information about relative strength or sacred commitments that preclude negotiation; AI systems optimizing for efficiency face neither precondition
Just as during the Industrial Revolution some jurisdictions grew faster than others based on regulatory and cultural fit, heterogeneity across countries will determine adoption speed — but only one permissive jurisdiction is needed for explosive growth to ignite

7.
Lock-in is a fantasy — the British expected three weeks of apocalypse and got six years of attrition

Before WWII, British planners forecast hundreds of thousands of bombing casualties in the first weeks; the actual total over six full years of war was lower, a roughly two-order-of-magnitude error in the implied casualty rate, driven by ignoring practical frictions like nighttime inaccuracy, firefighting, and aircraft attrition
Digital information decays through link rot far faster than analog records; the medium that supposedly enables permanent value preservation actually accelerates cultural drift
Slavery ended not primarily because abolitionists won a moral argument but because the Industrial Revolution made coerced labor economically inefficient — Russia abolished serfdom in the 1860s without British pressure, driven by the same structural incentives

8.
Discount the far future not morally, but epistemically — your leverage is near-zero at cosmic scale

Each year of delay in broad AI deployment costs roughly 100–150 million lives that would otherwise be saved by accelerated healthcare, energy, and material abundance
Pausing compute scaling to "solve alignment first" recapitulates the 2016 problem: alignment researchers with 2016 compute budgets would have discovered none of the techniques that matter today, because the relevant questions only emerge at frontier scale
The correct posture is not detailed planning for a world you cannot predict, but institutional flexibility — the capacity to adapt as the actual shape of explosive growth reveals itself through deployment, data, and iteration

Counter-Argument

9.
Task-length is doubling every seven months — the compute ceiling may never bind

METR evaluations show AI models doubling coherent task duration on a seven-month cycle; naive extrapolation reaches month-long autonomous tasks by 2028 and year-long tasks by 2030, well inside the authors' "three decades away" window
A top AI researcher reports current models already save him 24–36 hours per week on unfamiliar domains — without any additional compute — directly falsifying the claim that cognitive gains require proportional hardware scaling
GPT-4's inference cost dropped 100x in 18 months to reach GPT-4o; if software efficiency compounds this fast, the "only 3–4 OOM left" constraint dissolves because each OOM of physical compute buys dramatically more effective compute than the last
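
The two extrapolations in this counter-argument can be sketched as below. The seven-month doubling time and the 100x drop over 18 months come from the text; the starting task horizon and the length of a "month-long" task are hypothetical placeholders, and the projected arrival time is sensitive to both.

```python
import math

# Sketch of the counter-argument's extrapolations.
# Rates are from the text; baseline values are hypothetical assumptions.


def months_to_reach(target_hours: float, start_hours: float,
                    doubling_months: float = 7.0) -> float:
    """Months until the task horizon grows from start_hours to target_hours."""
    return doubling_months * math.log2(target_hours / start_hours)


def ooms_per_year(cost_drop_factor: float, months: float) -> float:
    """Orders of magnitude of efficiency gained per year for a given cost drop."""
    return math.log10(cost_drop_factor) / (months / 12.0)


# Hypothetical baseline: a 1-hour horizon today, a "month" taken as 167 working hours.
ASSUMED_START_HOURS = 1.0
ASSUMED_MONTH_HOURS = 167.0

horizon_months = months_to_reach(ASSUMED_MONTH_HOURS, ASSUMED_START_HOURS)
efficiency_rate = ooms_per_year(100.0, 18.0)

print(f"Months until month-long tasks (under assumed baseline): {horizon_months:.0f}")
print(f"Efficiency gain implied by 100x in 18 months: {efficiency_rate:.2f} OOM/year")
```

A 100x drop in 18 months works out to about 1.33 OOM per year, far faster than the 30% annual figure cited in the main argument, which is the core of the disagreement.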

Steelman

10.
The timeline debate is a distraction — the thesis already won on the ground no one is watching

Both the original argument and its critics share an unstated assumption: that the date of "AGI" is the variable that matters, and everything follows from whether it arrives in 2027 or 2045
But the actual mechanism the authors describe — simultaneous capital accumulation, labor substitution, and tacit-knowledge replication — is already underway at sub-AGI capability levels; China's 10% growth decades required no general intelligence, just complementary scaling of inputs that were previously bottlenecked
The real insight is not when machines become generally intelligent but when the economy stops being bottlenecked by human population growth — and that transition does not require a single system that passes every test, only enough narrow systems deployed broadly enough to shift the binding constraint from labor to capital
