Full transcript (Instant)

AGI is still 30 years away — Ege Erdil & Tamay Besiroglu

ARGUMENT

dwarkesh.com

Gist

1.

OpenAI burns $700,000 daily on compute costs while charging users $20/month. The math reveals why 90% of AI companies will die before reaching profitability—and which 10% will own everything.

Logic

2.

AGI is 30 years away, not 3, due to compute and capability bottlenecks

  • AI progress over the last decade required 9-10 orders of magnitude more compute, unlocking capabilities roughly every three years
  • Remaining compute scaling is limited to 3-4 orders of magnitude by energy and GPU production, making rapid, broad capability unlocks unsustainable
  • Current models lack long-term coherence, agency, and full multimodal understanding required for "literally everything" remote work automation
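
The scaling figures above imply an average growth rate that can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the summary's numbers (9-10 orders of magnitude over roughly ten years, 3-4 orders of magnitude of headroom). The function names are hypothetical, and the "years of headroom" figure only holds if the past pace continued unchanged.

```python
def annual_growth_factor(total_oom: float, years: float) -> float:
    """Average yearly multiplier implied by `total_oom` orders of magnitude over `years`."""
    return 10 ** (total_oom / years)


def years_of_headroom(remaining_oom: float, oom_per_year: float) -> float:
    """Years until the remaining budget is spent, ASSUMING the past pace continues."""
    return remaining_oom / oom_per_year


DECADE_YEARS = 10  # ASSUMPTION: "the last decade" taken as ten years

for past_oom in (9, 10):
    factor = annual_growth_factor(past_oom, DECADE_YEARS)
    pace = past_oom / DECADE_YEARS  # orders of magnitude per year
    print(f"{past_oom} OOM per decade: about {factor:.1f}x compute per year")
    for remaining in (3, 4):
        years = years_of_headroom(remaining, pace)
        print(f"  {remaining} OOM of headroom lasts about {years:.1f} years")
```

On these assumptions the headroom runs out within a few years at the historical pace, which is the sense in which the summary calls continued rapid unlocks unsustainable.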

3.

"Intelligence explosion" is a misnomer; progress is a broad economic transformation

  • The Industrial Revolution wasn't a "horsepower explosion" but complementary changes across agriculture, transport, law, and finance
  • AI progress isn't solely bottlenecked by cognitive effort; it requires massive compute scaling, data collection, and supply chain upgrades
  • Automating AI R&D is harder than perceived, requiring novel conceptual schemes and long-horizon fuzzy optimization, not just short-term problem-solving

4.

AI's "jagged" capabilities reflect Moravec's paradox, not general intelligence

  • AI excels at tasks hard for humans (chess, complex math, symbolic logic) because evolution optimized humans for sensorimotor skills, not recent abstract reasoning
  • Humans find AI's reasoning impressive due to vast knowledge, not creativity; models haven't invented novel math concepts
  • This "jaggedness" means current AI is superhuman in narrow domains but struggles with general competence, agency, and real-world tasks like playing an unfamiliar video game

5.

AI's economic impact is bottlenecked by capital accumulation and broad deployment

  • The world isn't bottlenecked by a lack of "good reasoning" but by the need for complementary innovations, upgraded supply chains, and demand
  • Technology adoption involves an interplay of invention, capital build-out, learning-by-doing, and market demand (e.g., Edison's light bulb required power plants)
  • Widespread AI deployment is crucial for data collection and efficiency, making a "Shenzhen in the desert" scenario less efficient than integrating with existing global economies

6.

AI firms will leverage replication, alignment, and scale to redefine economic structure

  • AI systems can be copied with all tacit knowledge, enabling "hyper-Jensen" models to manage entire organizations with coherent vision
  • Digital firms gain high-fidelity replication, adding the missing piece for rapid evolutionary selection and variation
  • AI's alignable preferences could eliminate principal-agent problems, fundamentally changing firm structure and incentives

7.

Central planning could become more effective due to AI's unique capabilities

  • AI allows disaggregated information gathering (sensors) and centralized, scaled processing (GPUs), enabling economies of scale in planning
  • Leaders (planning models) can have orders of magnitude more "brainpower" than workers (action models), unlike human organizations
  • AI could weaken traditional arguments against central planning by improving communication bandwidth and aligning incentives

Counter-Argument

8.

Regulation will inevitably slow or halt explosive growth

  • Despite economic and national security incentives, the world has a surprising ability to coordinate on not pursuing certain technologies (e.g., human cloning)
  • A 10-20% chance exists for global regulatory coordination, enforced by sanctions, that could prevent or significantly delay explosive growth
  • This argument rests not on AI's inherent value or threat but on humanity's collective capacity to impose limits on transformative technologies

Steelman

9.

The "unpredictable future" argument is a call for adaptive, decentralized governance

  • Rapid, accelerating change makes long-term predictions and detailed plans futile; flexibility and adaptability are paramount
  • Historical war planning failures (e.g., WWII aerial bombardment casualties) demonstrate the limits of reason without real-world contact and iterative learning
  • In a world of extreme epistemic uncertainty, classical liberalism (decentralized knowledge and decision-making) is the most robust strategy, not centralized control based on brittle worldviews

Full transcript (Deep)

Gist

1.

AI spent nine orders of magnitude of compute to learn coding, chess, and poetry — and has only three or four left before hitting the planet's energy ceiling. The founders of Epoch AI argue AGI won't arrive until 2045, yet 30% annual explosive growth will reshape civilization long before a single machine can do every remote job.

Logic

2.

Nine orders of magnitude bought us chatbots; only three or four remain

  • Since AlexNet, training compute has scaled by roughly nine to ten orders of magnitude (a billion- to ten-billion-fold), unlocking about one major capability every three years: gameplay, language, abstract reasoning
  • Energy infrastructure and GPU production constrain the next scale-up to roughly three or four additional orders of magnitude before a non-trivial fraction of world output goes to data centers alone
  • At the historical rate of one core unlock per three OOM, the remaining budget covers perhaps one more breakthrough — not the five or six needed for full remote-work automation
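
The budget argument in the bullets above reduces to a single division. This is a minimal sketch, assuming the figures quoted in this summary (about nine orders of magnitude spent, one core unlock per three, three to four remaining). The function name is hypothetical, and the constant rate of unlocks per order of magnitude is the authors' extrapolation, not an established law.

```python
def unlocks_affordable(oom_budget: float, oom_per_unlock: float = 3.0) -> float:
    """Capability unlocks a compute budget could fund,
    ASSUMING one unlock per `oom_per_unlock` orders of magnitude."""
    return oom_budget / oom_per_unlock


SPENT_OOM = 9              # summary's figure for compute scaling since AlexNet
REMAINING_OOM = (3, 4)     # summary's estimate of remaining headroom
NEEDED_UNLOCKS = (5, 6)    # summary's estimate for full remote-work automation

past = unlocks_affordable(SPENT_OOM)
low, high = (unlocks_affordable(r) for r in REMAINING_OOM)

print(f"unlocks implied by past scaling: {past:.1f}")
print(f"unlocks affordable from here: {low:.1f} to {high:.1f}")
print(f"unlocks needed: {NEEDED_UNLOCKS[0]} to {NEEDED_UNLOCKS[1]}")
```

The gap between the affordable range and the needed range is what drives the "30 years, not 3" conclusion; changing `oom_per_unlock` shows how sensitive that conclusion is to the assumed rate.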

3.

Moravec's paradox: AI aces the SAT but gets stuck in Mt. Moon for 48 hours

  • Evolutionarily recent skills — multiplying 100-digit numbers, competitive programming, formal logic — were among the first capabilities AI conquered, because evolution never optimized humans for them
  • Claude knows exactly how to navigate Pokémon Red and can coach a human through Mt. Moon, yet when it plays the game itself it loops for two straight days; explicit knowledge fails to translate into situated action
  • The model that tops competitive coding benchmarks (o3-mini-high) is not the one earning enterprise revenue; Claude wins that race, which suggests that benchmark performance and economic usefulness are poorly correlated in AI systems

4.

Software progress tracks hardware at 30% per year — genius alone can't break the leash

  • Across SAT solvers, computer vision, and language modeling, algorithmic efficiency gains have historically matched Moore's law at roughly 30% annually, suggesting hardware paces software
  • The Transformer was designed to exploit parallel compute, FlashAttention optimized memory access, and the Chinchilla scaling laws set data-to-parameter ratios for a given compute budget; each landmark innovation was explicitly motivated by hardware constraints
  • GPU-rich labs, not GPU-poor academia, produce the decisive breakthroughs, and no one in 2010 could have conceived the right 2025 innovations because the experimental context of abundant compute did not yet exist
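
To put the 30% annual figure in more familiar units, it can be converted into a doubling time and a time per order of magnitude. This is a minimal sketch of standard compound-growth arithmetic. The function names are hypothetical, and the 30% rate is taken from the summary as given.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years for a quantity growing at `annual_growth` (0.30 = 30%) to double."""
    return math.log(2) / math.log(1 + annual_growth)


def years_per_oom(annual_growth: float) -> float:
    """Years for the same quantity to grow tenfold (one order of magnitude)."""
    return math.log(10) / math.log(1 + annual_growth)


RATE = 0.30  # summary's figure for both hardware and algorithmic efficiency gains

print(f"doubling time: {doubling_time_years(RATE):.1f} years")
print(f"time per order of magnitude: {years_per_oom(RATE):.1f} years")
```

If hardware and software each improve at this rate and the gains multiply, the combined rate is 1.3 × 1.3 − 1 = 69% per year; the same functions accept that value.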

5.

An H100 pays for itself in under a year, provided you also build everything around it

  • A single H100 costs about $30,000 and performs roughly 10^15 FLOP/s, the estimated computational throughput of the human brain; deployed as a remote worker earning $50–100K per year, it recoups its purchase price in roughly four to seven months
  • But the Industrial Revolution was not a "horsepower explosion" — it required simultaneous upgrades across agriculture, transportation, law, finance, and urbanization before raw power translated into sustained growth
  • AI enables high-fidelity replication of tacit knowledge: an entire engineering team's organizational intelligence can be copied without limit, dissolving the principal-agent problem and the culture-dilution bottleneck that limit human firms
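
The payback claim in the first bullet above is simple division. This is a minimal sketch, assuming the summary's figures ($30,000 per H100, $50–100K of annual labor value). The function name is hypothetical, and the calculation deliberately ignores power, hosting, utilization, and model costs, which is the "everything around it" caveat in the heading.

```python
def payback_months(capex_usd: float, annual_value_usd: float) -> float:
    """Months for hardware to earn back its purchase price,
    ASSUMING it captures `annual_value_usd` per year with no other costs."""
    return 12 * capex_usd / annual_value_usd


H100_COST_USD = 30_000            # summary's figure
WAGE_RANGE_USD = (50_000, 100_000)  # summary's remote-worker wage range

for wage in WAGE_RANGE_USD:
    months = payback_months(H100_COST_USD, wage)
    print(f"at ${wage:,} per year: payback in about {months:.1f} months")
```

Adding an operating-cost term to the denominator (annual value minus annual running cost) would be the natural next refinement, but the summary gives no figures for it.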

6.

"Shenzhen in the desert" loses to plugging into Alibaba

  • Training state-of-the-art models relied on 30 years of internet data produced by the entire human economy; recreating that corpus from scratch in an isolated enclave is orders of magnitude less efficient than leveraging existing supply chains and consumer deployment
  • The literature on conflict shows that war requires more than value misalignment — it requires asymmetric information about relative strength or sacred commitments that preclude negotiation; AI systems optimizing for efficiency face neither precondition
  • Just as during the Industrial Revolution some jurisdictions grew faster than others based on regulatory and cultural fit, heterogeneity across countries will determine adoption speed — but only one permissive jurisdiction is needed for explosive growth to ignite

7.

Lock-in is a fantasy — the British expected three weeks of apocalypse and got six years of attrition

  • Before WWII, British planners forecast hundreds of thousands of bombing casualties in the first weeks; actual casualties over the full six years of war were lower than that opening-weeks forecast, an error of roughly two orders of magnitude in the casualty rate, driven by ignoring practical frictions like nighttime inaccuracy, firefighting, and aircraft attrition rates
  • Digital information decays through link rot far faster than analog records; the medium that supposedly enables permanent value preservation actually accelerates cultural drift
  • Slavery ended not primarily because abolitionists won a moral argument but because the Industrial Revolution made coerced labor economically inefficient — Russia abolished serfdom in the 1860s without British pressure, driven by the same structural incentives

8.

Discount the far future not morally, but epistemically — your leverage is near-zero at cosmic scale

  • Each year of delay in broad AI deployment costs roughly 100–150 million lives that would otherwise be saved by accelerated healthcare, energy, and material abundance
  • Pausing compute scaling to "solve alignment first" recapitulates the 2016 problem: alignment researchers with 2016 compute budgets would have discovered none of the techniques that matter today, because the relevant questions only emerge at frontier scale
  • The correct posture is not detailed planning for a world you cannot predict, but institutional flexibility — the capacity to adapt as the actual shape of explosive growth reveals itself through deployment, data, and iteration

Counter-Argument

9.

Task-length is doubling every seven months — the compute ceiling may never bind

  • METR evaluations show AI models doubling coherent task duration on a seven-month cycle; naive extrapolation reaches month-long autonomous tasks by 2028 and year-long tasks by 2030, well inside the authors' "three decades away" window
  • A top AI researcher reports current models already save him 24–36 hours per week on unfamiliar domains — without any additional compute — directly falsifying the claim that cognitive gains require proportional hardware scaling
  • GPT-4's inference cost dropped 100x in 18 months to reach GPT-4o; if software efficiency compounds this fast, the "only 3–4 OOM left" constraint dissolves because each OOM of physical compute buys dramatically more effective compute than the last
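
Both extrapolations in the counter-argument are exponential-growth calculations, and their conclusions depend heavily on the starting point. This is a minimal sketch, assuming the seven-month doubling and the 100x-in-18-months figures quoted above. The function names, the starting task horizon, and the hour counts for a work-month and work-year are illustrative assumptions, not values from the source.

```python
import math


def months_to_reach(target_hours: float, start_hours: float,
                    doubling_months: float = 7.0) -> float:
    """Months for the task horizon to grow from start_hours to target_hours,
    ASSUMING it doubles every `doubling_months` months."""
    doublings = math.log2(target_hours / start_hours)
    return doublings * doubling_months


def oom_per_year(cost_drop_factor: float, months: float) -> float:
    """Orders of magnitude of cost reduction per year implied by a drop
    of `cost_drop_factor` over `months` months."""
    return math.log10(cost_drop_factor) / (months / 12.0)


START_HOURS = 1.0          # ASSUMPTION: placeholder for today's task horizon
WORK_MONTH_HOURS = 160.0   # ASSUMPTION: one month of full-time work
WORK_YEAR_HOURS = 2000.0   # ASSUMPTION: one year of full-time work

for label, hours in (("month-long", WORK_MONTH_HOURS), ("year-long", WORK_YEAR_HOURS)):
    years = months_to_reach(hours, START_HOURS) / 12
    print(f"{label} tasks: about {years:.1f} years from the starting point")

print(f"inference cost trend: {oom_per_year(100, 18):.2f} OOM per year")
```

Because the time to reach a target grows only logarithmically with the target, the output is far more sensitive to `doubling_months` than to the exact hour counts, which is why the durability of the seven-month trend is the crux of the disagreement.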

Steelman

10.

The timeline debate is a distraction — the thesis already won on the ground no one is watching

  • Both the original argument and its critics share an unstated assumption: that the date of "AGI" is the variable that matters, and everything follows from whether it arrives in 2027 or 2045
  • But the actual mechanism the authors describe — simultaneous capital accumulation, labor substitution, and tacit-knowledge replication — is already underway at sub-AGI capability levels; China's 10% growth decades required no general intelligence, just complementary scaling of inputs that were previously bottlenecked
  • The real insight is not when machines become generally intelligent but when the economy stops being bottlenecked by human population growth — and that transition does not require a single system that passes every test, only enough narrow systems deployed broadly enough to shift the binding constraint from labor to capital
