Scaling AI Is a Physical Problem

The internet taught people the wrong intuition: that scaling is mostly software.

You ship code, spin up instances, buy ads, watch graphs move. The constraints feel abstract—latency, bandwidth, product-market fit, the willingness of users to click.

AI scaling is different. It is an industrial build. The bottleneck isn’t imagination. It’s power, heat, supply chains, and the calendar.

The stack, from atoms to model weights

The model is the visible tip. The work is everything beneath it:

  • Power: generation, transmission, and the right substation in the right place
  • Interconnect: permission to draw megawatts from the grid, plus the upgrades that make it physically possible
  • Transformers & switchgear: the heavy electrical gear that turns “available power” into usable power on-site
  • Cooling: air, water, and the physics of moving heat out of dense rooms
  • Compute: GPUs/accelerators, networking, memory, racks, spares
  • Construction: permitting, land, concrete, steel, schedule risk
  • Ops: uptime, staffing, security, incident response, maintenance windows

If you want an “AI thesis,” start here. “Compute” is not a number in a spreadsheet. It is an infrastructure project.

What hyperscalers are actually buying

Not GPUs. Time.

They’re buying:

  • time to permit and build
  • time to secure a grid interconnect
  • time for transformers, breakers, switchgear, and cabling to arrive
  • time to commission, harden, and operate the site

This is why a single constraint can dominate an entire year: the thing you can’t expedite is the thing you end up funding.
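A toy schedule model makes the point concrete. This is a minimal sketch, not a real project plan: every duration below is a hypothetical placeholder, and the model assumes the workstreams run in parallel with commissioning starting only after the slowest one finishes.

```python
# Toy schedule model. All durations are hypothetical placeholders (in months),
# not data from any real project.
PARALLEL_WORKSTREAMS = {
    "permit_and_build": 24,
    "grid_interconnect": 48,
    "electrical_gear_delivery": 36,
}
COMMISSIONING_MONTHS = 6  # assumed to start only once every workstream above is done


def time_to_operate(workstreams: dict, commissioning: int) -> tuple:
    """Return (total months, name of the gating workstream)."""
    gating = max(workstreams, key=workstreams.get)
    return workstreams[gating] + commissioning, gating


baseline = time_to_operate(PARALLEL_WORKSTREAMS, COMMISSIONING_MONTHS)

# Spend money to halve a workstream that is NOT the gate: the date does not move.
faster_build = dict(PARALLEL_WORKSTREAMS, permit_and_build=12)
after_spend = time_to_operate(faster_build, COMMISSIONING_MONTHS)

print(baseline)
print(after_spend)
```

Under these assumed numbers, halving the construction schedule changes nothing, because the interconnect still sets the date. Only shortening the gating item moves the finish line.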

The key asymmetry

Software scales in days.

Industrial capacity scales in years.

That mismatch creates the investment surface area: who gets paid while the bottleneck clears, and who gets stuck holding the schedule.

The tell is that “capacity” is less a function of model architecture than of project timelines. The calendar is the scarce resource.

Bottleneck 1: grid interconnect (permission to exist)

The popular story is “we need more GPUs.” But a GPU is inert until it’s attached to a place that can deliver power continuously.

Interconnection queues are the clearest public artifact of this reality. In the U.S., interconnection requests total thousands of gigawatts, a volume that dwarfs what can plausibly be built quickly and a reminder that the pipeline is constrained long before anyone orders a rack.[1]

The crucial nuance: queues are not forecasts. Many projects withdraw. But even discounted for withdrawals, the queues carry one clear signal: demand arrives faster than the system can study, approve, and upgrade itself.
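The gap between queue volume and eventual build can be sketched with simple arithmetic. The queue figures come from footnote 1; the completion rates are hypothetical, chosen only to show how the total scales, and are not taken from any source.

```python
# Queue volumes in GW, from footnote 1 (LBNL, data through end of 2024).
QUEUED_GENERATION_GW = 1400
QUEUED_STORAGE_GW = 890

# Hypothetical completion rates in percent. These are assumptions for
# illustration, NOT figures from the source.
ASSUMED_COMPLETION_PCT = [10, 20, 50]


def built_gw(queued_gw: float, completion_pct: float) -> float:
    """Capacity that would reach operation at a given completion rate."""
    return queued_gw * completion_pct / 100


total_queued_gw = QUEUED_GENERATION_GW + QUEUED_STORAGE_GW

for pct in ASSUMED_COMPLETION_PCT:
    print(f"{pct}% completion -> {built_gw(total_queued_gw, pct):.0f} GW built")
```

The output is a sensitivity table, not a forecast: it shows how strongly any estimate of future capacity depends on the withdrawal rate you assume.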

AI inherits that pace. If your deployment depends on new capacity, you are now competing with everything else that wants a grid connection: factories, EV charging, renewables, storage, and population growth.

Bottleneck 2: transformers & switchgear (the grid’s “lead time tax”)

A data center is a machine that turns electricity into heat. The parts that let it do that at scale are not exotic—they are industrial, standardized, and scarce.

Transformers are the canonical example: boring, huge, and gating. A utility or developer can order one and wait 2 to 4 years for delivery in today’s market, where “months” used to be normal.[2] Switchgear and breakers show up in the same story: long lead times, limited capacity, and supply chains that don’t flex on demand.

This is one of the most important mental flips:

  • The constraint is not “can we afford it?”
  • The constraint is “can we get in line early enough?”

Capital can accelerate some things. It cannot compress a multi-year manufacturing queue without building more manufacturing.
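“Getting in line early enough” is just date arithmetic. The sketch below uses the 80 to 210 week lead-time range cited in footnote 2; the target energization date is a hypothetical placeholder, and the calculation ignores shipping, installation, and testing time.

```python
from datetime import date, timedelta

# Lead-time range for large transformers, in weeks, as cited in footnote 2.
LEAD_TIME_WEEKS = (80, 210)

# Hypothetical target date for energizing a site. An assumption, not from the source.
TARGET_ENERGIZATION = date(2028, 1, 1)


def latest_order_date(target: date, lead_time_weeks: int) -> date:
    """Latest date an order can be placed and still arrive by the target."""
    return target - timedelta(weeks=lead_time_weeks)


for weeks in LEAD_TIME_WEEKS:
    order_by = latest_order_date(TARGET_ENERGIZATION, weeks)
    print(f"{weeks}-week lead time: order by {order_by.isoformat()}")
```

At the long end of the range, the order date for an early-2028 site falls in late 2023, which means the decision had to be made before most of the demand it serves was visible.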

Bottleneck 3: metals (rare earths are a decoy)

“Rare earths” are the headline because the phrase sounds like science fiction. But for power infrastructure, the dull materials are often more binding:

  • Copper: conductors, windings, busbars, grounding, and the miles of cabling that turn a site into a system
  • Aluminum: transmission and distribution conductors at scale
  • Steel: towers, rebar, frames, enclosures, buildings, and everything that holds weight

If AI is an industrial build, it competes in the same commodity markets as the rest of the energy transition.

You can’t scale a megawatt-class system without a lot of metal. The bottleneck is rarely one magical element; it’s the aggregate friction of many ordinary ones.

Bottleneck 4: cooling & siting (heat has geography)

Compute density is heat density. You can move heat with air or water, but either way you’re constrained by local realities: climate, water rights, permitting, and what the community will tolerate.

That constraint shows up as policy, not physics equations. Singapore’s 2019 “temporary pause” on new data centers is a clean example: authorities explicitly framed data centers as intensive users of electricity and water, and slowed approvals while they worked out sustainability constraints.[3]

In Ireland, grid constraints became explicit policy. Regulators note data centers rising from 5% to 21% of national electricity demand (2015 → 2023) and are building connection rules around local constraints and requirements for matching generation/storage.[4]
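The Irish figures imply a steep compounding rate. A minimal sketch, assuming the share grew smoothly between the two endpoints (the source gives only the endpoints, so the smooth path is an assumption):

```python
def implied_annual_growth(start_share: float, end_share: float, years: int) -> float:
    """Constant annual growth rate that takes start_share to end_share."""
    return (end_share / start_share) ** (1 / years) - 1


# Shares of national electricity demand from footnote 4: 5% in 2015, 21% in 2023.
rate = implied_annual_growth(0.05, 0.21, 2023 - 2015)
print(f"Implied growth in data-center share: {rate:.1%} per year")
```

That works out to a little under 20% a year. Note that this is growth in the share of demand, not in absolute consumption, and it says nothing about whether the pace continues.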

The pattern is the point: at scale, heat is political. It gets negotiated.

Training vs inference: two different constraint profiles

It’s tempting to talk about “AI compute” as a single resource. In practice there are at least two different businesses wearing one label:

  • Training is throughput-bound. You want enormous bursts of energy over weeks or months, and you can choose location more freely because latency to end users is irrelevant.
  • Inference is service-bound. You want predictable, continuous power, tight latency, high uptime, and geographic distribution (close to users, close to networks, close to demand).

Efficiency improvements matter in both. But efficiency does not abolish build cycles.

If anything, efficiency changes the shape of the constraint: it might reduce megawatts per unit of output while increasing total demand by making AI cheaper and more widely used. The bottleneck stays physical; it just moves around.
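The arithmetic behind that shift is simple. In the sketch below, every number is hypothetical: the baseline load, the efficiency gain, and the demand growth are placeholders chosen to show the two possible outcomes, not estimates of anything.

```python
def total_power_mw(baseline_mw: float, efficiency_gain: float, demand_growth: float) -> float:
    """Power draw after an efficiency gain and a change in demand.

    efficiency_gain: output per megawatt relative to baseline (2.0 = twice as efficient)
    demand_growth:   output demanded relative to baseline (3.0 = three times as much)
    """
    return baseline_mw * demand_growth / efficiency_gain


BASELINE_MW = 100.0  # hypothetical site load

# Efficiency doubles, but cheaper AI triples demand: net draw rises.
print(total_power_mw(BASELINE_MW, efficiency_gain=2.0, demand_growth=3.0))

# Efficiency doubles and demand grows only 1.5x: net draw falls.
print(total_power_mw(BASELINE_MW, efficiency_gain=2.0, demand_growth=1.5))
```

Which case applies depends on whether demand grows faster than efficiency improves. The sketch does not settle that question; it only shows that an efficiency gain alone does not determine the direction of total load.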

So what are hyperscalers buying?

They’re buying the right to build a heat-producing city:

  • land with a path to power
  • interconnect approvals
  • transformer allocations
  • cooling and water plans that clear regulators
  • construction capacity and a schedule that doesn’t slip
  • operational maturity to keep it running

The GPU line item is a dependency inside a larger dependency chain.

A falsifiable version of the thesis

If you want this to be real, pick a claim you can lose.

Here are three you can actually watch:

  1. Transformer lead times will remain a first-order constraint on new AI capacity through 2028. If lead times return to “months,” this claim weakens fast.[2]

  2. Interconnect timelines, not model architectures, will dominate deployment schedules for new build capacity. If interconnect queues clear and median time-to-operation collapses, the bottleneck changes.[1]

  3. Cooling and siting constraints will push inference growth toward “where power is” rather than “where users are,” until network, policy, and product requirements force a re-balance. If inference stays purely metro-centric without friction, this is wrong.

Closing

The cleanest way to think about “scaling AI” is: building cities whose only industry is heat.

The story is not that software got less important. It’s that the limiting factor moved down the stack—into power, metal, and time.

If you want to understand who wins, stop watching model demos. Watch lead times.


Footnotes

  1. Lawrence Berkeley National Laboratory, Queued Up: 2025 Edition (data through end of 2024): ~10,300 projects actively seeking grid interconnection, representing ~1,400 GW generation and ~890 GW storage; median queue duration for built projects in available regions rose to >4 years for 2018–2024 vintages. https://doi.org/10.2172/3008763

  2. National Infrastructure Advisory Council (CISA), Addressing the Critical Shortage of Power Transformers (June 2024): utilities/developers may wait 2–4 years for transformer delivery; large transformer lead times cited as 80–210 weeks. https://www.cisa.gov/sites/default/files/2024-09/NIAC_Addressing%20the%20Critical%20Shortage%20of%20Power%20Transformers%20to%20Ensure%20Reliability%20of%20the%20U.S.%20Grid_Report_06112024_508c_pdf_0.pdf

  3. Channel NewsAsia (May 10, 2021), on Singapore’s “temporary pause” on new data centers: framed as intensive users of electricity and water, with the decision communicated in 2019. https://www.channelnewsasia.com/business/new-data-centres-singapore-temporary-pause-climate-change-1355246

  4. Ireland’s Commission for Regulation of Utilities (Feb 18, 2025), proposed decision on new electricity connection policy for data centres: notes 5% → 21% of national electricity demand (2015 → 2023) and adds requirements tied to constrained regions and matching generation/storage. https://www.cru.ie/about-us/news/new-electricity-connection-policy-for-data-centre/