January 14, 2026 · 7 min read · Updated May 19, 2026

What is a Neocloud?

The category of GPU-native AI clouds — CoreWeave, Lambda, Crusoe, Nebius — explained.

If you've read a SemiAnalysis report, a Morgan Stanley AI infrastructure note, or an Information piece on CoreWeave in the last year, you've seen the term. "Neocloud" is the label analysts have settled on for the new generation of GPU-native cloud providers — and it's now used the same way "hyperscaler" was a decade ago.

The one-sentence definition

A neocloud is a cloud provider whose entire stack — silicon procurement, data-center design, network fabric, scheduler, and pricing model — is purpose-built for GPU compute, AI training, and inference.

That contrasts with a hyperscaler (AWS, Azure, GCP), where GPU instances are one bolt-on product line on top of a general-purpose cloud designed in the 2000s for web apps, databases, and object storage.

The four traits that make a cloud a neocloud

1. GPU-first capacity planning. Neoclouds buy H100/H200/B200/GB200 in 5,000–100,000-unit blocks directly from NVIDIA, often 12–18 months ahead.
2. AI-shaped network fabric. 3.2 Tbps non-blocking InfiniBand or NVLink fabrics tuned for all-reduce, not for object storage replication.
3. Single-tenant power density. 80–130 kW per rack, only possible with direct-to-chip liquid cooling, vs. ~12 kW typical for a generalist DC.
4. Per-GPU-hour billing. No abstraction layer of "vCPU + RAM + EBS": you rent an H100 by the hour, the day, or the reserved year.
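
To make traits 3 and 4 concrete, here is a minimal Python sketch of the arithmetic behind them. The rack power figures come from the list above; the hourly rate, GPU count, and function names are hypothetical placeholders chosen for illustration, not quotes from any provider.

```python
# Rack power figures quoted in the list above (kW per rack).
NEOCLOUD_RACK_KW = (80, 130)
GENERALIST_RACK_KW = 12


def density_ratio(neocloud_kw: float, generalist_kw: float = GENERALIST_RACK_KW) -> float:
    """How many times more power a neocloud rack draws than a generalist rack."""
    return neocloud_kw / generalist_kw


def gpu_hour_cost(num_gpus: int, hours: float, rate_per_gpu_hour: float) -> float:
    """Per-GPU-hour billing: the bill is just GPUs x hours x hourly rate."""
    return num_gpus * hours * rate_per_gpu_hour


low, high = (density_ratio(kw) for kw in NEOCLOUD_RACK_KW)
print(f"Rack density: {low:.1f}x to {high:.1f}x a generalist rack")

# Hypothetical example: 64 GPUs for 24 hours at an assumed $2.50 per GPU-hour.
print(f"Example bill: ${gpu_hour_cost(64, 24, 2.50):,.2f}")
```

With the quoted figures, the density gap works out to roughly 7x to 11x. The bill has one line item, which is the point of the pricing model.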

Where the term came from

Dylan Patel of SemiAnalysis is widely credited with popularising "neocloud" in 2023, when CoreWeave's revenue ramp made it impossible to keep calling the category "GPU rental." Morgan Stanley's AI infra team picked the term up in early 2024, and by mid-2024 Goldman Sachs, JPMorgan, and Bernstein had all standardised on it in research notes.

The 2025 CoreWeave IPO crystallised it. The S-1 didn't use the word, but every analyst note that priced the deal did. "Neocloud" became the bucket the public market used to value the company.

Who counts as a neocloud

Drawing the line is contested, but most analyst lists include some subset of the following:

Company	HQ	Status	Notes
CoreWeave	Livingston, NJ	Public (CRWV)	First neocloud IPO, March 2025
Nebius	Amsterdam	Public (NBIS)	Spun out of Yandex N.V.
Lambda Labs	San Francisco	Late-stage private	$1.5B Series C, 2025
Crusoe Energy	Denver	Late-stage private	Flared-gas-powered DCs
Together AI	San Francisco	Late-stage private	Inference-leaning
Fluidstack	London	Private	European focus
Vast.ai	Phoenix	Private	Marketplace model
RunPod	Remote	Private	Developer-first
Paperspace	Brooklyn	Acquired by DigitalOcean	Now PSPC line
Voltage Park	San Francisco	Private / non-profit	10K H100s, research-focused

Why neoclouds exist at all

Three things broke the hyperscaler model for AI workloads at once:

Allocation. NVIDIA prioritises strategic customers; neoclouds got allocations hyperscalers couldn't, because they committed earlier and more narrowly.
Networking. Standard cloud networking can't sustain the all-reduce patterns of distributed training without a penalty: each training step takes an effective 30–50% longer.
Pricing. Hyperscaler GPU instances are bundled with margins on EBS, NAT gateways, egress, and idle vCPU. Neoclouds strip that out — and on a per-token-trained basis they are typically 40–70% cheaper.
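
The networking and pricing points compound, and a short Python sketch shows how. It assumes the 30–50% figure means each training step takes that much longer on standard cloud networking. The 0.70 price ratio (a neocloud GPU-hour costing 70% of the hyperscaler one) is a hypothetical input, not a figure from this article, and the function names are illustrative.

```python
def relative_throughput(step_time_tax: float) -> float:
    """Tokens per GPU-hour relative to an untaxed fabric, if steps take (1 + tax) times longer."""
    return 1.0 / (1.0 + step_time_tax)


def per_token_savings(price_ratio: float, step_time_tax: float) -> float:
    """Fractional saving per token trained on the neocloud.

    price_ratio: neocloud GPU-hour price divided by hyperscaler GPU-hour price (assumed).
    step_time_tax: fractional slowdown of each step on the hyperscaler network.
    """
    return 1.0 - price_ratio / (1.0 + step_time_tax)


for tax in (0.30, 0.50):
    print(
        f"tax={tax:.0%}: "
        f"throughput={relative_throughput(tax):.1%}, "
        f"savings={per_token_savings(0.70, tax):.1%}"
    )
```

Under these assumptions, a 30% lower sticker price becomes a per-token saving of roughly 46% to 53% once the step-time penalty is included. That is how a modest hourly price gap can land inside the range quoted above.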

How big is the category?

Neocloud revenue went from a rounding error in 2022 to an estimated $24–28B in 2025, with consensus models projecting $80–120B by 2027 and $240B+ by 2030. Most of that growth comes from inference, not training — which is why the next wave of neoclouds is going to be regional and latency-optimised, not just hyperscale.
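
For a sense of what those projections imply, the compound annual growth rate (CAGR) can be worked out from the ranges above. This is a rough sketch: it pairs the ends of each range to bound the growth rate, treats "$240B+" as exactly $240B, and uses an illustrative `cagr` helper.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values over a number of years."""
    return (end / start) ** (1 / years) - 1


# Revenue estimates quoted above, in billions of USD.
REV_2025 = (24, 28)
REV_2027 = (80, 120)
REV_2030_FLOOR = 240  # "$240B+" treated as exactly 240 for this sketch

# Slowest implied growth: high 2025 base to low 2027 target, and the reverse for fastest.
slow_2027 = cagr(REV_2025[1], REV_2027[0], 2)
fast_2027 = cagr(REV_2025[0], REV_2027[1], 2)
print(f"2025 to 2027: {slow_2027:.0%} to {fast_2027:.0%} per year")

slow_2030 = cagr(REV_2025[1], REV_2030_FLOOR, 5)
fast_2030 = cagr(REV_2025[0], REV_2030_FLOOR, 5)
print(f"2025 to 2030: at least {slow_2030:.0%} to {fast_2030:.0%} per year")
```

On these inputs, the 2027 projection implies roughly 69% to 124% annual growth, and the 2030 floor implies somewhat over 50% a year sustained for five years.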

Are neoclouds going to last?

There are two ways the bear case goes. One: hyperscalers catch up on networking and dump capacity. Two: GPU prices collapse and the rental margin disappears. Both are real. But there's a structural reason the bear case has been wrong for three years running — every generation of frontier model has needed more compute than the last, and every generation of GPU silicon has needed more bespoke deployment than the last. That's a fundamental fit with the neocloud thesis.