What is a Neocloud?
The category of GPU-native AI clouds — CoreWeave, Lambda, Crusoe, Nebius — explained.
If you've read a SemiAnalysis report, a Morgan Stanley AI infrastructure note, or a piece on CoreWeave in The Information in the last year, you've seen the term. "Neocloud" is the label analysts have settled on for the new generation of GPU-native cloud providers, and it's now used the same way "hyperscaler" was a decade ago.
The one-sentence definition
A neocloud is a cloud provider whose entire stack — silicon procurement, data-center design, network fabric, scheduler, and pricing model — is purpose-built for GPU compute, AI training, and inference.
That contrasts with a hyperscaler (AWS, Azure, GCP), where GPU instances are one bolt-on product line on top of a general-purpose cloud designed in the 2000s for web apps, databases, and object storage.
The four traits that make a cloud a neocloud
1. GPU-first capacity planning. Neoclouds buy H100/H200/B200/GB200 in 5,000–100,000-unit blocks directly from NVIDIA, often 12–18 months ahead.
2. AI-shaped network fabric. 3.2 Tbps non-blocking InfiniBand or NVLink fabrics tuned for all-reduce, not for object storage replication.
3. Single-tenant power density. 80–130 kW per rack, vs. ~12 kW typical for a generalist DC, which is only possible with direct-to-chip liquid cooling.
4. Per-GPU-hour billing. No abstraction layer of "vCPU + RAM + EBS": you rent an H100 by the hour, the day, or the reserved year.
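Traits 3 and 4 reduce to simple arithmetic. As a rough sketch, the snippet below counts the racks needed for a given IT load at the densities quoted above, then prices a rental billed purely per GPU-hour. The 10 MW load, the 64-GPU job, and the $2.50 hourly rate are hypothetical figures chosen for illustration, not quotes from any provider.

```python
import math


def racks_needed(it_load_kw: float, kw_per_rack: float) -> int:
    """Racks required to host an IT load at a given per-rack power density."""
    return math.ceil(it_load_kw / kw_per_rack)


def rental_cost(gpus: int, hours: float, rate_per_gpu_hour: float) -> float:
    """Per-GPU-hour billing: no separate vCPU, RAM, or storage line items."""
    return gpus * hours * rate_per_gpu_hour


# Hypothetical 10 MW IT load at the per-rack densities from the list above.
IT_LOAD_KW = 10_000
DENSITIES_KW = [
    ("generalist DC, ~12 kW", 12),
    ("neocloud, 80 kW", 80),
    ("neocloud, 130 kW", 130),
]
for label, density_kw in DENSITIES_KW:
    print(f"{label}: {racks_needed(IT_LOAD_KW, density_kw)} racks")

# Hypothetical job: 64 GPUs for 30 days at an assumed $2.50 per GPU-hour.
print(f"30-day rental: ${rental_cost(64, 24 * 30, 2.50):,.0f}")
```

The same load that fills several hundred racks at generalist density fits in well under two hundred at neocloud density, and the bill has exactly one variable to negotiate: the hourly rate.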
Where the term came from
Dylan Patel of SemiAnalysis is widely credited with popularising "neocloud" in 2023, when CoreWeave's revenue ramp made it impossible to keep calling the category "GPU rental." Morgan Stanley's AI infra team picked the term up in early 2024, and by mid-2024 Goldman Sachs, JPMorgan, and Bernstein had all standardised on it in research notes.
The 2025 CoreWeave IPO crystallised it. The S-1 didn't use the word, but every analyst note that priced the deal did. "Neocloud" became the bucket the public market used to value the company.
Who counts as a neocloud
Drawing the line is contested, but most analyst lists include some subset of the following:
| Company | HQ | Status | Notes |
|---|---|---|---|
| CoreWeave | Livingston, NJ | Public (CRWV) | First neocloud IPO, March 2025 |
| Nebius | Amsterdam | Public (NBIS) | Formerly Yandex N.V. |
| Lambda Labs | San Francisco | Late-stage private | $1.5B Series C, 2025 |
| Crusoe Energy | Denver | Late-stage private | Flared-gas-powered DCs |
| Together AI | San Francisco | Late-stage private | Inference-leaning |
| Fluidstack | London | Private | European focus |
| Vast.ai | Phoenix | Private | Marketplace model |
| RunPod | Remote | Private | Developer-first |
| Paperspace | Brooklyn | Acquired by DigitalOcean | Now PSPC line |
| Voltage Park | San Francisco | Private / non-profit | 10K H100s, research-focused |
Why neoclouds exist at all
Three things broke the hyperscaler model for AI workloads at once:
- Allocation. NVIDIA prioritises strategic customers; neoclouds got allocations hyperscalers couldn't, because they committed earlier and more narrowly.
- Networking. Standard cloud networking can't sustain the all-reduce patterns of distributed training without paying a 30–50% effective tax in step time.
- Pricing. Hyperscaler GPU instances are bundled with margins on EBS, NAT gateways, egress, and idle vCPU. Neoclouds strip that out — and on a per-token-trained basis they are typically 40–70% cheaper.
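The networking and pricing points compound, which a small calculation makes concrete. The sketch below assumes the "tax" means step time stretches by that fraction, so billed GPU-hours stretch with it. The hourly rates ($4.00 and $2.50) and the 10,000 ideal GPU-hours are hypothetical placeholders, not published prices.

```python
def job_cost(ideal_gpu_hours: float, rate_per_gpu_hour: float,
             network_tax: float = 0.0) -> float:
    """Cost of a training job when network overhead stretches step time.

    network_tax is the assumed fractional increase in step time
    (0.4 means each step takes 40% longer than on an ideal fabric).
    """
    return ideal_gpu_hours * (1.0 + network_tax) * rate_per_gpu_hour


def savings(baseline_cost: float, alternative_cost: float) -> float:
    """Fraction saved by choosing the alternative over the baseline."""
    return 1.0 - alternative_cost / baseline_cost


IDEAL_GPU_HOURS = 10_000  # hypothetical job size

# Hypothetical rates; the 40% tax sits inside the 30-50% range quoted above.
general_cloud = job_cost(IDEAL_GPU_HOURS, rate_per_gpu_hour=4.00, network_tax=0.40)
gpu_native = job_cost(IDEAL_GPU_HOURS, rate_per_gpu_hour=2.50)

print(f"general-purpose cloud: ${general_cloud:,.0f}")
print(f"GPU-native cloud:      ${gpu_native:,.0f}")
print(f"savings:               {savings(general_cloud, gpu_native):.0%}")
```

Under these assumed inputs the saving comes out at roughly 55%, inside the 40–70% band quoted above. Different rates or a different tax will move it, so treat the function as a template for your own numbers.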
How big is the category?
Neocloud revenue went from a rounding error in 2022 to an estimated $24–28B in 2025, with consensus models projecting $80–120B by 2027 and $240B+ by 2030. Most of that growth comes from inference, not training — which is why the next wave of neoclouds is going to be regional and latency-optimised, not just hyperscale.
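To see what those projections imply year over year, here is a minimal sketch that converts them into compound annual growth rates. It uses only the revenue ranges quoted above; pairing the range endpoints to get a slowest and fastest case is our own simplification.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by two values `years` apart."""
    return (end / start) ** (1.0 / years) - 1.0


# Revenue figures in $B, taken from the estimates quoted above.
REV_2025 = (24.0, 28.0)
REV_2027 = (80.0, 120.0)
REV_2030_FLOOR = 240.0

# Slowest case pairs the high 2025 estimate with the low 2027 projection;
# the fastest case does the reverse.
slow = cagr(REV_2025[1], REV_2027[0], years=2)
fast = cagr(REV_2025[0], REV_2027[1], years=2)
print(f"2025 to 2027 implied CAGR: {slow:.0%} to {fast:.0%}")

# 2030 is quoted as a floor ("$240B+"), so starting from the high 2025
# estimate gives a lower bound on the implied growth rate.
floor_2030 = cagr(REV_2025[1], REV_2030_FLOOR, years=5)
print(f"2025 to 2030 implied CAGR: at least {floor_2030:.0%}")
```

The near-term range works out to roughly 69% to 124% a year, while the 2030 floor implies a slower long-run rate, which is consistent with a category that is expected to keep growing but decelerate.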
Are neoclouds going to last?
The bear case comes in two forms. One: hyperscalers catch up on networking and dump capacity. Two: GPU prices collapse and the rental margin disappears. Both risks are real. But there's a structural reason the bear case has been wrong for three years running: every generation of frontier model has needed more compute than the last, and every generation of GPU silicon has needed more bespoke deployment than the last. Both trends play directly to the neocloud thesis.
Further reading
If you want a deeper dive: see our breakdown of the top neocloud companies, our piece on how neoclouds are actually beating hyperscalers on unit economics, and our analysis of where H100 hourly pricing is going next.