Neoclouds vs Hyperscalers
Side-by-side on price, networking, allocation, and AI-specific tooling.
Pull any 2025 model-card disclosure where the lab actually names where the training run happened, and you'll see the same pattern — frontier work increasingly runs on neoclouds, while hyperscalers get the inference traffic and the enterprise compliance workloads. Here's why.
The headline numbers
| Metric | Hyperscaler avg. | Neocloud avg. | Spread |
|---|---|---|---|
| H100 on-demand $/hr | $5.40 | $1.92 | −64% |
| H100 1-yr reserved $/hr | $3.10 | $1.18 | −62% |
| Egress per TB to internet | $45–90 | $0–8 | ≈ free |
| Inter-node fabric BW | 200–400 Gbps EFA | 3.2 Tbps IB | 8–16× |
| Provision time (1K GPUs) | 8–14 weeks queue | 1–4 weeks | −70% |
Note that this isn't apples-to-apples — hyperscalers bundle services, neoclouds don't. But for the specific workload of "train a large model on N thousand GPUs and ship the weights to me," the gap is real and well-documented.
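The Spread column can be recomputed directly from the two averages. The following is a minimal sketch that uses only the table's own figures; the helper name `spread_pct` is ours, not something from a provider's tooling.

```python
def spread_pct(hyperscaler: float, neocloud: float) -> float:
    """Relative difference of the neocloud figure versus the hyperscaler figure, in percent."""
    return (neocloud - hyperscaler) / hyperscaler * 100.0


# H100 $/hr averages taken from the table above.
on_demand = spread_pct(5.40, 1.92)
reserved = spread_pct(3.10, 1.18)

# Fabric ratio: 3.2 Tbps expressed in Gbps, against the 200-400 Gbps range.
fabric_low = 3200 / 400
fabric_high = 3200 / 200

print(f"on-demand spread: {on_demand:.0f}%")
print(f"reserved spread:  {reserved:.0f}%")
print(f"fabric ratio:     {fabric_low:.0f}x to {fabric_high:.0f}x")
```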
Why the gap exists
1. Bill-of-materials, not opex.
Hyperscalers depreciate GPUs over 5 years and price the rental on a return-on-invested-capital model that has to clear their corporate hurdle rate (~15–20%). Neoclouds frequently depreciate over 4 years and target much lower returns, because their alternative is not deploying the capital at all.
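Point 1 is a pricing-model argument, and a capital-recovery calculation makes it concrete. The sketch below is illustrative, not any provider's actual model. `capital_recovery_factor` and `hourly_capital_charge` are hypothetical helper names; the $30,000 capex per GPU, the 80% utilization, and the 8% neocloud return are placeholder assumptions. Only the 5-year and 4-year schedules and the 15–20% hurdle rate come from the text.

```python
HOURS_PER_YEAR = 8760


def capital_recovery_factor(rate: float, years: int) -> float:
    """Share of capex to recover each year to earn `rate` over `years`."""
    if rate == 0:
        return 1.0 / years
    growth = (1.0 + rate) ** years
    return rate * growth / (growth - 1.0)


def hourly_capital_charge(capex: float, rate: float, years: int,
                          utilization: float) -> float:
    """Capital cost per rented GPU-hour, before power, staff, and margin."""
    annual = capex * capital_recovery_factor(rate, years)
    return annual / (HOURS_PER_YEAR * utilization)


CAPEX_PER_GPU = 30_000.0  # assumption: placeholder, not a quoted price
UTILIZATION = 0.80        # assumption: share of hours actually rented

# 18% sits inside the 15-20% hurdle range from the text; 8% is an assumption.
hyperscaler = hourly_capital_charge(CAPEX_PER_GPU, rate=0.18, years=5,
                                    utilization=UTILIZATION)
neocloud = hourly_capital_charge(CAPEX_PER_GPU, rate=0.08, years=4,
                                 utilization=UTILIZATION)

print(f"hyperscaler capital charge: ${hyperscaler:.2f}/GPU-hr")
print(f"neocloud capital charge:    ${neocloud:.2f}/GPU-hr")
```

In this model a shorter schedule raises the annual charge by itself, so any price advantage has to come from the lower return target. Varying `rate` and `years` separately shows how much each one contributes.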
2. Networking architecture.
An AWS p5 cluster uses EFA (Elastic Fabric Adapter) — fast for cloud, slow for AI. A CoreWeave cluster uses non-blocking InfiniBand at 3.2 Tbps per node. On a 1024-GPU all-reduce, the IB cluster spends 8–12% of step time in communication. The EFA cluster spends 30–45%. That difference compounds into massive effective-cost gaps over a multi-week training run.
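To see how communication time turns into cost, divide the hourly price by the share of each step spent on useful compute. The sketch below is a simplification: it assumes communication is not overlapped with compute and that per-step compute time is the same on both fabrics. `effective_rate` is a hypothetical helper, and the 10% and 37.5% inputs are just the midpoints of the ranges quoted above.

```python
def effective_rate(price_per_hour: float, comm_fraction: float) -> float:
    """Price per GPU-hour of useful compute when `comm_fraction` of
    each training step is spent waiting on communication."""
    if not 0.0 <= comm_fraction < 1.0:
        raise ValueError("comm_fraction must be in [0, 1)")
    return price_per_hour / (1.0 - comm_fraction)


# On-demand H100 prices from the table; fractions are range midpoints.
neocloud_ib = effective_rate(1.92, comm_fraction=0.10)
hyperscaler_efa = effective_rate(5.40, comm_fraction=0.375)

headline_ratio = 5.40 / 1.92
effective_ratio = hyperscaler_efa / neocloud_ib

print(f"neocloud (IB):     ${neocloud_ib:.2f} per useful GPU-hr")
print(f"hyperscaler (EFA): ${hyperscaler_efa:.2f} per useful GPU-hr")
print(f"ratio: {headline_ratio:.2f}x headline, {effective_ratio:.2f}x effective")
```

Real training stacks overlap some communication with compute, so treat the result as an upper bound on the penalty rather than a measurement.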
3. No bundled overhead.
When a hyperscaler charges $5.40/hr for an H100, you also pay for the EBS volume, the NAT gateway, the load balancer, and the egress to S3. Net it out and you're often 2.5–3× the headline. Neoclouds charge $1.92/hr with local NVMe, free egress, and free L7 ingress.
4. Allocation politics.
This one matters more than people admit. NVIDIA's allocation prioritises customers who commit early, narrowly, and to a single architecture. Neoclouds do exactly that — hyperscalers are constantly diversifying across NVIDIA, AMD MI300, their own silicon (Trainium, TPU), which dilutes their NVIDIA allocation. The result: a neocloud will get an H200 capacity drop six to twelve months before a hyperscaler.
Where hyperscalers still win
- Compliance: FedRAMP High, SOC 2 Type II at scale, IRAP, public-sector regions.
- Inference at the edge: Cloudflare Workers AI, Vercel, and AWS Lambda@Edge fit much better for low-latency serving.
- Mixed workloads: if you need GPUs and a Postgres and a CDN and Kinesis, the hyperscaler bundle still wins.
- Existing enterprise contracts: committed-spend agreements often lock in hundreds of millions of dollars that must be spent inside the existing cloud.
What this means for buyers
If you're training a model: default to a neocloud. If you're serving inference to a global B2C product: use a hyperscaler for the edge layer and a neocloud for the heavy backend. If you're an enterprise CTO with $300M of committed Azure spend: negotiate Azure GPU rates with a neocloud quote in your other hand. The rates you're offered will come down.
The convergence trade
Both sides are converging. Hyperscalers are launching neocloud-style products (Azure's ND H100 v5 series is the most credible). Neoclouds are launching hyperscaler-style services (CoreWeave's Kubernetes-as-a-Service, Crusoe's managed inference). The interesting question for 2027 is whether the categories merge or whether "neocloud" becomes a permanent peer to "hyperscaler," the way "hedge fund" sits alongside "investment bank." Our bet is the latter.