Overview
- Big cloud companies, including Google, are using large financing deals and direct sales of their tensor processing units (TPUs) to guarantee demand for custom ASICs and speed the adoption of inference‑focused hardware.
- Nvidia remains the dominant GPU supplier, and its large CUDA software ecosystem keeps many customers tied to its stack, but there are short‑term signs of pricing pressure: B200 rental rates have declined from a May peak.
- Chipmakers Broadcom and Marvell are growing fast by designing custom ASICs for major hyperscalers and AI firms, creating multi‑front competition that targets inference workloads, where cost per operation matters most.
- Analysts warn that power, memory, and optical networking are the main bottlenecks for AI scale and that hundreds of billions in infrastructure capex will follow, creating new opportunities for energy, memory, and networking suppliers.
- The market shift from episodic model training to continuous inference changes who wins and who pays: hyperscalers aim to cut operating costs for constant workloads while investors and data‑center firms reposition around long‑term supply, financing, and capacity needs.