A new Morgan Stanley research report reveals that Google is preparing for a massive expansion in Tensor Processing Unit (TPU) production. Supply chain checks suggest that prior uncertainties surrounding TPU availability have largely been resolved—signaling Google’s readiness to push TPU chips aggressively into the market.
This marks Google’s most serious attempt to challenge NVIDIA’s near-monopoly on AI compute.
🚀 Production Surge and Commercial Strategy #
Morgan Stanley has sharply raised its TPU production forecasts:
| Year | Previous Forecast | New Forecast | Increase |
|---|---|---|---|
| 2027 | 3M units | 5M units | +67% |
| 2028 | 3.2M units | 7M units | +119% |
| Total (2027–2028) | 6.2M units | 12M units | +94% |
For comparison, Google has produced 7.9 million TPUs over the past four years combined. The new forecast marks a major strategic shift: nearly doubling the prior projection and calling for more output in two years than in the previous four.
Financial Impact #
- Every 500,000 TPUs sold could add $13B in revenue.
- Estimated impact on Google’s 2027 EPS: +$0.40 (see the back-of-envelope check below).
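A quick sanity check of those figures (a sketch in Python; the $13B-per-500,000-unit ratio is the report’s, while extrapolating it linearly across the full forecast is our assumption, not a claim from Morgan Stanley):

```python
# Back-of-envelope arithmetic from the Morgan Stanley figures quoted above.
REVENUE_PER_BLOCK = 13e9   # $13B per block of 500,000 TPUs sold (report's figure)
UNITS_PER_BLOCK = 500_000

implied_asp = REVENUE_PER_BLOCK / UNITS_PER_BLOCK
print(f"Implied revenue per TPU: ${implied_asp:,.0f}")               # -> $26,000

# Assumption: the ratio scales linearly across the full 2027-2028 forecast.
forecast_units = 12_000_000
implied_revenue = forecast_units / UNITS_PER_BLOCK * REVENUE_PER_BLOCK
print(f"Implied 2027-2028 revenue: ${implied_revenue / 1e9:,.0f}B")  # -> $312B
```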
Strategic Direction #
Google intends to:
- Sell TPUs to third-party data centers, not just its own operations.
- Integrate TPU sales as a major driver of Google Cloud Platform (GCP) growth.
- Build a scalable alternative to NVIDIA for customers prioritizing cost-efficient AI inference.
Driven by skyrocketing demand for AI compute, Google is preparing its first true commercial push of TPU hardware.
🧠 Training vs Inference: The Real Battle of the AI Era #
To understand why TPU expansion matters, we must differentiate between the two pillars of AI compute: Training and Inference.
1. Training: NVIDIA’s Historical Stronghold #
- Involves processing huge datasets to teach a model patterns.
- NVIDIA dominates this space with the H100 and the CUDA ecosystem.
- Example: Training GPT-4 reportedly cost $150M—a one-time expenditure.
2. Inference: The Exploding Cost Center #
- Inference is every query, every image generation, every recommendation.
- It is continuous, unbounded, and scales with model adoption.
Key projections:
- 75% of all AI compute will be used for inference by 2030.
- Inference will become a $255B market.
- OpenAI’s inference spending in 2024 alone reached ~$2.3B—15× the training cost of GPT-4.
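The 15× figure follows directly from the two numbers above, and a toy cost model makes the structural point: training is paid once, while inference accrues on every query for the life of the model. This is a sketch only; the per-query cost and query volume below are illustrative assumptions, not figures from the report.

```python
# Reported figures from the article.
TRAINING_COST = 150e6    # one-time GPT-4 training cost (reported)
INFERENCE_2024 = 2.3e9   # OpenAI's 2024 inference spend (reported)
print(f"Inference / training ratio: {INFERENCE_2024 / TRAINING_COST:.1f}x")  # ~15.3x

# Illustrative model: hypothetical per-query cost and daily traffic.
cost_per_query = 0.01    # assumption: $0.01 per query
queries_per_day = 1e9    # assumption: 1B queries/day
annual_inference = cost_per_query * queries_per_day * 365
print(f"Hypothetical annual inference bill: ${annual_inference / 1e9:.2f}B")  # ~$3.65B
# Training is a fixed cost; this bill recurs every year and grows with adoption.
```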
NVIDIA’s weakness?
- GPUs are general-purpose by design.
- Much of that hardware generality goes unused in sustained, high-volume inference.
- The result is wasted power, wasted silicon, and higher cost per query.
🥇 TPU: Built for the Inference Era #
Google’s TPU is an ASIC (Application-Specific Integrated Circuit) designed purely for tensor operations—providing massive gains in cost efficiency and scalability.
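One practical consequence is that model code is largely hardware-agnostic: a framework like JAX hands the tensor program to the XLA compiler, which lowers it onto whatever accelerator is attached. A minimal sketch (it runs on CPU when no TPU is present; on a TPU VM, `jax.devices()` reports TPU cores and the matmul is mapped onto the TPU’s matrix units):

```python
import jax
import jax.numpy as jnp

# XLA compiles this function for the attached backend. On a TPU the
# matmul lowers onto the systolic matrix unit (MXU); on CPU/GPU it
# targets those backends instead, with no change to the code.
@jax.jit
def attention_scores(q, k):
    # A tensor op typical of inference workloads: matmul plus softmax.
    return jax.nn.softmax(q @ k.T / jnp.sqrt(q.shape[-1]))

print("Devices:", jax.devices())     # e.g. [TpuDevice(id=0), ...] on a TPU VM

key = jax.random.PRNGKey(0)
q = jax.random.normal(key, (128, 64))
k = jax.random.normal(key, (128, 64))
print(attention_scores(q, k).shape)  # (128, 128)
```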
Architectural Comparison #
| Feature | TPU (ASIC) | NVIDIA GPU | Inference Winner |
|---|---|---|---|
| Design | Tensor-specific | General-purpose | TPU |
| Data Flow | Systolic array | Cache hierarchy | TPU |
| Instruction Overhead | Minimal | Significant | TPU |
| Power Efficiency | 60–65% lower power in search workloads | Higher idle/overhead power | TPU |
| Price-Performance | Up to 4× that of H100 | High cost | TPU |
| Scaling | Near-linear via TPU pods | PCIe & memory constraints | TPU |
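The "Systolic array" row is the key architectural difference: operands pulse through a fixed grid of multiply-accumulate cells, so the inner loop involves no cache lookups and no per-operation instruction decode. Below is a toy NumPy simulation of that data flow, for intuition only; real TPU matrix units are hardware grids (historically 128×128 or 256×256), not software loops.

```python
import numpy as np

def systolic_matmul(A, B):
    """Toy output-stationary systolic array computing C = A @ B.

    Each grid cell (i, j) owns one accumulator. A values enter at the
    left edge and move right each cycle; B values enter at the top edge
    and move down; inputs are skewed in time so that matching operands
    meet at the right cell on the right cycle.
    """
    m, k = A.shape
    k2, n = B.shape
    assert k == k2
    C = np.zeros((m, n))
    a_reg = np.zeros((m, n))   # A operand currently held at each cell
    b_reg = np.zeros((m, n))   # B operand currently held at each cell
    for t in range(m + n + k - 2):         # cycles until the wavefront drains
        a_reg = np.roll(a_reg, 1, axis=1)  # A flows one cell to the right
        b_reg = np.roll(b_reg, 1, axis=0)  # B flows one cell down
        for i in range(m):                 # inject skewed A at the left edge
            s = t - i
            a_reg[i, 0] = A[i, s] if 0 <= s < k else 0.0
        for j in range(n):                 # inject skewed B at the top edge
            s = t - j
            b_reg[0, j] = B[s, j] if 0 <= s < k else 0.0
        C += a_reg * b_reg                 # every cell multiply-accumulates
    return C

A, B = np.random.rand(4, 6), np.random.rand(6, 5)
assert np.allclose(systolic_matmul(A, B), A @ B)
```

The point of the input skew is that cell (i, j) sees `A[i, s]` and `B[s, j]` on the same cycle for every `s`, so data moves only between neighboring cells; that locality is what eliminates the cache hierarchy a GPU relies on.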
Cost Advantage #
- TPU v6e: $1.375/hour on-demand
- As low as $0.55/hour with commitments
- No software licensing fees
This contrasts with NVIDIA’s increasingly burdensome licensing model and massive GPU costs.
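At fleet scale those hourly rates compound quickly. A sketch using the quoted TPU prices; the GPU rate here is a hypothetical placeholder for comparison, not a quoted price:

```python
# Monthly cost of a 1,000-chip inference fleet at the rates quoted above.
HOURS_PER_MONTH = 730
FLEET_SIZE = 1_000

rates = {
    "TPU v6e on-demand": 1.375,   # $/chip-hour, quoted above
    "TPU v6e committed": 0.55,    # $/chip-hour with commitments, quoted above
    "GPU (hypothetical)": 4.00,   # illustrative placeholder, not a quoted price
}
for label, rate in rates.items():
    monthly = rate * HOURS_PER_MONTH * FLEET_SIZE
    print(f"{label:>20}: ${monthly / 1e6:,.2f}M/month")
# -> committed TPUs run ~$0.40M/month vs ~$2.92M/month at the placeholder
#    GPU rate, before any software licensing fees.
```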
🌐 Real-World Adoption: Industry Moves Toward TPUs #
Several major AI operators have already made the shift:
Midjourney #
- Switched to TPUs in 2024
- Inference cost reduced 65%
- Monthly AI compute spend fell from $2M to $700k
Anthropic #
- Multi-billion-dollar agreement with Google
- Up to 1 million TPUs deployed by 2026
- Cited “superior price-performance” as key reason
Meta #
- Pursuing a hybrid TPU+GPU deployment strategy
- Evaluating billions of dollars’ worth of TPUs starting in 2026
- Uses NVIDIA for training flexibility, TPUs for inference scale
Collectively, these moves confirm a market transition: the world is optimizing for inference, and TPUs are built for that world.
📉 The Threat to NVIDIA’s Valuation #
NVIDIA’s premium valuation is built on two pillars:
- Massive GPU sales
- Exceptionally high 70–80% gross margins
TPUs threaten both.
- TPU’s 4× price-performance advantage will pressure NVIDIA’s margins.
- Large customers are already shifting workloads away from GPUs.
- Hedge funds linked to Peter Thiel and SoftBank have sold $6B+ of NVIDIA stock, anticipating competitive and structural headwinds.
In the emerging AI infrastructure landscape, the winning model is increasingly clear:
- NVIDIA GPUs → Best for model training and research
- Google TPUs → Best for scaled inference and production workloads
The future is hybrid, but Google’s aggressive TPU expansion suggests that the largest growth opportunity—inference—may soon be dominated by TPU architectures rather than GPUs.