Apple is moving fast to fix a critical flaw in its AI strategy: wasted server capacity.
Roughly 90 percent of Apple's Private Cloud Compute hardware has been sitting idle, driving cost overruns even as the company leans on Google to power Siri and other AI workloads. The upcoming Baltra ASIC, developed with Broadcom and built on TSMC's 3nm N3E process, could change that by centralizing Apple's AI infrastructure.
Why it matters: A unified platform would slash duplicate spending and speed up feature rollouts. Right now, different teams use incompatible stacks, creating inefficiencies that drag down performance.
The Baltra ASIC is expected to enter mass production in 2027–2028, potentially matching or exceeding Google’s TPU efficiency. If it succeeds, Apple could phase out external dependencies by the late 2020s.
- Baltra may use a modular chiplet design with Broadcom handling inter-chip communication.
- Foxconn and Lenovo are involved in server assembly, but the final architecture remains undisclosed.
The revamped Siri will also introduce agentic actions (context-aware tasks that span multiple apps) in iOS 27, powered by a 1.2-trillion-parameter Gemini variant codenamed Foundation Models v10. How well the new Siri performs hinges on Apple scaling its own cloud infrastructure.
What’s next
The Baltra ASIC is Apple’s best shot at self-sufficiency before current AI servers become obsolete. If it delivers, the company could break free from Google’s TPU-based Gemini model while maintaining strict privacy controls.
