Data centers are evolving beyond traditional compute models. Qualcomm Technologies is now positioning itself as a key player in this transformation with a new portfolio of solutions aimed at agentic AI workloads.

The company’s Dragonfly platform, which includes the C1000 CPU and the High Bandwidth Compute (HBC) architecture, represents a significant shift toward disaggregated, rack-scale AI inference. The approach is designed to relieve memory bandwidth bottlenecks while improving token economics, meaning the cost of serving each generated token, which is a critical factor for hyperscalers managing large-scale AI deployments.

Performance and Efficiency at Scale

The Qualcomm Dragonfly C1000 CPU is built specifically for data center workloads, featuring custom-designed Oryon cores optimized for performance and power efficiency. It supports clock frequencies above 5 GHz and uses a multi-chiplet architecture that scales from general-purpose to AI-specific workloads. Benchmarks suggest it delivers more than twice the performance per watt of existing server CPUs.

Memory bandwidth is the platform’s central design target. The HBC architecture, which bonds compute with accelerated memory, is said to enable industry-leading memory capacity and effective bandwidth. For example, the AI250 with HBC Gen 1 achieves 133 TB/s per card, an 18x improvement over its predecessor. Future generations are expected to push these figures further, with HBC Gen 2 targeting a 54x increase in bandwidth.
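To put those multipliers in perspective, the quoted figures can be turned into rough absolute numbers. The short Python sketch below is back-of-the-envelope arithmetic, not a vendor specification: it derives the predecessor bandwidth implied by the 18x claim, then projects HBC Gen 2 under the assumption that the 54x target is measured against that same predecessor. The article does not state the Gen 2 baseline, so the projection is illustrative only, and the function names are made up for this example.

```python
# Figures quoted in the article.
GEN1_BANDWIDTH_TBPS = 133.0  # AI250 with HBC Gen 1, per card
GEN1_MULTIPLIER = 18.0       # stated improvement over the predecessor
GEN2_MULTIPLIER = 54.0       # stated HBC Gen 2 target


def implied_baseline_tbps(bandwidth_tbps: float, multiplier: float) -> float:
    """Bandwidth of the baseline part implied by an 'Nx improvement' claim."""
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    return bandwidth_tbps / multiplier


def projected_tbps(baseline_tbps: float, multiplier: float) -> float:
    """Bandwidth reached by applying a multiplier to a baseline."""
    return baseline_tbps * multiplier


baseline = implied_baseline_tbps(GEN1_BANDWIDTH_TBPS, GEN1_MULTIPLIER)

# ASSUMPTION: the 54x target uses the same baseline as the 18x figure.
gen2 = projected_tbps(baseline, GEN2_MULTIPLIER)

print(f"Implied predecessor bandwidth: {baseline:.1f} TB/s")
print(f"HBC Gen 2 under the same-baseline assumption: {gen2:.0f} TB/s")
```

Under that assumption, the predecessor works out to roughly 7.4 TB/s per card and Gen 2 to roughly 399 TB/s, or three times Gen 1. If the 54x figure is measured against a different baseline, the Gen 2 number changes accordingly.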

  • C1000 CPU: Custom Oryon cores, >5 GHz frequencies, multi-chiplet design for scalability
  • HBC Architecture: 3D-stacked silicon for near-memory compute, 133 TB/s bandwidth (Gen 1)
  • AI300: Rack-level inference platform with HBC Gen 2, expected to deliver 4x-8x better performance-per-watt than GPUs

A Platform Built for Hyperscalers

The Dragonfly portfolio is designed to integrate seamlessly into existing data center infrastructure. It supports both air and liquid cooling, OCP-compliant racks, and advanced connectivity options like PCIe Gen 7 and CXL. This modular approach allows for flexible deployment, whether for agentic AI orchestration or generative model inference.

Qualcomm frames the platform around total cost of ownership (TCO), claiming reductions in both CapEx and OpEx while maintaining leading performance. The platform also includes reliability features such as ECC, fault isolation, and error recovery to keep operation resilient at scale.

Ecosystem and Commercialization

Qualcomm has secured strategic partnerships that underscore the platform’s potential. A multi-year, multi-generation agreement with Meta will see the Dragonfly C1000 power Meta’s next-gen server fleet, starting in 2028. Additionally, over 35 global leaders from technology and AI ecosystems have expressed support for Qualcomm’s vision.

Commercial availability is expected to unfold in phases: HBC Gen 1 with the AI250 is slated for mid-2027 sampling, while the full AI300 rack-level platform will follow in 2028. This roadmap aligns with Qualcomm’s commitment to an annual cadence of innovation, focusing on advancing AI inference performance and energy efficiency.

What It Means for Developers

For developers, the Dragonfly platform changes how data center AI workloads may be provisioned. The disaggregated architecture is meant to allow more efficient memory usage and lower latency, both of which are critical for agent-driven applications. However, the full impact on existing GPU-based workflows remains to be seen.
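One way to see why bandwidth matters for latency is a common rule of thumb for autoregressive decoding: at batch size 1, generating each token requires reading roughly all of the model weights once, so memory bandwidth caps the token rate. The sketch below applies that rule to the quoted 133 TB/s figure. The 70-billion-parameter model and 8-bit weights are hypothetical inputs chosen for illustration, not a workload named by Qualcomm, and the result is an upper bound that ignores compute limits, KV-cache traffic, and whether such a model would fit on a single card.

```python
def max_decode_tokens_per_second(
    bandwidth_bytes_per_s: float,
    parameter_count: float,
    bytes_per_parameter: float,
) -> float:
    """Upper bound on batch-1 decode rate when weight reads are the bottleneck.

    Each generated token is assumed to stream every weight from memory once,
    so the bound is bandwidth divided by the total weight size in bytes.
    """
    weight_bytes = parameter_count * bytes_per_parameter
    if weight_bytes <= 0:
        raise ValueError("model size must be positive")
    return bandwidth_bytes_per_s / weight_bytes


# Quoted in the article: 133 TB/s per card (decimal terabytes assumed).
HBC_GEN1_BYTES_PER_S = 133e12

# HYPOTHETICAL workload, chosen only for illustration.
PARAMETER_COUNT = 70e9     # 70 billion parameters
BYTES_PER_PARAMETER = 1.0  # 8-bit weights

bound = max_decode_tokens_per_second(
    HBC_GEN1_BYTES_PER_S, PARAMETER_COUNT, BYTES_PER_PARAMETER
)
print(f"Bandwidth-bound decode ceiling: {bound:.0f} tokens/s")
```

For these inputs the ceiling is about 1,900 tokens per second per card. Real throughput would be lower, but the estimate shows why a change in effective bandwidth translates fairly directly into per-token latency and cost.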

The platform’s emphasis on near-memory computing with HBC could redefine how AI models are deployed, potentially reducing reliance on high-bandwidth memory (HBM) solutions. If successful, this shift could influence upgrade decisions for developers prioritizing performance-per-watt and scalability over traditional GPU-centric approaches.