Telecom operators are under pressure to integrate AI into live networks, but doing so without compromising reliability or performance remains a major hurdle. AMD is addressing this challenge with a comprehensive approach that merges open software stacks, GPUs, CPUs, and networking solutions designed for distributed telco-grade AI. This strategy aims to support real-time optimization, automation, and resilience across core and edge environments, ensuring AI can be deployed at scale without sacrificing stability.
The centerpiece of AMD's initiative is the EPYC 8005 server CPU, engineered specifically for virtual RAN (vRAN) deployments. These CPUs deliver high compute density while maintaining efficiency in rugged, space-constrained edge sites. Their ability to operate within wide thermal ranges and meet NEBS compliance standards makes them ideal for outdoor and non-traditional locations—a critical requirement for 5G and future RAN implementations.
Open Collaboration: The Foundation of Telco-Specific AI
The shift from traditional RAN to open, virtualized architectures demands more than just hardware; it requires an open ecosystem where operators, vendors, researchers, and developers can collaborate on telco-specific AI models. These models must address unique challenges such as network optimization, anomaly detection, and operational automation.
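To make one of those challenges concrete, here is a minimal, self-contained sketch of the kind of anomaly detection such telco-specific models target: flagging outliers in a network KPI stream with a rolling z-score. The window size, threshold, and sample values are hypothetical illustrations, not parameters from any AMD or GSMA model.

```python
# Illustrative sketch: rolling z-score anomaly detection on a network KPI
# stream (e.g., per-cell throughput samples). Window and threshold are
# hypothetical tuning parameters.
from collections import deque
from statistics import mean, stdev

def detect_anomalies(samples, window=8, threshold=3.0):
    """Return indices whose value deviates more than `threshold` standard
    deviations from the trailing window's mean."""
    history = deque(maxlen=window)
    anomalies = []
    for i, x in enumerate(samples):
        if len(history) == window:
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(x - mu) > threshold * sigma:
                anomalies.append(i)
        history.append(x)
    return anomalies

kpi = [100, 102, 99, 101, 100, 98, 103, 100, 250, 101]  # throughput, Mbps
print(detect_anomalies(kpi))  # → [8]
```

Production telco models are far more sophisticated, but the shape is the same: continuous telemetry in, deviations from learned baselines out, feeding the optimization and automation loops described above.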
- AMD is a founding collaborator in Open Telco AI, a global initiative led by the GSMA to accelerate the development of telco-grade AI models. The project introduces an open-telco.ai portal, serving as a shared hub for datasets, tools, benchmarks, and other resources.
- AT&T contributes open-telco models, while AMD provides compute infrastructure using its Instinct GPUs paired with the ROCm software stack. This combination enables rapid iteration from experimentation to validation.
The AMD Enterprise AI Suite acts as the production layer for these models, integrating them into Kubernetes-native workflows that align with telco DevOps and MLOps practices. This suite supports model serving, governance, security controls, and lifecycle management—critical components for deploying AI services at scale in live networks.
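A governed model lifecycle of this kind typically includes a promotion gate: a new model version only reaches serving if it clears accuracy and latency budgets. The sketch below illustrates that pattern in a generic way; the field names and thresholds are assumptions for illustration, not the Enterprise AI Suite's actual API.

```python
# Hypothetical sketch of a model promotion gate in a telco MLOps pipeline.
# A candidate model version is promoted to serving only if it meets
# operator-defined accuracy and latency budgets. All names and values
# here are illustrative, not a real product interface.
from dataclasses import dataclass

@dataclass
class EvalReport:
    model_version: str
    accuracy: float       # fraction correct on a held-out validation set
    p99_latency_ms: float # tail latency measured during evaluation

def promote(report: EvalReport,
            min_accuracy: float = 0.95,
            max_p99_ms: float = 50.0) -> bool:
    """Gate run by the CI/CD pipeline before rollout to live serving."""
    return (report.accuracy >= min_accuracy
            and report.p99_latency_ms <= max_p99_ms)

candidate = EvalReport("anomaly-detector:v2", accuracy=0.97, p99_latency_ms=38.0)
print(promote(candidate))  # → True
```

Encoding the gate as code rather than a manual checklist is what makes governance repeatable: every version faces the same criteria, and the decision is auditable.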
Why Reliability Matters in Telco-Grade AI
For telecom operators, the shift to open RAN and AI-driven automation is not just about performance but also about reliability. Traditional generative AI models often lack the robustness required for mission-critical network operations. AMD's approach addresses this by combining open collaboration with hardware optimized for edge deployments.
Consider, for example, a vRAN site where EPYC 8005 CPUs handle Layer 1 processing with deterministic performance, while the Enterprise AI Suite ensures that AI-driven automation is governed, observable, and repeatable. This dual-layer strategy minimizes latency and maximizes uptime—factors that directly impact network quality.
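"Governed" automation in this sense usually means that an AI recommendation is applied only if it falls within operator-defined safety bounds, with every decision logged for observability. The following sketch shows that guard pattern for a hypothetical antenna-tilt adjustment; the parameter names and bounds are illustrative assumptions, not a real RAN interface.

```python
# Illustrative sketch of governed AI automation at a vRAN site: an
# AI-recommended antenna tilt is applied only if it stays within an
# operator-defined safety range; otherwise it is rejected and logged
# for review. Names and bounds are hypothetical.
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("ran-automation")

SAFE_TILT_RANGE = (0.0, 10.0)  # degrees, set by operator policy

def apply_tilt_change(current: float, recommended: float) -> float:
    """Return the new tilt: the recommendation if in bounds, else unchanged."""
    lo, hi = SAFE_TILT_RANGE
    if lo <= recommended <= hi:
        log.info("applying tilt change %.1f -> %.1f", current, recommended)
        return recommended
    log.warning("rejected out-of-bounds tilt %.1f; keeping %.1f",
                recommended, current)
    return current

print(apply_tilt_change(4.0, 6.5))   # → 6.5
print(apply_tilt_change(4.0, 15.0))  # → 4.0
```

The log lines are the observability half of the pattern: operators can audit exactly which recommendations were applied, which were rejected, and why.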
The Path Forward: Scaling Telco-Grade AI
AMD's next steps will focus on demonstrating the scalability of this end-to-end solution across multiple operators and use cases. Key areas to monitor include the adoption of open-telco.ai as a standard hub for telco AI development, the integration of EPYC 8005 in more vRAN deployments, and the evolution of the Enterprise AI Suite to support emerging edge workloads.
Ultimately, AMD's move underscores a fundamental shift: telco-grade AI is no longer an experimental concept but a necessity for operators transitioning to open architectures. By providing both the hardware foundation and the software framework, AMD positions itself as a critical enabler of this transformation, ensuring that AI can be deployed reliably and efficiently in live telecom networks.
