NVIDIA and Nebius have announced a long-term collaboration that will scale one of the largest AI clouds globally. The partnership focuses on building out full-stack AI infrastructure, from silicon to software, with a target of deploying over five gigawatts of NVIDIA systems by the end of 2030.
The investment, which includes $2 billion from NVIDIA, reflects confidence in Nebius's engineering depth and its ability to meet growing demand for high-performance compute. The collaboration will prioritize AI factory architecture, inference stacks, and deployment of multiple generations of NVIDIA hardware, including the Rubin platform, Vera CPUs, and BlueField storage systems.
Key elements of the partnership include:
- AI factory design and support, with access to early samples, system software, and technical reviews.
- Development of an optimized inference and agentic AI stack for developers and enterprises.
- Fleet management using NVIDIA's GPU health monitoring and software recommendations.
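Fleet-level GPU health monitoring of the kind described above can be illustrated with a small sketch. The details of the fleet-management tooling in this partnership are not public; the sketch below simply parses telemetry in the shape emitted by the standard `nvidia-smi` CLI, and the threshold values are hypothetical, chosen only for illustration:

```python
import csv
import io

# Hypothetical health thresholds -- illustrative only, not values
# published by NVIDIA or Nebius.
MAX_TEMP_C = 85
MAX_MEM_UTIL_PCT = 95

def parse_gpu_health(csv_text: str) -> list[dict]:
    """Parse CSV telemetry in the shape produced by:
    nvidia-smi --query-gpu=index,temperature.gpu,utilization.memory \
               --format=csv,noheader,nounits
    and flag GPUs exceeding the illustrative thresholds above."""
    report = []
    for row in csv.reader(io.StringIO(csv_text)):
        index, temp, mem_util = (int(field.strip()) for field in row)
        report.append({
            "gpu": index,
            "temperature_c": temp,
            "memory_util_pct": mem_util,
            "healthy": temp <= MAX_TEMP_C and mem_util <= MAX_MEM_UTIL_PCT,
        })
    return report

# Canned sample of what nvidia-smi might emit on a two-GPU node:
sample = "0, 61, 72\n1, 91, 40\n"
print(parse_gpu_health(sample))
```

A real fleet manager would poll this data across nodes and feed it into scheduling and remediation decisions; the point here is only the shape of the telemetry loop, not any specific product behavior.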
The partnership aims to address the surging demand for AI infrastructure driven by advancements in agentic AI. Both companies position this as a step toward building a cloud platform tailored specifically for AI workloads, rather than adapting existing general-purpose architectures.
Nebius has been developing its platform with an AI-first approach since its inception, and the collaboration with NVIDIA will extend this throughout the technology stack. This includes scaling from gigawatt-level data centers to advanced software layers, ensuring compatibility with next-generation AI development needs.
The expansion is expected to support a wide range of AI builders, from startups to enterprises, by providing a unified platform spanning training, deployment, and production serving of AI models and services. The focus on agentic AI and high-performance inference will be central to the partnership's roadmap, with continuous hardware and software updates intended to maintain performance leadership.
Industry observers note that this collaboration could redefine cloud infrastructure for AI workloads, emphasizing integration from silicon to software rather than treating them as separate components. The target of five gigawatts by 2030 underscores the aggressive scale required to meet the projected growth in AI demand, particularly in areas like generative AI and autonomous systems.
The partnership also highlights NVIDIA's role in shaping the next generation of cloud platforms, moving beyond traditional GPU-accelerated computing to a more holistic approach that includes CPU, storage, and system-level optimizations. This shift reflects broader trends in the industry, where AI workloads require deeper integration across hardware tiers.
For end users, the practical impact will be faster access to advanced AI capabilities, lower latency for inference tasks, and greater flexibility in deploying custom models at scale. The collaboration also aims to reduce fragmentation in the AI ecosystem by providing a consistent software stack across multiple hardware generations.
The most significant change introduced by this partnership is the commitment to a full-stack AI cloud built from the ground up, rather than retrofitting existing infrastructure. This approach could set a new standard for how AI clouds are designed and deployed, with potential ripple effects across the industry.