September 9, 2025 – NVIDIA has unveiled the Rubin CPX, a GPU purpose-built for massive-context inference in AI applications such as coding assistants and generative video.
What is Rubin CPX?
The NVIDIA Rubin CPX is a next-generation GPU designed to process context windows of up to a million tokens, so that AI models can work across large software projects and long generative video workflows within a single context. Unlike traditional GPUs, Rubin CPX integrates video decoders and encoders with long-context inference processing on a single chip, offering a unified solution for heavy AI workloads.
Unmatched Performance
The Rubin CPX is part of the new NVIDIA Vera Rubin NVL144 CPX platform. According to NVIDIA, the platform delivers the following in a single rack:
8 exaflops of AI compute power
100TB of high-speed memory
1.7 petabytes/sec of memory bandwidth
Each Rubin CPX GPU delivers:
Up to 30 petaflops of compute with NVFP4 precision
3× faster attention compared with NVIDIA GB300 NVL72 systems
This combination lets developers build AI systems that reason across million-token contexts of code or video data without a drop in performance.
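To put the headline figures in proportion, the short Python sketch below derives two ratios from them: how long one full pass over memory would take at the quoted bandwidth, and how many operations are available per byte moved. It assumes the figures are aggregate platform-level peaks and that terabytes and petabytes are decimal units. The function names are illustrative, and the results are simple divisions, not a performance model.

```python
# Headline figures quoted in this article (treated as aggregate peaks; assumption).
AI_COMPUTE_FLOPS = 8e18         # 8 exaflops
MEMORY_BYTES = 100e12           # 100 TB, decimal units assumed
MEMORY_BANDWIDTH_BPS = 1.7e15   # 1.7 petabytes per second, decimal units assumed


def full_memory_sweep_seconds(memory_bytes: float, bandwidth_bps: float) -> float:
    """Time to read every byte of memory once at the quoted peak bandwidth."""
    return memory_bytes / bandwidth_bps


def flops_per_byte(compute_flops: float, bandwidth_bps: float) -> float:
    """Peak operations available per byte moved (a roofline-style ratio)."""
    return compute_flops / bandwidth_bps


sweep_ms = full_memory_sweep_seconds(MEMORY_BYTES, MEMORY_BANDWIDTH_BPS) * 1e3
balance = flops_per_byte(AI_COMPUTE_FLOPS, MEMORY_BANDWIDTH_BPS)

print(f"Full memory sweep at peak bandwidth: {sweep_ms:.1f} ms")
print(f"Peak compute per byte moved: {balance:.0f} FLOPs/byte")
```

Under these assumptions, a full sweep of memory takes roughly 59 ms, and the platform offers on the order of 4,700 operations per byte moved. Real workloads will not reach either peak.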
Industry Adoption
Several AI companies have announced plans to build on Rubin CPX:
Cursor, the AI-powered code editor, plans to use Rubin CPX for faster code generation and real-time developer insights.
Runway plans to use Rubin CPX to let creators generate cinematic video content and high-quality visual effects at scale.
Magic is developing autonomous software engineering agents that would use Rubin CPX to reason over entire codebases and documentation in real time.
Monetization Potential
NVIDIA claims that companies using Rubin CPX can generate $5 billion in token revenue for every $100 million invested, a 50× multiple that it attributes to the scale and performance of long-context processing. If the claim holds, it opens new business opportunities for enterprises building AI-driven products.
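The arithmetic behind that claim is easy to check. The sketch below computes the implied revenue multiple and, purely for illustration, scales it to a different investment size. The linear scaling is an assumption made here, not something NVIDIA states, and the function names and the $25 million example are hypothetical.

```python
def revenue_multiple(token_revenue_usd: float, investment_usd: float) -> float:
    """Revenue generated per dollar invested."""
    if investment_usd <= 0:
        raise ValueError("investment_usd must be positive")
    return token_revenue_usd / investment_usd


def projected_revenue(investment_usd: float, multiple: float) -> float:
    """Scale the multiple to another investment size (assumes linear scaling)."""
    return investment_usd * multiple


# Figures from NVIDIA's claim as quoted in this article.
CLAIMED_REVENUE_USD = 5_000_000_000
CLAIMED_INVESTMENT_USD = 100_000_000

multiple = revenue_multiple(CLAIMED_REVENUE_USD, CLAIMED_INVESTMENT_USD)
example = projected_revenue(25_000_000, multiple)  # hypothetical $25M deployment

print(f"Implied multiple: {multiple:.0f}x")
print(f"Projected token revenue on $25M: ${example:,.0f}")
```

The multiple works out to 50×. The claim is a vendor projection, so the second figure shows only what the ratio implies, not an expected outcome.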
Software Ecosystem
Rubin CPX is fully supported by NVIDIA’s AI stack:
NVIDIA Dynamo platform for efficient scaling of inference workloads
NVIDIA Nemotron multimodal models for state-of-the-art reasoning
Integration with NVIDIA AI Enterprise software suite
Access to the CUDA ecosystem of 6,000 applications and over 6 million developers
Availability
The NVIDIA Rubin CPX is expected to be available by the end of 2026, giving developers and enterprises time to prepare for next-generation AI workloads.