Google Cloud and NVIDIA have been working together for years, teaming up to push the boundaries of AI. Their collaboration goes far beyond infrastructure: they're working closely at the engineering level to fine-tune how AI models are built and run.
Helping Developers Build Faster and Smarter
The two companies are contributing to major open-source AI tools like JAX, OpenXLA, MaxText, and llm-d, all of which are helping speed up AI model development and deployment. These tools are especially important for running Google’s powerful Gemini models and the lightweight Gemma open models.
NVIDIA’s performance-optimized AI software, including NeMo, TensorRT-LLM, Dynamo, and NIM microservices, is also deeply integrated across Google Cloud services like Vertex AI, Google Kubernetes Engine (GKE), and Cloud Run. This makes it easier and faster for companies to build and launch AI applications.
NVIDIA’s Blackwell GPUs Now Powering Google Cloud
Google Cloud is the first cloud provider to launch virtual machines (VMs) powered by NVIDIA's new Blackwell chips, including the HGX B200 and GB200 NVL72. These are available as A4 and A4X VMs, and they deliver massive computing power: over one exaflop per rack in the A4X configuration, enough for even the largest AI models.
Thanks to Google’s Jupiter networking and advanced cooling systems, these VMs can scale up easily and run efficiently, even under heavy workloads. You can access them through managed services like Vertex AI and GKE, making it easier for teams to launch and scale agent-based AI applications.
Gemini AI Can Now Run On-Premises with Google Distributed Cloud
Until now, some organizations, such as those in government, healthcare, or finance, haven't been able to use cloud-based Gemini models because of strict rules around data privacy and security.
That’s changing.
With NVIDIA Blackwell chips now supported in Google Distributed Cloud, organizations can run Gemini models on-premises in their own secure data centers. That means they get all the power of Gemini AI while keeping their data private and meeting regulatory requirements.
Faster, Smarter AI Performance for Everyone
The Gemini family of models is Google's most advanced yet, capable of reasoning, coding, and understanding multiple types of data, including text and images.
To make sure these models run as efficiently as possible, NVIDIA and Google have optimized them to work with NVIDIA GPUs. Whether you’re using Vertex AI or Google Distributed Cloud, Gemini models now respond faster and handle more complex tasks than ever before.
Meanwhile, the Gemma models, which are lighter and open-source, have also been fine-tuned with TensorRT-LLM and will be available as NIM microservices, making it even easier for developers to use them in different environments, from large data centers to personal workstations.
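For a sense of what "available as NIM microservices" means in practice: NIM containers expose an OpenAI-compatible HTTP API, so a deployed Gemma NIM can be called with a plain JSON POST. The endpoint URL and model identifier below are illustrative assumptions, not values from this announcement; a real deployment supplies its own.

```python
# Hedged sketch of calling a locally deployed Gemma NIM microservice.
# NIM exposes an OpenAI-compatible chat-completions endpoint; the URL
# and model id here are assumed for illustration only.
import json
import urllib.request

NIM_URL = "http://localhost:8000/v1/chat/completions"  # assumed local deployment

payload = {
    "model": "google/gemma-7b-it",  # hypothetical model id for this deployment
    "messages": [
        {"role": "user", "content": "Summarize JAX in one sentence."}
    ],
    "max_tokens": 64,
}

def build_request(url: str, body: dict) -> urllib.request.Request:
    """Build the POST request; actually sending it requires a running NIM container."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request(NIM_URL, payload)
print(req.get_full_url())
```

Because the API shape matches OpenAI's, existing client code can often be pointed at a NIM endpoint with little more than a base-URL change.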
Supporting the AI Developer Community
It's not just about hardware and software: NVIDIA and Google Cloud are also investing in the developer community. They're improving tools like JAX to scale more efficiently across thousands of GPUs and launching a joint developer community to help more people learn, share, and build with AI.
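The JAX work mentioned above builds on a core property of the library: a function written once compiles through XLA and runs unchanged on CPU, GPU, or TPU, which is what makes scaling it across accelerators tractable. A minimal sketch (the function and shapes are illustrative, not from the announcement):

```python
# Minimal JAX sketch: jit-compile a function through XLA.
# The same code runs on CPU, GPU, or TPU; at scale, sharding the
# inputs across devices distributes the work without rewriting it.
import jax
import jax.numpy as jnp

@jax.jit  # traced once, then compiled by XLA for the available backend
def scaled_dot(x, w):
    return jnp.dot(x, w) * 2.0

x = jnp.ones((4, 8))
w = jnp.ones((8, 2))
print(scaled_dot(x, w).shape)  # result is a (4, 2) array
```

The same compiled function is what frameworks like MaxText build on when training across thousands of GPUs or TPUs.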
Together, they’re making powerful AI tools more accessible and helping more developers around the world bring their ideas to life.