Google Expands Gemini Batch API with Embeddings and OpenAI SDK Compatibility

Google has announced a major update to its Gemini Batch API, adding support for embeddings and compatibility with the OpenAI SDK.

The Gemini Batch API, originally launched to provide asynchronous processing at 50% lower cost for high-volume, latency-tolerant use cases, now supports the newly released Gemini Embedding model. Developers can process embeddings at scale with significantly higher rate limits and reduced pricing of $0.075 per 1M input tokens, making it well suited to cost-sensitive or asynchronous applications.

Embeddings at Scale

With this update, developers can run embedding jobs through the Batch API with just a few lines of code. This allows large-scale tasks such as semantic search, clustering, or recommendation systems to run more affordably and efficiently.
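
As a rough illustration, a batch embeddings job with the google-genai Python SDK might look like the sketch below. The call names and job states follow the shape of Google's announcement and batch documentation, but the input file name, request format, and polling interval are illustrative assumptions rather than a verbatim recipe.

```python
# pip install google-genai
import time

from google import genai

client = genai.Client()  # reads GEMINI_API_KEY from the environment

# Upload a JSONL file where each line is one embedding request
# (the file name here is an illustrative assumption).
uploaded = client.files.upload(file="embedding_requests.jsonl")

# Create the asynchronous embeddings batch job against the new model.
job = client.batches.create_embeddings(
    model="gemini-embedding-001",
    src={"file_name": uploaded.name},
)
print(f"Created batch job: {job.name}")

# Poll until the job reaches a terminal state (interval is arbitrary).
done = {"JOB_STATE_SUCCEEDED", "JOB_STATE_FAILED", "JOB_STATE_CANCELLED"}
while job.state.name not in done:
    time.sleep(30)
    job = client.batches.get(name=job.name)

print(f"Job finished with state: {job.state.name}")
```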

OpenAI SDK Compatibility

In addition, Google has introduced OpenAI SDK compatibility for the Batch API. Developers already familiar with the OpenAI client can now switch to Gemini’s Batch API by updating just a few lines of code, streamlining migration and integration.
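
In practice, the swap can be as small as the client constructor. The sketch below assumes the official openai Python package and an API key from Google AI Studio; the base URL is Gemini's documented OpenAI-compatibility endpoint.

```python
# pip install openai
import os

from openai import OpenAI

# Point the standard OpenAI client at Gemini's compatibility endpoint.
client = OpenAI(
    api_key=os.environ["GEMINI_API_KEY"],  # a Gemini key, not an OpenAI key
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)
```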

This compatibility layer lets developers (see the sketch after this list):

  • Upload batch input files in OpenAI format

  • Create and monitor batch jobs

  • Retrieve completed outputs seamlessly
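
Reusing the client configured above, those three steps might look like the following. The calls are the standard OpenAI SDK batch interface that the compatibility layer mirrors; the file name, endpoint choice, and polling interval are illustrative assumptions.

```python
import time

# 1. Upload a batch input file in OpenAI's JSONL format. Each line is a
#    request object such as {"custom_id": "r1", "method": "POST",
#    "url": "/v1/embeddings", "body": {...}}.
batch_file = client.files.create(
    file=open("batch_requests.jsonl", "rb"),
    purpose="batch",
)

# 2. Create the batch job, then poll until it reaches a terminal status.
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/embeddings",
    completion_window="24h",
)
while batch.status not in ("completed", "failed", "cancelled", "expired"):
    time.sleep(30)
    batch = client.batches.retrieve(batch.id)

# 3. Retrieve the completed outputs (one JSON result per line).
if batch.status == "completed":
    output = client.files.content(batch.output_file_id)
    print(output.text)
```

Because these are the same calls an existing OpenAI batch pipeline already makes, migration largely reduces to the constructor change shown earlier.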

What This Means for Developers

The move underscores Google's push to make the Gemini API more accessible, cost-efficient, and developer-friendly. By providing compatibility with the OpenAI SDK, Google lowers the barrier for teams already building with OpenAI tools but looking for cost-effective, high-volume alternatives.

Google has confirmed that more batch features are in the pipeline, further optimizing performance and pricing for Gemini users.
