Microsoft Just Dropped 3 New AI Models to Take on OpenAI and Google

Praveen Kumar
1w

News

Redmond, WA — Microsoft has unveiled three new in-house AI models under its Microsoft AI (MAI) division, signaling a major push to compete directly with rivals like OpenAI and Google. The new models—MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2—are now available through Microsoft Foundry and the MAI Playground.

The launch marks a shift in Microsoft’s strategy—from relying heavily on partner models to building its own full-stack AI ecosystem.

Three Models, Three Core AI Capabilities

Microsoft’s latest release targets three of the most commercially valuable AI domains:

🎙️ MAI-Transcribe-1 (Speech-to-Text)

Supports 25 major languages
Delivers state-of-the-art accuracy in noisy, real-world environments
Transcribes up to 2.5× faster than previous Microsoft offerings

👉 Built for use cases like meetings, call centers, subtitles, and voice agents.

🔊 MAI-Voice-1 (Text-to-Speech)

Generates natural, expressive speech with emotional nuance
Can create custom voices from just seconds of audio
Produces 60 seconds of audio in ~1 second

👉 Designed for voice assistants, audiobooks, and conversational AI.

🖼️ MAI-Image-2 (Image Generation)

Delivers 2× faster image generation with high visual quality
Optimized for realistic lighting, textures, and text rendering
Already ranks among the top image models on benchmarks

👉 Targeted at designers, marketers, and enterprise creative teams.

Built for Developers, Priced for Scale

Microsoft is emphasizing price-performance leadership with aggressive pricing:

Transcription starting at $0.36/hour
Voice generation at $22 per 1M characters
Image generation starting at $5 per 1M tokens

👉 The goal: make high-end AI models accessible at scale for real-world production.
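
To make the pricing above concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the three "starting at" rates quoted in this article, so the result should be read as a lower bound rather than an actual bill. The `MonthlyUsage` class, its field names, and the sample workload are hypothetical and purely illustrative; they are not part of any Microsoft API or price sheet.

```python
from dataclasses import dataclass

# "Starting at" rates as quoted in the article (USD). Real pricing may
# vary by tier, region, or usage, so treat these as a floor.
TRANSCRIBE_USD_PER_HOUR = 0.36
VOICE_USD_PER_MILLION_CHARS = 22.0
IMAGE_USD_PER_MILLION_TOKENS = 5.0


@dataclass
class MonthlyUsage:
    """Hypothetical monthly workload; field names are illustrative."""
    audio_hours: float      # hours of audio sent to speech-to-text
    tts_characters: int     # characters sent to text-to-speech
    image_tokens: int       # tokens billed for image generation


def estimate_floor_cost(usage: MonthlyUsage) -> dict[str, float]:
    """Return a per-model and total cost estimate in USD, rounded to cents."""
    costs = {
        "transcription": usage.audio_hours * TRANSCRIBE_USD_PER_HOUR,
        "voice": usage.tts_characters / 1_000_000 * VOICE_USD_PER_MILLION_CHARS,
        "image": usage.image_tokens / 1_000_000 * IMAGE_USD_PER_MILLION_TOKENS,
    }
    costs["total"] = sum(costs.values())
    return {name: round(value, 2) for name, value in costs.items()}


if __name__ == "__main__":
    # Assumed example workload: 1,000 audio hours, 5M characters, 2M tokens.
    sample = MonthlyUsage(audio_hours=1_000, tts_characters=5_000_000, image_tokens=2_000_000)
    for name, usd in estimate_floor_cost(sample).items():
        print(f"{name:>13}: ${usd:,.2f}")
```

For the assumed workload, the estimate comes to $360 for transcription, $110 for voice, and $10 for images, or $480 in total at the quoted starting rates.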

Available Now in Microsoft Foundry

All three models are integrated into Microsoft Foundry, the company’s AI platform for building and deploying applications.

Developers can:

Access models via APIs
Build multimodal applications (voice + text + image)
Deploy AI agents with built-in governance and security

Microsoft is also rolling these models into its own products like Copilot, Teams, Bing, and PowerPoint, signaling rapid real-world adoption.

A Direct Challenge to AI Rivals

This launch clearly positions Microsoft against:

OpenAI (GPT, Codex)
Google (Gemini, Veo, Gemma)
Anthropic (Claude)

Industry reports describe this as part of Microsoft’s broader effort to reduce reliance on external AI providers and build proprietary models.

👉 In short: Microsoft is no longer just a platform for AI—it’s becoming a model creator at scale.

Microsoft’s MAI models reflect a growing trend:

👉 Tech giants are building end-to-end AI stacks—models, infrastructure, and applications

We’re seeing:

Google → Gemini + Gemma + Veo
OpenAI → GPT + Agents + Codex
Microsoft → MAI + Copilot + Foundry

The competition is shifting from who has the best model to who owns the full AI ecosystem.

Source: Microsoft