Mistral Unveils Mistral 3, a New Generation of Open Models Scaling From Edge to Frontier AI
Credit: Mistral

Today, Mistral announced Mistral 3, a major upgrade to its open model ecosystem featuring three new small dense models (3B, 8B, 14B) and Mistral Large 3, a 41B-active, 675B-total-parameter Mixture-of-Experts model—the company’s most capable release yet. All models are available under the Apache 2.0 license, reinforcing Mistral’s commitment to open, customizable AI for developers and enterprises.

Mistral Large 3: A Frontier-Class Open Model

Trained on 3,000 NVIDIA H200 GPUs, Mistral Large 3 reaches parity with leading instruction-tuned open-weight models and introduces:

  • Sparse MoE architecture: 41B active parameters, 675B total

  • Multimodal understanding, including image comprehension

  • Best-in-class multilingual performance beyond English and Chinese

  • Full open-weight release in both base and instruct variants

Large 3 debuts at #2 among open-source non-reasoning models on the LMArena leaderboard and will soon be joined by a dedicated reasoning variant.

To support efficient deployment, Mistral is releasing an NVFP4 checkpoint (quantized with llm-compressor) optimized for vLLM, enabling the model to run on a Blackwell NVL72 system or even on a single node with 8×A100 or H100 GPUs.
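For teams that want to try that single-node path, a minimal sketch using vLLM's offline Python API might look like the following; the checkpoint id is illustrative, not the official repo name, and the quantization scheme is auto-detected from the checkpoint's config:

```python
# Minimal sketch: serving a quantized Mistral Large 3 checkpoint with vLLM's
# offline API. The repo id below is a placeholder; check Hugging Face for the
# official NVFP4 checkpoint name.
from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/Mistral-Large-3-Instruct-NVFP4",  # hypothetical repo id
    tensor_parallel_size=8,  # shard the model across the node's 8 GPUs
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(
    ["Summarize the Mistral 3 release in two sentences."], params
)
print(outputs[0].outputs[0].text)
```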

Accelerated Through Deep Collaboration With NVIDIA, vLLM & Red Hat

Mistral 3 models were trained entirely on NVIDIA Hopper GPUs, leveraging HBM3e memory for frontier-scale training. NVIDIA's co-design work with Mistral enabled:

  • Optimized inference with TensorRT-LLM and SGLang

  • Custom kernels for Blackwell attention and MoE

  • Support for disaggregated prefill/decode and speculative decoding

  • Scalable deployments from data center to edge via DGX Spark, RTX GPUs, laptops, and Jetson

Red Hat and the vLLM project partnered with Mistral to make the models easier to deploy in open-source environments, contributing optimized serving stacks and community-ready tooling.

Ministral 3: State-of-the-Art Intelligence at the Edge

The new Ministral family (3B, 8B, 14B) includes base, instruct, and reasoning variants—each with native multimodal capabilities. Designed for performance, efficiency, and edge deployment, Ministral 3 models offer:

  • Best-in-class cost-to-performance ratio among open-source models

  • Substantially fewer generated tokens on real-world tasks, reducing latency and cost

  • Reasoning variants that score up to 85% on AIME 2025 (14B model)

This makes the Ministral series ideal for laptops, mobile devices, robotics, and local/private workloads.
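As an illustration of how lightweight that local path can be, here is a minimal sketch using Hugging Face Transformers; the repo id is a placeholder, so substitute the official Ministral checkpoint:

```python
# Minimal sketch: running a Ministral 3 instruct model locally with
# Hugging Face Transformers. The repo id below is hypothetical.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="mistralai/Ministral-3B-Instruct",  # hypothetical repo id
    device_map="auto",   # place weights on GPU if one is available
    torch_dtype="auto",  # use the checkpoint's native precision
)

messages = [
    {"role": "user", "content": "Plan a 3-step robot pick-and-place routine."}
]
out = pipe(messages, max_new_tokens=200)
# The pipeline returns the full chat; the last message is the model's reply.
print(out[0]["generated_text"][-1]["content"])
```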

Available Everywhere Developers Build

Mistral 3 launches today on:

  • Mistral AI Studio

  • Amazon Bedrock

  • Azure AI Foundry

  • Hugging Face (Large 3 + Ministral)

  • IBM watsonx

  • Modal

  • OpenRouter

  • Unsloth AI

  • Together AI

Coming soon: NVIDIA NIM and Amazon SageMaker.
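Because most of these hosts expose OpenAI-compatible endpoints, querying the model takes only a few lines. Here is a minimal sketch against OpenRouter; the model slug is an assumption, so check the provider's catalog for the exact id:

```python
# Minimal sketch: calling Mistral Large 3 through OpenRouter's
# OpenAI-compatible API. The model slug below is hypothetical.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

resp = client.chat.completions.create(
    model="mistralai/mistral-large-3",  # hypothetical slug
    messages=[{"role": "user", "content": "What's new in Mistral 3?"}],
)
print(resp.choices[0].message.content)
```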

Enterprise Customization With Mistral AI

Organizations can collaborate with Mistral for custom training, enabling:

  • Domain-specific fine-tuning

  • Knowledge integration from proprietary datasets

  • Deployment optimization for unique environments

  • Secure, large-scale enterprise rollout

Custom models provide deeper alignment and higher performance for specialized workloads.

Why Mistral 3 Matters

Mistral 3 is designed for the future of open AI:

  • Frontier performance, open access

  • Multimodal + multilingual intelligence across 40+ languages

  • Scalable architecture from 3B to 675B

  • Optimized for agentic workflows, reasoning, and tool use

From edge devices to hyperscale deployments, Mistral 3 brings state-of-the-art AI directly to developers and enterprises—without closed-source restrictions.