Mistral Unveils Mistral 3, a New Generation of Open Models Scaling From Edge to Frontier AI
Credit: Mistral

Today, Mistral announced Mistral 3, a major upgrade to its open model ecosystem featuring three new small dense models (3B, 8B, 14B) and Mistral Large 3, a 41B-active, 675B-total-parameter Mixture-of-Experts model—the company’s most capable release yet. All models are available under the Apache 2.0 license, reinforcing Mistral’s commitment to open, customizable AI for developers and enterprises.

Mistral Large 3: A Frontier-Class Open Model

Trained on 3,000 NVIDIA H200 GPUs, Mistral Large 3 reaches parity with leading instruction-tuned open-weight models and introduces:

  • Sparse MoE architecture: 41B active parameters, 675B total

  • Multimodal understanding, including image comprehension

  • Best-in-class multilingual performance beyond English and Chinese

  • Full open-weight release in both base and instruct variants

Large 3 debuts at #2 among open-source non-reasoning models on the LMArena leaderboard and will soon be joined by a dedicated reasoning variant.

To support efficient deployment, Mistral is releasing an NVFP4 checkpoint (quantized with llm-compressor) optimized for vLLM, enabling the model to run on a Blackwell NVL72 system or even on a single node with 8×A100 or H100 GPUs.
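For teams that want to try that single-node path, a minimal sketch using vLLM's offline Python API might look like the following; the checkpoint id is illustrative, not the official repo name, and the quantization scheme is auto-detected from the checkpoint's config:

```python
# Minimal sketch: serving a quantized Mistral Large 3 checkpoint with vLLM's
# offline API. The repo id below is a placeholder; check Hugging Face for the
# official NVFP4 checkpoint name.
from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/Mistral-Large-3-Instruct-NVFP4",  # hypothetical repo id
    tensor_parallel_size=8,  # shard the model across the node's 8 GPUs
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(
    ["Summarize the Mistral 3 release in two sentences."], params
)
print(outputs[0].outputs[0].text)
```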

Accelerated Through Deep Collaboration With NVIDIA, vLLM & Red Hat

Mistral 3 models were trained entirely on NVIDIA Hopper GPUs, leveraging HBM3e memory for frontier-scale training. NVIDIA's co-design work with Mistral enabled:

  • Optimized inference with TensorRT-LLM and SGLang

  • Custom kernels for Blackwell attention and MoE

  • Support for disaggregated prefill/decode and speculative decoding

  • Scalable deployments from data center to edge via DGX Spark, RTX GPUs, laptops, and Jetson

Red Hat and the vLLM project partnered with Mistral to make the models easier to deploy in open-source environments, contributing optimized serving stacks and community-ready tooling.

Ministral 3: State-of-the-Art Intelligence at the Edge

The new Ministral family (3B, 8B, 14B) includes base, instruct, and reasoning variants—each with native multimodal capabilities. Designed for performance, efficiency, and edge deployment, Ministral 3 models offer:

  • Best-in-class cost-to-performance ratio among open-source models

  • Substantially fewer generated tokens on real-world tasks, reducing latency and cost

  • Reasoning variants that score up to 85% on AIME 2025 (14B model)

This makes the Ministral series ideal for laptops, mobile devices, robotics, and local/private workloads.
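As an illustration of how lightweight that local path can be, here is a minimal sketch using Hugging Face Transformers; the repo id is a placeholder, so substitute the official Ministral checkpoint:

```python
# Minimal sketch: running a Ministral 3 instruct model locally with
# Hugging Face Transformers. The repo id below is hypothetical.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="mistralai/Ministral-3B-Instruct",  # hypothetical repo id
    device_map="auto",   # place weights on GPU if one is available
    torch_dtype="auto",  # use the checkpoint's native precision
)

messages = [
    {"role": "user", "content": "Plan a 3-step robot pick-and-place routine."}
]
out = pipe(messages, max_new_tokens=200)
# The pipeline returns the full chat; the last message is the model's reply.
print(out[0]["generated_text"][-1]["content"])
```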

Available Everywhere Developers Build

Mistral 3 launches today on:

  • Mistral AI Studio

  • Amazon Bedrock

  • Azure AI Foundry

  • Hugging Face (Large 3 + Ministral)

  • IBM watsonx

  • Modal

  • OpenRouter

  • Unsloth AI

  • Together AI

Coming soon: NVIDIA NIM and Amazon SageMaker.
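Because most of these hosts expose OpenAI-compatible endpoints, querying the model takes only a few lines. Here is a minimal sketch against OpenRouter; the model slug is an assumption, so check the provider's catalog for the exact id:

```python
# Minimal sketch: calling Mistral Large 3 through OpenRouter's
# OpenAI-compatible API. The model slug below is hypothetical.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

resp = client.chat.completions.create(
    model="mistralai/mistral-large-3",  # hypothetical slug
    messages=[{"role": "user", "content": "What's new in Mistral 3?"}],
)
print(resp.choices[0].message.content)
```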

Enterprise Customization With Mistral AI

Organizations can collaborate with Mistral for custom training, enabling:

  • Domain-specific fine-tuning

  • Knowledge integration from proprietary datasets

  • Deployment optimization for unique environments

  • Secure, large-scale enterprise rollout

Custom models provide deeper alignment and higher performance for specialized workloads.

Why Mistral 3 Matters

Mistral 3 is designed for the future of open AI:

  • Frontier performance, open access

  • Multimodal + multilingual intelligence across 40+ languages

  • Scalable architecture from 3B to 675B

  • Optimized for agentic workflows, reasoning, and tool use

From edge devices to hyperscale deployments, Mistral 3 brings state-of-the-art AI directly to developers and enterprises—without closed-source restrictions.