![Mistral 3]()
Credit: Mistral
Today, Mistral announced Mistral 3, a major upgrade to its open model ecosystem featuring three new small dense models (3B, 8B, 14B) and Mistral Large 3, a 41B-active, 675B-total-parameter Mixture-of-Experts model—the company’s most capable release yet. All models are available under the Apache 2.0 license, reinforcing Mistral’s commitment to open, customizable AI for developers and enterprises.
## Mistral Large 3: A Frontier-Class Open Model
Trained on 3,000 NVIDIA H200 GPUs, Mistral Large 3 reaches parity with leading instruction-tuned open-weight models and introduces:
- Sparse MoE architecture: 41B active parameters, 675B total
- Multimodal understanding, including image comprehension
- Best-in-class multilingual performance beyond English and Chinese
- Full open-weight release in both base and instruct variants
Large 3 debuts as #2 among OSS non-reasoning models on the LMArena leaderboard and will soon be joined by a dedicated reasoning version.
To support efficient deployment, Mistral is releasing an NVFP4 checkpoint (produced with llm-compressor) optimized for vLLM, enabling the model to run on a Blackwell NVL72 system or even a single node with 8×A100 or H100 GPUs.
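As a rough sketch of that single-node path (the checkpoint name and flags below are illustrative assumptions, not names confirmed in the announcement), serving the NVFP4 checkpoint with vLLM's OpenAI-compatible server would follow the usual pattern:

```shell
# Hypothetical checkpoint ID -- check Mistral's Hugging Face org for the actual NVFP4 release.
# --tensor-parallel-size 8 shards the model across all eight GPUs in the node.
vllm serve mistralai/Mistral-Large-3-Instruct-NVFP4 \
  --tensor-parallel-size 8
```

vLLM generally detects llm-compressor (compressed-tensors) quantization from the checkpoint's own config, so an explicit quantization flag is usually unnecessary.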
## Accelerated Through Deep Collaboration With NVIDIA, vLLM & Red Hat
Mistral 3 models were trained fully on NVIDIA Hopper GPUs, leveraging HBM3e memory for frontier-scale training. NVIDIA’s extensive co-design approach enabled:
- Optimized inference with TensorRT-LLM and SGLang
- Custom kernels for Blackwell attention and MoE
- Support for disaggregated prefill/decode and speculative decoding
- Scalable deployments from data center to edge via DGX Spark, RTX GPUs, laptops, and Jetson
Red Hat and vLLM partnered with Mistral to improve accessibility and open-source deployment through optimized serving stacks and community-ready tooling.
## Ministral 3: State-of-the-Art Intelligence at the Edge
The new Ministral family (3B, 8B, 14B) includes base, instruct, and reasoning variants—each with native multimodal capabilities. Designed for performance, efficiency, and edge deployment, Ministral 3 models offer:
- Best cost-to-performance ratio among open-source models
- Substantially fewer generated tokens on real-world tasks
- Reasoning variants that score up to 85% on AIME 2025 (14B model)
This makes the Ministral series ideal for laptops, mobile devices, robotics, and local/private workloads.
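To make the local-deployment story concrete, here is one minimal serving sketch (the model ID is illustrative, not a confirmed checkpoint name), using vLLM's OpenAI-compatible server and a plain curl query:

```shell
# Hypothetical model ID; substitute the actual Ministral 3 checkpoint once published.
vllm serve mistralai/Ministral-3-8B-Instruct --max-model-len 8192 &

# Once the server is up, query the OpenAI-compatible chat endpoint.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "mistralai/Ministral-3-8B-Instruct",
        "messages": [{"role": "user", "content": "Summarize mixture-of-experts in one sentence."}]
      }'
```

Any OpenAI-compatible client (the Python `openai` SDK, LangChain, and similar) can point at the same local endpoint.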
## Available Everywhere Developers Build
Mistral 3 is available starting today, with NVIDIA NIM and AWS SageMaker support coming soon.
## Enterprise Customization With Mistral AI
Organizations can collaborate with Mistral for custom training, enabling:
- Domain-specific fine-tuning
- Knowledge integration from proprietary datasets
- Deployment optimization for unique environments
- Secure, large-scale enterprise rollout
Custom models provide deeper alignment and higher performance for specialized workloads.
## Why Mistral 3 Matters
Mistral 3 is designed for the future of open AI:
- Frontier performance, open access
- Multimodal + multilingual intelligence across 40+ languages
- Scalable architecture from 3B to 675B parameters
- Optimized for agentic workflows, reasoning, and tool use
From edge devices to hyperscale deployments, Mistral 3 brings state-of-the-art AI directly to developers and enterprises—without closed-source restrictions.