Mistral 3 arrives: an open, multimodal family from 3B to 675B parameters
Mistral AI has released Mistral 3, a new family of open-source models under Apache 2.0, spanning compact edge-friendly weights and a frontier-scale sparse model. The lineup includes three dense Ministral 3 models (3B, 8B, 14B) and a sparse mixture-of-experts, Mistral Large 3, with 41B active / 675B total parameters. All models are available today.
Mistral Large 3 — an open MoE trained at scale
Mistral Large 3 is a sparse MoE trained from scratch on 3,000 NVIDIA H200 GPUs. The release includes both base and instruction-fine-tuned checkpoints, with a reasoning variant to follow. After post-training, the model reaches parity with leading instruction-tuned open-weight models on general prompts, while also showing strong image understanding and multilingual conversation capabilities. On LMArena, Mistral Large 3 debuts at #2 in the OSS non-reasoning models category and #6 among OSS models overall: https://lmarena.ai/leaderboard/text
For deployment and experimentation, an NVFP4-quantized checkpoint produced with llm-compressor (https://github.com/vllm-project/llm-compressor) is also released, enabling efficient serving on Blackwell NVL72 systems and on a single 8×A100 or 8×H100 node using vLLM (https://github.com/vllm-project/vllm).
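As a rough sketch, the quantized checkpoint can be loaded through vLLM's offline LLM API; the repository id below is a placeholder, and the official checkpoint may require additional options (for example, a Mistral tokenizer mode) not shown here.

```python
# Minimal sketch: offline inference with vLLM on an 8-GPU node.
# The repository id is a placeholder/assumption; substitute the actual
# Mistral Large 3 NVFP4 checkpoint name from Hugging Face.
from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/Mistral-Large-3-NVFP4",  # hypothetical repo id
    tensor_parallel_size=8,                   # shard across 8x A100/H100
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(
    ["Summarize the Mistral 3 release in two sentences."], params
)
print(outputs[0].outputs[0].text)
```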
Industry co-optimization for training and inference
The release emphasizes co-design with industry partners. NVIDIA integration includes support for TensorRT-LLM (https://github.com/NVIDIA/TensorRT-LLM) and SGLang (https://github.com/sgl-project/sglang) for low-precision inference, plus Blackwell attention and MoE kernels and speculative decoding to improve long-context, high-throughput serving on GB200 NVL72 and similar platforms. Edge and local deployment paths are highlighted for DGX Spark (http://nvidia.com/en-us/products/workstations/dgx-spark/), RTX PCs and laptops (https://www.nvidia.com/en-us/ai-on-rtx/), and Jetson devices (https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-orin/).
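For the SGLang path, a minimal offline-engine sketch might look like the following; the model path and parallelism settings are assumptions, and low-precision details are typically picked up from the checkpoint's config rather than passed explicitly.

```python
# Rough sketch of serving via SGLang's offline engine.
# The model path below is a placeholder, not an official identifier.
import sglang as sgl

engine = sgl.Engine(
    model_path="mistralai/Mistral-Large-3-NVFP4",  # hypothetical repo id
    tp_size=8,                                     # tensor parallelism across 8 GPUs
)

out = engine.generate(
    "Describe speculative decoding in one sentence.",
    {"temperature": 0.7, "max_new_tokens": 128},
)
print(out["text"])
engine.shutdown()
```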
Ministral 3 — compact, multimodal, multilingual
The Ministral 3 series targets edge and local inference with 3B, 8B, and 14B sizes. Each size ships in base, instruct, and reasoning variants, and all variants include image understanding and native support for 40+ languages. The instruct models emphasize cost-performance efficiency, often generating fewer tokens while matching or exceeding comparable models. For accuracy-critical tasks, the Ministral 14B reasoning variant reports 85% on AIME ’25.
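For quick local experimentation with a Ministral 3 instruct model, a minimal Hugging Face Transformers sketch is given below; the repository id is hypothetical, and the multimodal checkpoints may require an image-text-to-text class or pipeline rather than plain text generation.

```python
# Minimal local-inference sketch for a Ministral 3 instruct model via Transformers.
# The repo id is a placeholder; the multimodal checkpoints may need an
# image-text-to-text pipeline instead of plain text generation.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="mistralai/Ministral-3-8B-Instruct",  # hypothetical repo id
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Give three edge-deployment tips for small LLMs."}]
result = pipe(messages, max_new_tokens=200)
print(result[0]["generated_text"][-1]["content"])  # assistant reply
```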
Availability, customization, and documentation
Mistral 3 is available now via multiple platforms, including:
- Mistral AI Studio: https://console.mistral.ai/home (a minimal API sketch follows this list)
- Amazon Bedrock and Azure Foundry
- Hugging Face — Large 3 & Ministral
- Modal: https://modal.com/docs/examples/ministral3_inference
- IBM WatsonX, OpenRouter, Fireworks, Unsloth AI (https://docs.unsloth.ai/new/ministral-3), and Together AI
- Coming soon: NVIDIA NIM and AWS SageMaker
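As an illustration of the hosted route via Mistral AI Studio, here is a minimal sketch using the official mistralai Python client; the model identifier is an assumed alias, so check the documentation pages below for the exact Mistral 3 model names.

```python
# Hedged sketch: calling a Mistral model through the Mistral AI API
# with the official Python client.
import os
from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

response = client.chat.complete(
    model="mistral-large-latest",  # assumed alias; a dated Mistral Large 3 id may also exist
    messages=[{"role": "user", "content": "What is new in Mistral 3?"}],
)
print(response.choices[0].message.content)
```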
Model documentation and technical resources:
- Ministral 3 3B-25-12: https://docs.mistral.ai/models/ministral-3-3b-25-12
- Ministral 3 8B-25-12: https://docs.mistral.ai/models/ministral-3-8b-25-12
- Ministral 3 14B-25-12: https://docs.mistral.ai/models/ministral-3-14b-25-12
- Mistral Large 3: https://docs.mistral.ai/models/mistral-large-3-25-12
- Technical governance materials: https://legal.mistral.ai/
For organizations pursuing tailored deployments, Mistral AI offers custom model training services: https://mistral.ai/solutions/custom-model-training
Community and next steps
Model checkpoints, instruction-tuned variants, and deployment tooling are provided to support research, engineering, and enterprise integration. Further resources and community channels are listed on the documentation pages and platform listings linked above.
Original source: https://mistral.ai/news/mistral-3
