Introducing Mistral NeMo, a 12B model built in collaboration with NVIDIA. It offers a context window of up to 128k tokens and delivers strong reasoning, world knowledge, and coding accuracy for its size category. Because it uses a standard architecture, Mistral NeMo is a drop-in replacement for Mistral 7B in existing systems. Both the pre-trained base and instruction-tuned checkpoints are released under the Apache 2.0 license. The model was trained with quantisation awareness, enabling FP8 inference without a loss in performance, and is designed for multilingual use across 11 languages. It also ships with Tekken, a new, more efficient tokenizer, and the instruction-tuned variant underwent an advanced fine-tuning and alignment phase to improve instruction following and code generation.
https://mistral.ai/news/mistral-nemo/