NVIDIA and Mistral co-developed Mistral NeMo 12B, a state-of-the-art model that excels across various benchmarks, including common sense reasoning, world knowledge, coding, math, and multilingual conversations. See benchmark details here.
- 128K Context Window: Dense transformer model with a 128K context length for enhanced understanding and processing of complex information.
- Training Data: Trained on Mistral's proprietary dataset, featuring a large proportion of multilingual and code data.
- Training Optimizations: Utilizes NVIDIA Megatron-LM, part of NVIDIA NeMo, for efficient large-scale training on NVIDIA DGX Cloud.
- Inference Optimizations: Enhanced with NVIDIA TensorRT-LLM engines for higher performance, including optimizations such as in-flight batching, KV caching, and FP8 support.
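To illustrate one of the optimizations named above, the sketch below shows the idea behind KV caching in a toy single-head attention loop: at each decoding step, the keys and values for earlier tokens are reused from a cache rather than recomputed. This is a minimal NumPy illustration of the concept, not TensorRT-LLM code.

```python
import numpy as np

def attend(q, K, V):
    # Single-head scaled dot-product attention of one query over all cached keys/values.
    scores = q @ K.T / np.sqrt(q.shape[-1])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

d = 8
rng = np.random.default_rng(0)
K_cache, V_cache, outputs = [], [], []
for step in range(4):
    k, v, q = rng.normal(size=(3, d))  # this step's key, value, and query
    K_cache.append(k)                  # append only the new key/value;
    V_cache.append(v)                  # the prefix is never recomputed
    outputs.append(attend(q, np.array(K_cache), np.array(V_cache)))
```

Each step does work proportional to the current sequence length instead of rebuilding the full attention state from scratch, which is why KV caching matters for autoregressive inference throughput.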
Deployment:
- NVIDIA NIM: Packaged as an NVIDIA NIM inference microservice, enabling streamlined deployment across platforms with high-throughput inference.
- Use Cases: Ideal for tasks such as document summarization, classification, multi-turn conversations, language translation, and code generation.
- Open Licensing: Available under the Apache 2.0 license, allowing customization and integration into commercial applications.
Getting Started:
Experience the Mistral NeMo NIM by visiting ai.nvidia.com, and use free NVIDIA cloud credits to test the model and build proofs of concept.
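NIM microservices expose an OpenAI-compatible HTTP API, so getting started can look like the sketch below, which builds a `/v1/chat/completions` request for a summarization prompt. The model id and the localhost endpoint in the comment are illustrative assumptions, not official values; check the NIM documentation for the exact identifiers.

```python
import json

def build_chat_request(model, user_prompt, max_tokens=256, temperature=0.3):
    """Build an OpenAI-compatible chat-completions payload for a NIM endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

# Hypothetical model id for illustration only.
payload = build_chat_request(
    "mistralai/mistral-nemo-12b-instruct",
    "Summarize the following document in three bullet points: ...",
)

# To send this against a locally deployed NIM container (port assumed):
#   import urllib.request
#   req = urllib.request.Request(
#       "http://localhost:8000/v1/chat/completions",
#       data=json.dumps(payload).encode(),
#       headers={"Content-Type": "application/json"},
#   )
#   print(urllib.request.urlopen(req).read().decode())
print(json.dumps(payload, indent=2))
```

The same payload shape works whether the model is served locally or through a hosted endpoint, since both speak the OpenAI-compatible chat-completions protocol.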