NVIDIA ACE: Model Archive

Access previous NVIDIA ACE for Games models

Llama3.2-3B-Instruct

Agentic small language model that enables better role-play, retrieval-augmented generation (RAG) and function calling capabilities. This model is compatible with multi-vendor GPUs and CPUs.

Access Model Card

Download On-Device Model

Access Cloud Model

Documentation

Mistral-Nemo-Minitron Family

Agentic small language models that enable better role-play, retrieval-augmented generation (RAG) and function calling capabilities. They come in 8B, 4B and 2B parameter models to fit your VRAM and performance requirements. The on-device models are compatible with multi-vendor GPUs and CPUs.
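As a rough guide to choosing between the 2B, 4B and 8B sizes, the memory needed for the weights alone can be estimated as parameter count times bits per weight. The sketch below is a back-of-the-envelope calculation, not a figure from the model cards. The function name and the 16-bit and 4-bit widths are illustrative assumptions, since this page does not state which precisions these models ship in. It also ignores the KV cache, activations and runtime overhead, so real usage will be higher.

```python
def weight_memory_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB.

    Hypothetical helper for illustration; excludes KV cache,
    activations and runtime overhead.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Compare the three family sizes at two assumed precisions.
for size in (2, 4, 8):
    fp16 = weight_memory_gib(size, 16)  # assumed 16-bit weights
    q4 = weight_memory_gib(size, 4)     # assumed 4-bit quantized weights
    print(f"{size}B: ~{fp16:.1f} GiB at 16-bit, ~{q4:.1f} GiB at 4-bit")
```

Treat the result as a lower bound when checking a model against the VRAM available on the target GPU.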

Access Model Card

Download On-Device 2B Model

Download On-Device 4B Model

Download On-Device 8B Model

Documentation

Nemovision-4B-Instruct

Agentic vision-language model that combines visual understanding of on-screen elements and actions with reasoning for more context-aware responses. The on-device model is compatible with multi-vendor GPUs and CPUs.

Access Model Card

Download On-Device Model

Documentation

Riva TTS

Converts text output into natural, expressive voices in multiple languages in real time. Built for agentic workflows and compatible with multi-vendor GPUs and CPUs. The FP16 model offers higher accuracy than the Q4 model at the cost of higher VRAM usage.

Access Model Card

Download On-Device Model (FP16)

Download On-Device Model (Q4)

Access Cloud Model

Documentation

Whisper ASR

Takes an audio stream as input and returns a text transcript in real time. It is compatible with multi-vendor GPUs and CPUs.

Access Model Card

Download On-Device Model

Access Cloud Model

Documentation