NVIDIA Maxine Release Highlights
The Maxine December 2024 release includes exciting updates to help you accelerate your application with NVIDIA technologies.
For more information on Maxine SDKs, NIM Microservices, and Microservices, along with the details of this release, please visit the NVIDIA NGC Catalog.
What’s New in Maxine SDKs?
- All Linux SDKs updated with TensorRT 10.4.0.26, CUDA 12.1.1, and cuDNN 8.9.7
- All Windows SDKs updated with TensorRT 10.4.0.26 and CUDA 12.1.1
- Windows SDKs (available as part of the Maxine Early Access Program in this release) now support per-feature and per-GPU model downloads from NGC, which means models are no longer included in the package, leading to significant savings in package size.
- These updates aim to enhance performance and reduce the overall footprint of the SDKs, making them more efficient and easier to manage.
- Updated GPU support matrix: see the supported GPUs at the end of this document.
Maxine Audio Effects SDK - v1.6.0
- Included with NVAIE license:
- Linux SDKs
- New features now publicly available:
- BNR 2.0:
- Improved perceptual quality for the BNR 2.0 16 kHz and 48 kHz models
- Better noise suppression in non-speech segments
- VISQOL score improvements reflect enhanced audio quality
- v1.5.2: 16 kHz scored 2.93, 48 kHz scored 3.33
- v1.6.0: 16 kHz scored 3.33, 48 kHz scored 3.43
- Studio Voice:
- Improved quality for the Studio Voice Low Latency (48 kHz) model, which enhances speech recorded through low-quality microphones in noisy and reverberant environments to studio-recorded quality
- NISQA scores improved to 4.43 (v1.5.2: 4.28, higher is better)
- Algorithmic latency reduced to 80 ms (v1.5.2: 90 ms)
Included in the Maxine Early Access Program:
- Windows and Linux SDKs
- All features available under NVAIE license, and additional features only available in the Early Access SDK:
- Voice Font: converts the input voice to match the reference speaker’s voice while keeping linguistic information and prosody unchanged
- Speaker Focus: identifies and isolates the primary speaker and removes all other speakers from the input audio. This significantly improves the intelligibility of the primary speaker’s voice when others are speaking in the background
Maxine Video Effects SDK - v0.7.5
- Included with NVAIE license:
- Linux SDKs
- Feature improvements:
- Video noise reduction: improved color noise removal and preservation of fine details
- Virtual background: two new modes offered to segment the chair as foreground or background
- Included in the Maxine Early Access Program:
- Windows and Linux SDKs
- All features available under NVAIE license and the following new feature only available in the Early Access SDK:
- Video Relighting: this innovative feature is designed to enhance lighting for video conferencing and content creation, providing a more professional and polished appearance
- Enhanced Lighting: Video Relighting allows you to adjust and improve the lighting on yourself in videos according to any given HDRi, ensuring that you always look your best, regardless of your environment.
- Real-time Performance: The Relighting feature has been optimized to deliver real-time performance on high-end GPUs such as the GeForce RTX 4090 and L40, providing a smooth and seamless experience.
- Superior Image Quality: With Relighting, you can expect improved clarity and contrast in your videos. The feature also reduces light leaks and unwanted shadows on your face and body.
Maxine Augmented Reality SDK - v0.8.6
- Included with NVAIE license:
- Linux SDKs
- Feature improvements:
- Eye Contact: model updates that improve feature performance by ~35%
- Audio2Face-2D:
- Improved animation fidelity and robustness including more stable head pose and gaze, as well as face shape preservation
- Improved performance of quality mode bringing real-time support to A40 GPUs in addition to L40 GPUs while reducing latency by ~30%
- Introduced multi-stream support with Triton
- Included in the Maxine Early Access Program:
- Windows and Linux SDKs
- All features available under NVAIE license and the following new feature only available in the Early Access SDK:
- Maxine 3D: One-shot AI solution to infer and render photo-realistic 3D representations from a single portrait image. Effortlessly transform your face from 2D video into 3D rendered on any display: standard 2D displays, stereoscopic, or light field 3D displays
What’s New in Maxine NIM Microservices?
NVIDIA NIM offers prebuilt containers for AI models across computer vision, audio, LLMs, and more. Each NIM consists of a container and a model and uses a CUDA-accelerated runtime for all NVIDIA GPUs, with special optimizations available for many configurations. Whether on-premises or in the cloud, NIM is the fastest way to achieve accelerated inference at scale.
- New Studio Voice NIM: provides accelerated performance for real-time speech enhancement through a state-of-the-art AI model. It enhances speech recorded through low-quality microphones in noisy and reverberant environments to studio-recorded quality.
SDK GPU Support
Linux: Volta, Turing, Ampere, Ada, and Hopper
Windows: Ada, Ampere, and Turing
*Please note: Video Relighting and Maxine 3D features in the Linux SDKs are not currently supported on A2 and A16 GPUs. GTX Turing GPUs (e.g., GTX 1650) are not supported. T-series professional cards (e.g., T400, T600) are not supported.
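
For deployment scripts that need to check a target machine against the support list above, the matrix can be encoded as a small lookup. The following is only a minimal sketch: the function name `is_supported`, the dictionary names, and the idea of passing the architecture and model as plain strings are assumptions made for illustration and are not part of any Maxine SDK API. The unsupported-model set contains only the example cards named in the note, not every GTX Turing or T-series card.

```python
# Hypothetical helper, not part of the Maxine SDKs.
# Architectures listed under "SDK GPU Support" in these release notes.
SUPPORTED_ARCHITECTURES = {
    "linux": {"Volta", "Turing", "Ampere", "Ada", "Hopper"},
    "windows": {"Turing", "Ampere", "Ada"},
}

# Only the example cards named in the note (GTX Turing and T-series);
# the note excludes the whole classes, so extend this set as needed.
UNSUPPORTED_MODELS = {"GTX 1650", "T400", "T600"}

# Linux-only feature exclusions stated in the note.
LINUX_FEATURE_EXCLUSIONS = {
    "Video Relighting": {"A2", "A16"},
    "Maxine 3D": {"A2", "A16"},
}


def is_supported(platform, architecture, model=None, feature=None):
    """Return True if the release notes list this combination as supported."""
    platform = platform.lower()
    archs = SUPPORTED_ARCHITECTURES.get(platform)
    if archs is None or architecture not in archs:
        return False
    if model in UNSUPPORTED_MODELS:
        return False
    if platform == "linux" and feature is not None:
        if model in LINUX_FEATURE_EXCLUSIONS.get(feature, set()):
            return False
    return True


if __name__ == "__main__":
    print(is_supported("linux", "Hopper"))
    print(is_supported("windows", "Volta"))
    print(is_supported("linux", "Ampere", model="A2", feature="Maxine 3D"))
```

Keeping the exclusions in separate tables from the architecture list mirrors how the note is written: a broad per-platform list followed by specific exceptions.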