Did you watch NVIDIA CEO Jensen Huang’s keynote at COMPUTEX 2025?
Following those announcements, we’re excited to share that the NVIDIA AI Blueprint for video search and summarization (VSS), part of the NVIDIA Metropolis platform, is now generally available!
The VSS blueprint brings together generative AI, VLMs, LLMs, RAG, and media management services to create visually perceptive, interactive AI agents for advanced video analytics.
Here’s what’s new in the latest release:
- Single-GPU deployments and additional hardware support - Support for more GPUs and the ability to deploy the blueprint on a single GPU (H100, A100, H200).
- Audio transcription - Use a video’s audio to store speech-to-text transcripts for multimodal understanding of the scene.
- Multi-live stream and burst clip modes - Process hundreds of simultaneous live streams or stored video files.
- Computer vision pipeline - Enhance accuracy by tracking objects within the scene through zero-shot object detection and utilizing bounding boxes and segmentation masks with Set-of-Mark prompting.
- CA-RAG accuracy and performance improvements - Improved performance with batched summarization and entity extraction along with Graph-RAG optimizations.
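
As a rough illustration of the batched summarization idea in the last item, here is a minimal Python sketch. It is not the blueprint's implementation: the names `batched`, `summarize_in_batches`, and `toy_summarize` are hypothetical, and `toy_summarize` is only a stand-in for a real LLM call. The assumption is that per-chunk captions are grouped into fixed-size batches, each batch is summarized once, and the partial summaries are then combined.

```python
from typing import Callable, List


def batched(items: List[str], batch_size: int) -> List[List[str]]:
    """Split items into consecutive batches of at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def summarize_in_batches(
    captions: List[str],
    summarize: Callable[[List[str]], str],
    batch_size: int = 4,
) -> str:
    """Summarize each batch of captions, then combine the partial summaries."""
    partial = [summarize(batch) for batch in batched(captions, batch_size)]
    if not partial:
        return ""
    if len(partial) == 1:
        return partial[0]
    # A second pass merges the per-batch summaries into one result.
    return summarize(partial)


def toy_summarize(texts: List[str]) -> str:
    """Hypothetical placeholder for an LLM call: joins the inputs."""
    return " | ".join(texts)


if __name__ == "__main__":
    chunk_captions = ["forklift enters", "worker waves", "pallet dropped"]
    print(summarize_in_batches(chunk_captions, toy_summarize, batch_size=2))
```

Batching this way means one model call per group of chunks rather than one per chunk, which is the general reason batching tends to help throughput.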
Want to dive deeper? Check out the technical blog and explore the resources below:
Additional resources
- Preview the blueprint
- Deploy on NVIDIA Launchable. Watch the instructional video
- GitHub repo
- Documentation
- Announcement blog
