Thank you to everyone who attended our recent webinar, “Build Visual AI Agents With Generative AI and NVIDIA NIM.” We’ve compiled the most frequently asked questions from the session, along with detailed answers, to help clarify any additional points and provide further insights. Feel free to continue the conversation below! If you missed the webinar, you can still sign up here to watch it on demand.
Q: What is a NIM?
A: NVIDIA NIM™ is a set of easy-to-use microservices designed for secure, reliable deployment of high-performance AI model inferencing across the cloud, data center, and workstations. Supporting a wide range of AI models, including open-source community models and NVIDIA AI Foundation models, it ensures seamless, scalable AI inferencing, on premises or in the cloud, leveraging industry-standard APIs. All NIM microservices and associated preview APIs can be found at build.nvidia.com.
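To try a NIM before deploying anything, you can call its preview API directly. Here's a minimal sketch using the OpenAI-compatible endpoint; the model name shown is just one example, and the API key comes from your build.nvidia.com account:

```python
# Minimal sketch: calling a NIM preview API via its OpenAI-compatible endpoint.
# The model name is one example; substitute any model listed on build.nvidia.com.
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key="nvapi-...",  # your build.nvidia.com API key
)

completion = client.chat.completions.create(
    model="meta/llama3-8b-instruct",  # example model
    messages=[{"role": "user", "content": "What is a NIM microservice?"}],
    max_tokens=256,
)
print(completion.choices[0].message.content)
```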
Q: How do I get credits for build.nvidia.com?
A: All users can get started for free with the preview APIs on build.nvidia.com. Each new account can receive up to 5,000 credits to try out the APIs. To continue development after the credits run out, you can deploy the downloadable NIM microservices locally on your hardware or on a cloud instance. Developers can also access NIM through the NVIDIA Developer Program. Please see the details in this FAQ.
Q: Can I run NIM microservices using cloud services like AWS?
A: Yes! NIM microservices are Docker containers that can be deployed on any local or cloud system with compatible hardware. Guides for cloud deployment can be found on this page.
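Once a NIM container is running on your instance (following the deployment guide for your cloud), it exposes the same OpenAI-compatible API. A minimal sketch, assuming you deployed an LLM NIM and the container is listening on the default port 8000:

```python
# Minimal sketch: querying a self-hosted NIM (local machine or cloud VM).
# Assumes the container is up and listening on port 8000; the model name
# must match the NIM you deployed.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="meta/llama3-8b-instruct",  # the NIM you deployed
    messages=[{"role": "user", "content": "Hello from a self-hosted NIM!"}],
)
print(resp.choices[0].message.content)
```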
Q: Do I have to pay to use a downloadable NIM?
A: Downloadable NIM microservices require an NVIDIA AI Enterprise (NVAIE) license. To learn more and try it for free, visit this page.
Q: Where can I find information about the datasets used to train NIM models?
A: On build.nvidia.com, when you click into a NIM, a “Model Card” tab is available that describes model details such as the data used for training.
Q: Are the models safe and ethical? Are they tested for bias or discrimination?
A: Information related to model safety and ethics can be found on the “Model Card” tab of each NIM on build.nvidia.com. Additionally, NeMo Guardrails can be used to filter LLM responses.
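As a rough illustration of the Guardrails approach, the sketch below wraps an LLM with a rails configuration; it assumes you have a ./config directory containing your rails definitions (see the NeMo Guardrails docs for the configuration format):

```python
# Minimal sketch: filtering LLM responses with NeMo Guardrails.
# Assumes a ./config directory containing your rails configuration.
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")
rails = LLMRails(config)

response = rails.generate(
    messages=[{"role": "user", "content": "Tell me something offensive."}]
)
print(response["content"])  # the rails-filtered reply
```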
Q: Are there NIM microservices for healthcare use cases?
A: Yes, all healthcare-related NIM microservices can be found at build.nvidia.com/explore/healthcare.
Q: How can NIM microservices be trained or fine-tuned?
A: LLM NIM microservices support LoRA PEFT adapters trained with the NeMo Framework and Hugging Face Transformers libraries. The server supports dynamic multi-LoRA inference, enabling simultaneous inference requests that each target a different LoRA adapter, as sketched below. More details can be found on this page. Similar documentation will be made available when fine-tuning for Vision NIM microservices is available.
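As a rough sketch of what a multi-LoRA request looks like, the example below targets one adapter by name; the adapter name is hypothetical, and the exact request shape is documented on the page linked above:

```python
# Minimal sketch: dynamic multi-LoRA inference against a running LLM NIM.
# Assumes the NIM was started with LoRA adapters loaded; "my-lora-adapter"
# is a hypothetical name -- each request can target a different adapter.
import requests

payload = {
    "model": "my-lora-adapter",  # hypothetical LoRA adapter name
    "messages": [{"role": "user", "content": "Summarize this support ticket."}],
    "max_tokens": 128,
}
resp = requests.post("http://localhost:8000/v1/chat/completions", json=payload)
print(resp.json()["choices"][0]["message"]["content"])
```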
Q: What hardware is required to run NIM microservices and what is the expected performance?
A: The hardware requirements depend on the NIM. A guide to LLM NIM system requirements can be found on this page. System requirements for Vision NIM microservices will be published when those microservices become available for download.
Q: How do I get started using NIM microservices and VLMs?
A: To get started, visit build.nvidia.com to create an account and explore the available NIM microservices; a minimal first request to a Vision NIM preview API is sketched after the repository list below.
For reference applications with NIM microservices, visit the following GitHub repositories:
- NVIDIA/metropolis-nim-workflows: Collection of reference workflows for building intelligent agents with NIM microservices (github.com)
- NVIDIA/GenerativeAIExamples: Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture. (github.com)
- NVIDIA NIM Agent Blueprints (github.com)
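For a first Vision NIM request, here is a minimal sketch of sending an image to a VLM preview API. The endpoint, model, and payload shape are illustrative; copy the exact request from the model’s page on build.nvidia.com:

```python
# Minimal sketch: sending an image to a VLM preview API on build.nvidia.com.
# Endpoint and model are examples -- check the model page for the exact format.
import base64
import requests

with open("frame.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

resp = requests.post(
    "https://ai.api.nvidia.com/v1/vlm/nvidia/neva-22b",  # example endpoint
    headers={"Authorization": "Bearer nvapi-..."},  # build.nvidia.com API key
    json={
        "messages": [{
            "role": "user",
            "content": f'Describe this scene. <img src="data:image/jpeg;base64,{image_b64}" />',
        }],
        "max_tokens": 256,
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```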
Q: How can I get technical support when developing with NIM microservices?
A: The NIM developer forum is the best place to ask questions and engage with our developer community. You can access the forums at this page.
Q: How do I get started with VIA?
A: VIA is currently in early access. To get started with VIA, first apply for access on this page. Once accepted, the early access portal provides getting-started resources.
Q: What hardware is required to run VIA?
A: VIA has several deployment options and can use models running in the cloud, such as NIM microservices and OpenAI-compatible APIs. To run with cloud resources, your local system can use a consumer-level RTX GPU. To run VIA and all models locally, more powerful GPUs are needed, such as an A6000, L40S, A100, or H100.
Q: How fast can VIA summarize a video?
A: VIA is optimized to summarize long videos and performs parallel processing across all available GPUs. For example, on an 8xH100 cluster with the chunk size set to 60 seconds, VIA can summarize a 50-minute video in 50 seconds! For more details, check out this blog.
Q: How can I get technical support when developing with VIA?
A: The VIA developer forum is actively monitored and is the best place to ask questions. You can access the forums at this page.