Hi there,
Is it possible to deploy the large models, such as llama-3.1-405b-instruct and nemotron-4-340b-reward, with NIM somewhere, or to get access to more credits for further experimentation with NeMo Curator? The credits didn't allow for much testing.
Hugging Face doesn't allow you to deploy NIMs for models of this size.
Is the solution here to deploy on a cloud provider with the self-hosted method?
Information about how to get further access to NIMs like this is not easy to find.
Thanks for your help & time
Hey @isaac.woodruff – yeah, deploying on a cloud provider with the self-hosted method would be our recommendation at the moment. You can take a look at the "Deploy with Helm" docs here: Deploying with Helm - NVIDIA Docs, which covers deploying multinode models like those two.
Let me know if you have any questions or run into any issues!
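For reference, a self-hosted multinode deployment along those lines looks roughly like the sketch below. The chart version, image repository, release name, and resource values are assumptions for illustration – check the "Deploying with Helm" docs for the exact chart and values your cluster needs:

```shell
# Sketch of a self-hosted NIM deployment via Helm (chart version and
# values below are illustrative assumptions -- see the NVIDIA docs).

# Fetch the NIM LLM Helm chart from NGC (requires an NGC API key).
helm fetch https://helm.ngc.nvidia.com/nim/charts/nim-llm-1.3.0.tgz \
  --username='$oauthtoken' --password="$NGC_API_KEY"

# Store the NGC API key as Kubernetes secrets for image and model pulls.
kubectl create secret docker-registry ngc-secret \
  --docker-server=nvcr.io --docker-username='$oauthtoken' \
  --docker-password="$NGC_API_KEY"
kubectl create secret generic ngc-api --from-literal=NGC_API_KEY="$NGC_API_KEY"

# Install with multinode settings; a 405B-class model spans several
# multi-GPU nodes, so the GPU count here is purely illustrative.
helm install llama-405b nim-llm-1.3.0.tgz \
  --set image.repository=nvcr.io/nim/meta/llama-3.1-405b-instruct \
  --set multiNode.enabled=true \
  --set resources.limits."nvidia\.com/gpu"=8
```

You'd run this against a Kubernetes cluster on your cloud provider with GPU nodes and the NVIDIA GPU Operator already set up.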
Thanks for the fast reply.
Are there any other fast solutions for deploying endpoints for these larger models, similar to what's available on Hugging Face for the smaller models? Something closer to "pay-as-you-go" endpoints would be more ideal than configuring multi-GPU clusters with tons of storage on a cloud provider…
@neal.vaidya
@calexiuk
I would appreciate it if we could buy more credits as well.