Request for NVIDIA NIM API Rate Limit Increase (40 → 200 RPM)

Hi @syedammarhussain69, thanks for the detailed write-up.

A quick correction first: the post above this one is an automated triage reply, and despite its phrasing (“I would recommend approving the rate limit increase to 200 RPM”) the bot has no authority to grant rate-limit changes and your request has not been approved. Please don’t read it as such. Apologies for the noise on that one.

On the actual question: this is a NIM question rather than a TensorRT one, and rate-limit changes aren’t granted via forum posts in any category. There’s a pinned post in NVIDIA NIM > Access/Accounts that lays out the policy and the realistic options (self-host the NIM container, or move from trial to AI Enterprise):

Worth reading end-to-end before any next step. I’m moving this thread over to that subcategory so the NIM team sees it in their queue:

In the meantime, agent frameworks in the OpenClaw style usually let you wrap the API client with a token-bucket rate limiter set just below 40 RPM, plus a retry with exponential backoff. Putting that around the parallel tool-call layer should cut 429 errors to near zero on the free tier.
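For reference, here is a minimal sketch of the token-bucket limiter and backoff retry mentioned above. It is not taken from any particular framework: `TokenBucket`, `call_with_backoff`, and `RateLimitError` are hypothetical names, `RateLimitError` stands in for whatever exception your client raises on an HTTP 429, and the 36 RPM default is just one arbitrary choice of "a bit below 40".

```python
import random
import threading
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for whatever your client raises on HTTP 429."""


class TokenBucket:
    """Thread-safe token bucket: tokens refill at rate_per_minute / 60 per second."""

    def __init__(self, rate_per_minute=36, capacity=1,
                 clock=time.monotonic, sleep=time.sleep):
        self.rate = rate_per_minute / 60.0   # tokens added per second
        self.capacity = float(capacity)      # maximum burst size
        self.tokens = float(capacity)        # start with a full bucket
        self.clock = clock
        self.sleep = sleep
        self.last = clock()
        self.lock = threading.Lock()

    def acquire(self):
        """Block until one token is available, then consume it."""
        while True:
            with self.lock:
                now = self.clock()
                # Refill according to elapsed time, capped at capacity.
                refill = (now - self.last) * self.rate
                self.tokens = min(self.capacity, self.tokens + refill)
                self.last = now
                if self.tokens >= 1.0:
                    self.tokens -= 1.0
                    return
                wait = (1.0 - self.tokens) / self.rate
            # Sleep outside the lock so other threads are not blocked.
            self.sleep(wait)


def call_with_backoff(fn, bucket, max_retries=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Run fn() under the bucket; on RateLimitError, retry with
    exponential backoff and full jitter, re-raising after max_retries."""
    for attempt in range(max_retries + 1):
        bucket.acquire()
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
```

To use it, share one `TokenBucket` across all parallel tool calls and route each request through `call_with_backoff(lambda: client_call(...), bucket)`, where `client_call` is your own function that raises `RateLimitError` on a 429. A capacity of 1 spaces requests evenly; a larger capacity allows short bursts, which can still trip the limit if the server counts requests in a fixed window.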

⚠️ Heads up: please go to https://org.ngc.nvidia.com/setup/personal-keys and rotate the API key whose last 4 characters you posted. Even partial key fragments are best kept out of public forum threads.

Thanks, Atharva