Request for Rate Limit Increase – NVIDIA NIM (40 → 200 RPM) – Hermes Agent Personal Development

Hi @technoul.official, thanks for the detail.

This is a NIM question rather than a TensorRT one, and rate-limit changes aren’t granted via forum posts in any category. There’s a pinned post in NVIDIA NIM > Access/Accounts that lays out the policy and the realistic options (self-host the NIM container, or move from trial to AI Enterprise):

Worth reading end-to-end before any next step. I’m moving this thread over to that subcategory so the NIM team sees it in their queue:

A practical workaround for the RemoteProtocolError stream drops in the meantime: most agent frameworks support per-request retries with exponential backoff, and many can be paired with a token-bucket rate limiter set just under 40 RPM. Wrapping the tool-call layer in both should cut the stream-drop rate to near zero, even at the free-tier ceiling.
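If your framework doesn't expose those knobs directly, here is a minimal stdlib-only sketch of the idea. Everything in it is illustrative rather than anything NIM or Hermes ships: `TokenBucket` and `call_with_retry` are hypothetical names, 36 RPM is just one choice of "a bit under 40", and `retryable` should be set to whatever exception your HTTP client actually raises on a dropped stream (for an httpx-based client that would likely be `httpx.RemoteProtocolError`). It is single-threaded; add a lock if several workers share one bucket.

```python
import random
import time


class TokenBucket:
    """Hypothetical helper: allow roughly `rate_per_minute` calls per minute."""

    def __init__(self, rate_per_minute, capacity=1,
                 clock=time.monotonic, sleep=time.sleep):
        self.rate = rate_per_minute / 60.0  # tokens added per second
        self.capacity = capacity            # maximum burst size
        self.tokens = float(capacity)       # start with a full bucket
        self.clock = clock
        self.sleep = sleep
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last check, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def acquire(self):
        """Block until one token is available, then consume it."""
        self._refill()
        while self.tokens < 1.0:
            # Sleep just long enough for the missing fraction of a token.
            self.sleep((1.0 - self.tokens) / self.rate)
            self._refill()
        self.tokens -= 1.0


def call_with_retry(fn, bucket, retryable=(Exception,), max_attempts=5,
                    base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call fn() under the rate limit, retrying with exponential backoff."""
    for attempt in range(max_attempts):
        bucket.acquire()  # every attempt, including retries, spends a token
        try:
            return fn()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            # Exponential backoff with full jitter: 0..min(max_delay, base * 2^n)
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))


# Example wiring (assumed names; `send_request` stands in for your tool call):
# bucket = TokenBucket(rate_per_minute=36)
# result = call_with_retry(send_request, bucket, retryable=(ConnectionError,))
```

Note that each retry re-sends the whole request, so any partial stream output from the failed attempt should be discarded before retrying. Retries also count against the bucket, which keeps a burst of failures from pushing you over the limit.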

Thanks, Atharva