Request for Rate Limit Increase – NVIDIA NIM (40 → 200 RPM) – Hermes Agent Personal Development

Hello NVIDIA Support Team,

I am writing to request a rate limit increase for my NVIDIA NIM API account to support my personal development work. I am an AI enthusiast building a personal knowledge management and autonomous agent system by orchestrating Hermes Agent with an Obsidian vault backend.

Current limit: 40 RPM
Requested limit: 200 RPM (or the next available tier for individual developer use)

The current 40 RPM limit causes frequent RemoteProtocolError stream drops during agentic workflows because Hermes makes multiple API calls per conversation turn. This severely hinders development progress and system debugging.

This request is solely for personal development/testing, not for commercial production use.

Thank you for your support!

Hi @technoul.official, thanks for the detail.

This is a NIM question rather than a TensorRT one, and rate-limit changes aren’t granted via forum posts in any category. There’s a pinned post in NVIDIA NIM > Access/Accounts that lays out the policy and the realistic options (self-host the NIM container, or move from trial to AI Enterprise).

Worth reading end-to-end before any next step. I’m moving this thread over to that subcategory so the NIM team sees it in their queue.

A practical workaround for the RemoteProtocolError stream drops in the meantime: most agent frameworks support per-request retries with exponential backoff, plus a token-bucket rate limiter set just under 40 RPM. Wrapping the tool-call layer with both should cut the stream-drop rate to near zero, even at the free-tier ceiling.
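If your framework doesn't expose those settings, here is a minimal standalone sketch of the same idea in Python. The names `TokenBucket` and `call_with_retry` are hypothetical (not part of Hermes Agent or any NVIDIA SDK), and the numbers are assumptions: 36 RPM refill with a burst of 3, so at most 39 requests can go out in any 60-second span. I don't know exactly how the NIM endpoint windows its limit, so treat the values as a starting point.

```python
import random
import time


class TokenBucket:
    """Holds up to `capacity` tokens, refilled at `rate_per_min` per minute.

    Single-threaded sketch: add a lock if several threads share one bucket.
    """

    def __init__(self, rate_per_min=36, capacity=3,
                 clock=time.monotonic, sleep=time.sleep):
        self.rate = rate_per_min / 60.0  # tokens per second
        self.capacity = capacity
        self.tokens = float(capacity)
        self.clock = clock
        self.sleep = sleep
        self.last = clock()

    def _refill(self):
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now

    def acquire(self):
        """Wait until one token is available, then consume it."""
        self._refill()
        if self.tokens < 1:
            # Sleep just long enough for the missing fraction to refill.
            self.sleep((1 - self.tokens) / self.rate)
            self._refill()
        self.tokens = max(0.0, self.tokens - 1)


def call_with_retry(fn, bucket, retry_on=(Exception,), max_attempts=5,
                    base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call fn() under the rate limiter, retrying with exponential backoff."""
    for attempt in range(max_attempts):
        bucket.acquire()  # every attempt, including retries, spends a token
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter keeps parallel agents from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

Usage would look like `call_with_retry(lambda: client_call(prompt), bucket, retry_on=(httpx.RemoteProtocolError,))`, where `client_call` stands in for whatever function issues the request. That assumes your client is built on httpx, which is where that exception class is defined. Note that a retry restarts the whole request, so any partially received stream has to be discarded rather than resumed.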

Thanks, Atharva