Hi @technoul.official, thanks for the detail.
This is a NIM question rather than a TensorRT one, and rate-limit changes aren’t granted via forum posts in any category. There’s a pinned post in NVIDIA NIM > Access/Accounts that lays out the policy and the realistic options (self-host the NIM container, or move from trial to AI Enterprise):
It's worth reading end to end before you take any next step. I'm moving this thread over to that subcategory so the NIM team sees it in their queue:
In the meantime, a practical workaround for the RemoteProtocolError stream drops: most agent frameworks support per-request retries with exponential backoff, plus a token-bucket rate limiter set just under 40 RPM. Wrapping the tool-call layer with both should cut the stream-drop rate to near zero, even at the free-tier ceiling.
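If your framework doesn't expose those knobs, here is a rough sketch of the same idea in plain Python. Everything in it is illustrative rather than taken from any NIM or framework API: `TokenBucket`, `call_with_retry`, and `StreamDropped` are made-up names, and 38 RPM is just one assumed value for "just under 40".

```python
import random
import time


class StreamDropped(Exception):
    """Hypothetical stand-in for whatever error your client raises on a dropped stream."""


class TokenBucket:
    """Token bucket: refills at rate_per_minute and holds at most `capacity` tokens."""

    def __init__(self, rate_per_minute, capacity=1, clock=time.monotonic, sleep=time.sleep):
        self.fill_rate = rate_per_minute / 60.0  # tokens added per second
        self.capacity = capacity
        self.tokens = float(capacity)
        self.clock = clock
        self.sleep = sleep
        self.last = clock()

    def acquire(self):
        """Block until one token is available, then consume it."""
        while True:
            now = self.clock()
            refill = (now - self.last) * self.fill_rate
            self.tokens = min(self.capacity, self.tokens + refill)
            self.last = now
            # Small tolerance so float rounding cannot leave us a hair short forever.
            if self.tokens >= 1 - 1e-9:
                self.tokens -= 1
                return
            self.sleep((1 - self.tokens) / self.fill_rate)


def call_with_retry(fn, bucket, retry_on, max_attempts=5,
                    base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call fn() under the rate limit; retry on `retry_on` with exponential backoff."""
    for attempt in range(max_attempts):
        bucket.acquire()  # every attempt, including retries, spends a token
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))  # jitter to avoid lockstep retries
```

To use it, create one shared `bucket = TokenBucket(38)` and route every model call through `call_with_retry(lambda: client_call(...), bucket, retry_on=StreamDropped)`, where `client_call` is a placeholder for your own request function. Swap `StreamDropped` for the exception your stack actually raises (for an httpx-based client that would be `httpx.RemoteProtocolError`). Note that a retry re-issues the whole request, so any partially streamed output from the failed attempt has to be discarded.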
Thanks, Atharva