Request for NVIDIA NIM API Rate Limit Increase (40 → 200 RPM) – Personal Assistant & Agentic Coding with Hermes

Hello NVIDIA Support Team,

I am writing to respectfully request a rate limit increase for my NVIDIA NIM API account.

  • My NVIDIA Account Email: ramnew2006@gmail.com
  • Current Limit: 40 RPM
  • Requested Limit: 200 RPM (or the next available tier for individual developer use)

Project Overview:
I am using the Hermes Agent (Nous Research) as my daily personal AI assistant and for agentic coding on personal projects. I primarily use strong reasoning models such as GLM-4.7 / GLM-5.1 via the NVIDIA NIM preview endpoints.

My typical workflows include:

  • Multi-step personal assistant tasks (planning, tool calling, memory management, research, and execution)
  • Autonomous coding agents for personal software projects (code generation, debugging, refactoring, testing loops)
  • Iterative reasoning chains that often require rapid sequential or parallel API calls

Issue with Current Limits:
The default limit of 40 RPM (on average, one request every 1.5 seconds) is quickly exhausted during normal usage, because a single agent task can issue many calls in quick succession. This triggers frequent 429 Too Many Requests errors, which significantly disrupt the agent’s flow and make productive work difficult, even with exponential backoff and careful task sequencing.
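
For reference, the kind of client-side exponential backoff mentioned above looks roughly like the sketch below. It is illustrative only and is not Hermes Agent's actual retry code; the names RateLimitError and call_with_backoff, and the default delays, are hypothetical choices made for this example.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 Too Many Requests response."""


def call_with_backoff(request, max_retries=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call request(), retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(max_retries + 1):
        try:
            return request()
        except RateLimitError:
            if attempt == max_retries:
                raise  # give up after the final retry
            # The delay ceiling doubles each attempt (1s, 2s, 4s, ...), capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter spreads retries so parallel calls do not retry in lockstep.
            sleep(random.uniform(0, delay))
```

Even with retries like these, every 429 adds waiting time to a multi-step task, so backoff reduces failures but does not restore throughput.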

Intended Use:
The increase would be used strictly for personal, non-commercial development and daily productivity. I am not using the API for production services or high-traffic applications.

I would greatly appreciate your considering an increase of my rate limit to 200 RPM. This would allow me to make full use of the excellent preview models (especially the GLM series and Nemotron variants) for meaningful agentic workflows.

Thank you for your time and for providing such a valuable free inference platform.

Best regards,
Ram Manohar