Request for NVIDIA NIM API Rate Limit Increase (40 → 200 RPM) – Hermes Agent Development

Hello NVIDIA Support Team,

I am writing to request a rate limit increase for my NVIDIA NIM API account.

Account / Project Details:

  • User / Organization Name: hogil.jo
  • NVIDIA Account Email: hogil.jo@gmail.com
  • API Key ID or Last 4 Characters: nvapi-**********Dmu
  • Current Limit: 40 RPM
  • Requested Limit: 200 RPM, or the next available tier for individual / developer use

Use Case:
I am using NVIDIA NIM models for Hermes Agent development and testing. The workflow includes:

  1. Personal / internal AI agent automation testing
  2. RAG and retrieval workflow experiments
  3. Model evaluation and debugging
  4. Scheduled automation jobs with controlled request frequency

Current Issue:
I am frequently receiving HTTP 429 errors:

RuntimeError: HTTP 429: Error code: 429 - {'status': 429, 'title': 'Too Many Requests'}

One example is a scheduled Hermes Agent cron job that currently runs every minute during a fixed time window. I understand this can create burst traffic, so I am also reducing concurrency and adding backoff and throttling on my side.

Mitigation Already Planned:

  • Reduce cron frequency where possible
  • Limit concurrent requests
  • Add sleep / exponential backoff after 429 responses
  • Avoid unnecessary retries
  • Cache repeated results
  • Use the API for development, testing, and evaluation only
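
For reference, the backoff behavior is intended to look roughly like the Python sketch below. It is a minimal illustration only: RateLimitError and call_with_backoff are placeholder names of my own (not part of any NVIDIA SDK), and the retry count and delay values are assumed examples rather than final settings.

```python
import random
import time


class RateLimitError(Exception):
    """Placeholder for whatever exception the client raises on HTTP 429."""


def call_with_backoff(request_fn, max_retries=4, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call request_fn, retrying only on RateLimitError.

    The wait doubles after each 429 (1s, 2s, 4s, ...), is capped at
    max_delay, and gets up to 10% random jitter so that parallel jobs
    do not retry at the same moment.
    """
    for attempt in range(max_retries + 1):
        try:
            return request_fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # give up instead of retrying indefinitely
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

Only 429 responses trigger a retry, and the number of retries is bounded, so a failing job cannot generate an unbounded stream of requests.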

Reason for Increase:
The current limit is too low for realistic agent workflow testing, because a single agent task can trigger multiple model calls for planning, validation, retries, and tool-use steps. A higher RPM limit would make development and debugging more stable while keeping overall usage controlled.

This request is for development / testing purposes only, not for public production traffic.

Thank you for your consideration.

Best regards,