Hello NVIDIA Support Team,
I am writing to request a rate limit increase for my NVIDIA NIM API account.
Account / Project Details:
- User / Organization Name: hogil.jo
- NVIDIA Account Email: hogil.jo@gmail.com
- API Key ID or Last 4 Characters: nvapi-**********Dmu
- Current Limit: 40 RPM
- Requested Limit: 200 RPM, or the next available tier for individual / developer use
Use Case:
I am using NVIDIA NIM models for Hermes Agent development and testing. The workflow includes:
- Personal / internal AI agent automation testing
- RAG and retrieval workflow experiments
- Model evaluation and debugging
- Scheduled automation jobs with controlled request frequency
Current Issue:
I am frequently receiving HTTP 429 errors:
RuntimeError: HTTP 429: Error code: 429 - {'status': 429, 'title': 'Too Many Requests'}
One example is a scheduled Hermes Agent cron job that currently runs every minute during a fixed time window. I understand this may create burst traffic, so I am also reducing concurrency and adding backoff / throttling.
Mitigations Planned:
- Reduce cron frequency where possible
- Limit concurrent requests
- Add sleep / exponential backoff after 429 responses
- Avoid unnecessary retries
- Cache repeated results
- Use the API for development, testing, and evaluation only
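For reference, the throttling and backoff items above could be sketched roughly as follows in Python. This is a minimal illustration under stated assumptions, not the actual Hermes Agent code: `send` is a hypothetical zero-argument callable that returns a `(status_code, body)` pair, and `Throttle`, `call_with_backoff`, and `RateLimitError` are illustrative names, not part of any NVIDIA SDK.

```python
import random
import time


class RateLimitError(RuntimeError):
    """Raised when a request is still rejected with HTTP 429 after all retries."""


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential backoff with full jitter: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_backoff(send, max_retries=5, base=1.0, cap=60.0, sleep=time.sleep):
    """Call send() and retry only on HTTP 429, sleeping between attempts.

    `send` is a hypothetical callable returning (status_code, body).
    Any non-429 response is returned immediately, so other errors are not retried.
    """
    for attempt in range(max_retries + 1):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_retries:
            sleep(backoff_delay(attempt, base, cap))
    raise RateLimitError(f"still rate limited after {max_retries} retries")


class Throttle:
    """Spaces calls at least 60/rpm seconds apart to stay under an RPM budget."""

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / rpm
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = 0.0

    def wait(self):
        """Block until the next request slot is available, then reserve it."""
        now = self.clock()
        if now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval
```

In this sketch, a job would call `Throttle(rpm=40).wait()` before each request and wrap the request itself in `call_with_backoff`, so that bursts from the cron job are spread out and 429 responses lead to progressively longer pauses rather than immediate retries.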
Reason for Increase:
The current 40 RPM limit is too low for realistic agent workflow testing, because a single agent task can trigger multiple model calls for planning, validation, retries, and tool-use steps. A higher limit would make development and debugging more stable while my usage remains controlled.
This request is for development / testing purposes only, not for public production traffic.
Thank you for your consideration.
Best regards,