Request for NVIDIA NIM API Rate Limit Increase (40 → 200 RPM)

Hi NVIDIA Team,

I am currently developing a personal multi-agent AI workflow system for research and learning purposes.

Current rate limit:
40 RPM

Requested:
200 RPM

Use case:

  • Multi-agent orchestration
  • RAG workflows
  • Memory compression
  • Tool calling
  • Planning agents
  • Summarization pipelines

The current 40 RPM limit is restrictive for concurrent agent workflows, because a single user request can trigger multiple lightweight inference calls.

I have already implemented:

  • Local rate limiting
  • Request queuing
  • Concurrency control
  • Caching
  • Context compression via summarization
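
For reference, the local rate limiting mentioned above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not my exact code: the class name SlidingWindowLimiter and its parameters are hypothetical, and the defaults of 40 calls per 60 seconds simply mirror the current limit.

```python
import collections
import time


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `period`-second window.

    Hypothetical helper for illustration; not part of any NVIDIA SDK.
    """

    def __init__(self, max_calls=40, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock  # injectable so the limiter can be tested
        self.calls = collections.deque()  # timestamps of recent calls

    def wait_time(self):
        """Seconds until the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.period - (now - self.calls[0])

    def try_acquire(self):
        """Record a call and return True if under the limit, else False."""
        if self.wait_time() > 0.0:
            return False
        self.calls.append(self.clock())
        return True
```

Each agent checks try_acquire() before sending an inference call; when it returns False, the call stays in the request queue for roughly wait_time() seconds instead of being sent to the API.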

This is for personal development and experimentation only, not for public commercial deployment.

My account email: 478347388@qq.com

Thank you for your consideration.