Hello NVIDIA API Support Team,
I am an individual developer working on a personal research project involving large-scale instruction/response dataset generation for a specific domain-expert fine-tuning pipeline. I am requesting a rate limit increase to help complete a one-time generation phase for my local model development.
Project Context: I am building a high-precision, domain-specific dataset (~100,000 entries) to fine-tune a local 7B-parameter model. My pipeline currently runs on moonshotai/kimi-k2-thinking, which is excellent for this task but has high per-request latency.
Technical Hurdle: Because the Kimi-K2 model has long internal “thinking” cycles, I need to run at high concurrency (20 workers) to keep the generation timeline reasonable. However, the 40 RPM ceiling on the developer tier causes frequent 429 errors, which significantly extend the generation runtime.
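For context, each worker wraps its requests in a jittered exponential-backoff retry loop roughly like the sketch below (a simplified illustration; names such as `call_with_retry` and `RateLimitError` are placeholders, not types from the NVIDIA SDK):

```python
import random
import time


class RateLimitError(Exception):
    """Placeholder for the HTTP 429 exception raised by the API client."""


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    # Full-jitter exponential backoff: uniform in [0, min(cap, base * 2**attempt)].
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_retry(request_fn, max_attempts: int = 6):
    # request_fn is a zero-argument callable that raises RateLimitError on 429.
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except RateLimitError:
            time.sleep(backoff_delay(attempt))
    raise RuntimeError("exhausted retries after repeated 429 responses")
```

Even with this backoff in place, at 20 concurrent workers the 40 RPM ceiling means most of each worker's time is spent sleeping rather than generating.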
Request Details:
- Email: jsaghbini@outlook.com
- Current Limit: 40 RPM
- Requested Limit: 200 RPM
- Model: moonshotai/kimi-k2-thinking
- Workload: One-time “burst” generation run (ends once the dataset is complete)
- API Endpoint: https://integrate.api.nvidia.com/v1
This is a personal, non-commercial project focused on exploring the limits of specialized dataset generation. A higher limit would allow me to finish this one generation phase without 429-related interruptions.
Thank you very much for your time and for supporting individual developers on your platform.
Best regards,
Joseph Saghbini