Hi Team,
We are currently experiencing 429 (Too Many Requests) response codes when using the NVIDIA NIM cloud API. Previously, the same integration was working smoothly in our project. However, we are now encountering this issue even with as few as 10 requests in a loop.
Could you please confirm if there have been any recent changes in rate limits or if there is an ongoing issue on your end? This will help us make the necessary adjustments in our implementation.
API Endpoint: https://integrate.api.nvidia.com/v1/chat/completions
Looking forward to your response.
Below attached a curl request to 10 request :
Hi,
We are currently seeing a range of issues with NGC services, including API Endpoints.
You can track the issue here https://status.ngc.nvidia.com/.
If your error persists once the issue has been resolved please reach back out and we will look into this for you.
Best,
Sophie
Hi Sophie,
The status shows as issues with NGC services are resolved but I still see issue when hitting multiple request to Nvidia API endpoints, I get 429 response code (Too many requests).
Please update if Nvidia NGC services are completely up and operational.
NGC Status
Response :
{“id”:“chat-2a7b029f403741aba5f99022486bcdc5”,“object”:“chat.completion”,“created”:1749628365,“model”:“meta/llama-3.1-8b-instruct”,“choices”:[{“index”:0,“message”:{“role”:“assistant”,“content”:“Hello! How can I assist you today?”},“logprobs”:null,“finish_reason”:“stop”,“stop_reason”:null}],“usage”:{“prompt_tokens”:11,“total_tokens”:20,“completion_tokens”:9},“prompt_logprobs”:null}{“status”:429,“title”:“Too Many Requests”}
{“status”:429,“title”:“Too Many Requests”}{“status”:429,“title”:“Too Many Requests”}{“status”:429,“title”:“Too Many Requests”}
{“status”:429,“title”:“Too Many Requests”}{“status”:429,“title”:“Too Many Requests”}{“status”:429,“title”:“Too Many Requests”}{“status”:429,“title”:“Too Many Requests”}{“status”:429,“title”:“Too Many Requests”}
Hi @sumit.mehta,
I’m trying to get clarity on whether there have been changes to the trial API for the model you are using.
In the mean time, you could try adding a sleep
command in your loop iterations.
(I see the same ‘429 Too many requests’ error when I run your code.)
Please note that the API Endpoints are only to be used for experimentation, development, testing and research. NVIDIA NIM FAQ
Best,
Sophie