Hey @sicerden, I don’t know if this helps, but a moderator previously replied with this to someone asking for the same thing as you: “Many of you are using free tier API access to NVIDIA NIMs. This usually involves a rate limit that is dependent on model, use-case and the amount of current overall traffic using the same access. There is no official way to circumvent this rate limit or to receive a rate limit increase on that same tier. And specifically here on the forums we do not have any influence on those rate limits. To make full use of a NIM blueprint you will need to deploy it. For more details on NVIDIA NIM refer to…”
Basically, this means that on the free-tier API there is no way to get an RPM (requests per minute) increase. If you need higher limits, two things apply:
The forum is not the place to ask. Moderators have said it over and over again: RPM increases are not handled here, and there are other channels and processes for that.
You need to pay and deploy a model through NVIDIA NIM / NVIDIA Build if you require higher usage limits and production-level access.
First, go to the model you prefer, for example DeepSeek V4 Flash. In that section you’ll see three options: Experience / Model Card / Deploy.
Click on Deploy and you’ll see several options such as Partner Endpoints or Self-hosted Deployments.
From there, you choose the option you want, and it will show you the pricing and deployment costs.
You can start with DeepSeek V4 Flash, since it’s the cheapest one, to learn how the process works first. If you need help, you can always contact a moderator privately and ask for guidance. There’s no problem with sending a direct message saying you want to pay and deploy properly.
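One practical note while you’re still on the free tier: you can’t raise the limit, but your client can at least behave well when it hits it. Below is a minimal sketch of retrying with exponential backoff. It assumes the API signals the rate limit with the standard HTTP 429 status, which I haven’t verified for every model. The names `call_with_backoff` and `send` are hypothetical: `send` stands for whatever function performs your request and returns a `(status, body)` pair. This does not get around the limit, it just waits instead of hammering the endpoint.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() and retry while it returns HTTP 429 (assumed rate-limit signal)."""
    for attempt in range(max_retries + 1):
        status, body = send()
        if status != 429:
            # Success or a non-rate-limit error: hand it back to the caller.
            return status, body
        if attempt == max_retries:
            break
        # Wait base_delay * 2^attempt seconds, plus up to 10% random jitter
        # so that several clients do not all retry at the same instant.
        delay = base_delay * (2 ** attempt)
        sleep(delay + random.uniform(0, delay * 0.1))
    # Still rate limited after all retries: return the last 429 response.
    return status, body
```

To use it, wrap your own request function (hypothetical here) so that it takes no arguments, for example `call_with_backoff(lambda: my_request(prompt))`. The `sleep` parameter is only there so the waiting can be replaced in tests.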
I am not saying this to be toxic, rude, or disrespectful. My goal is simply to help you understand and follow NVIDIA’s rules and the guidance that has already been provided by moderators multiple times.
If you are already paying for NVIDIA services and have deployed models, then you should contact NVIDIA directly through the appropriate support channels, such as email or private communication with the relevant support team, rather than making RPM increase requests on the forum.