New models in the free API are always overloaded. Why not lower the RPM limit to 5-10 for the most popular models?

I understand we shouldn’t expect much from a free API intended for testing, but constant overload makes it impossible to use the new models at all. If providing more computing power isn’t feasible, why not lower the RPM limit to 5-10? A limit in that range would have little impact on regular users, but it would significantly reduce the potential for abuse of the free API.

This isn’t a complaint; I’m just curious if you plan to address the overload somehow.

Kimi K2.5 continues to be overloaded 24/7. It’s frustrating.

Same. I can use Kimi K2.5 via the web interface just fine, but the API endpoint doesn’t work for some reason.

More and more models are becoming overloaded every day. Is Nvidia really not going to do anything about it?

It looks like Nvidia has begun efforts to reduce the RPM for high-demand models. Hopefully, this will help make these models more accessible to everyone.