Lower the limits! Your services are overloaded!

Hello,

Since Kimi K2.5 is no longer available, the other models seem to be under heavy load. I’ve noticed that some of them generate unusually long responses, especially during the “thinking” phase, and the output sometimes resembles flooding (repetitions, excessive character output, etc.), similar to the “!” outputs we previously saw with Kimi K2.6.

This significantly hurts readability and the overall user experience. It might be worth assessing the impact of certain multi-agent systems, such as OpenClaw and similar solutions, which could be contributing to service overload, particularly on Nvidia NIM, and amplifying this issue.

Some form of regulation or targeted limitations could help improve overall stability and response quality.

Please stop extending the request limit for some users. Lower the limits! Your services are overloaded!

Thank you for your attention.
