With the influx of new users, almost all new models have become unusable, even light and fast ones like Qwen3.5-397B-A17B. Can we expect this to be resolved anytime soon? If not, could there at least be more transparency about the current per-model rate limits, either on each model's page or in some other way that sets expectations?
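In the meantime, the only client-side workaround I've found is to treat overload as a rate-limit signal and back off. Below is a minimal sketch, assuming the OpenAI-compatible endpoint at `integrate.api.nvidia.com` and that throttling surfaces as HTTP 429 (`openai.RateLimitError`); the model id is hypothetical, following the catalog's usual `vendor/model` naming:

```python
# Minimal retry-with-backoff sketch for an overloaded model.
# Assumptions: OpenAI-compatible NVIDIA endpoint, 429 on overload,
# and a hypothetical model id -- adjust all three to your setup.
import os
import time

from openai import OpenAI, RateLimitError

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # assumed endpoint
    api_key=os.environ["NVIDIA_API_KEY"],
)

def chat_with_backoff(prompt: str, model: str, max_retries: int = 5) -> str:
    """Retry a chat completion with exponential backoff on 429s."""
    delay = 1.0
    for attempt in range(max_retries):
        try:
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
            )
            return resp.choices[0].message.content
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error to the caller
            time.sleep(delay)
            delay *= 2  # double the wait each time we are throttled

print(chat_with_backoff("Hello", model="qwen/qwen3.5-397b-a17b"))
```

This helps smooth over brief spikes, but without published per-model limits there is no way to tune the delays sensibly, which is exactly why documented limits would help.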