I just tried out the NIM endpoint and it seems extremely slow. Not sure why.