Some models return no output via NVIDIA API (IBM Granite, others) + no API usage dashboard visible

, ,

I created an API key from https://build.nvidia.com/ and successfully integrated it with Open WebUI.

Using this setup, I am able to access and get responses from some models, for example:

  • GPT-OSS 120B

  • Meta LLaMA models

However, I am unable to get any output from certain other models, such as:

  • IBM Granite (and a few others)

Additionally, I do not see any API usage dashboard in my NVIDIA account.
The only information shown is:

Your API Rate Limit: Up to 40 RPM

There is no visibility into:

  • Request counts

  • Token usage

  • Per-model usage or errors