Model Limits

Hello, regarding the NVIDIA API catalog trial experience of NVIDIA NIM: are there any additional limits for these models besides the 40-requests-per-minute limit, such as a token limit or context window limit?

Hey @Madison89,

We don’t currently publish the limits for each model.

Best,

sophie

Thank you for addressing this matter.

To follow up, would the team be open to sharing whether there are plans to publish the models' limits publicly in the future?

I believe clarity on this would greatly benefit the community’s understanding and implementation efforts.

We do not plan to publish specific model limits, since these limits apply only to the APIs for trial experiences.

For unlimited usage of NVIDIA NIM, check out NVIDIA AI Enterprise, the hosted NIM endpoint providers (Together.ai, Baseten, Fireworks), or DGX Cloud Serverless Inference.
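As a practical note for anyone working within the trial's 40-requests-per-minute cap mentioned above: a client can throttle itself rather than waiting to hit the server-side limit. Below is a minimal sketch of a sliding-window rate limiter; the class name and parameters are illustrative, not part of any NVIDIA SDK.

```python
import time
from collections import deque

class RateLimiter:
    """Client-side throttle, e.g. for a 40 requests/minute trial cap."""

    def __init__(self, max_requests=40, window_s=60.0):
        self.max_requests = max_requests
        self.window_s = window_s
        self.calls = deque()  # timestamps of recent requests

    def wait_time(self, now):
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window_s:
            self.calls.popleft()
        if len(self.calls) < self.max_requests:
            return 0.0
        # Sleep until the oldest request in the window expires.
        return self.window_s - (now - self.calls[0])

    def acquire(self):
        # Call before each API request; blocks only when over the cap.
        delay = self.wait_time(time.monotonic())
        if delay > 0:
            time.sleep(delay)
        self.calls.append(time.monotonic())
```

You would call `limiter.acquire()` before each request to the trial endpoint; on top of this, it is still wise to retry on HTTP 429 responses, since client-side clocks and server-side accounting can drift.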

I see. Thank you very much for the clarification.