Clarification on NVIDIA embedding/reranker API access and costs

Hi NVIDIA team,

We’re using these endpoints in our config:

Questions:

  1. Do these models require a paid NVIDIA API key, or is there a free tier that can be used indefinitely? If there’s a free tier, what are the limits (requests/day, rate limits, model availability)?

  2. If they are paid, what are the costs and where can we find pricing details?

  3. For chat models (e.g., using ChatGPT-4 via NVIDIA’s integrate API), does chat access work immediately with the same key, or is any extra activation/allow-listing required?

Thank you!

Are you running your embedding/reranker model locally? If so, it’s free; it only consumes the computing power of your local GPU.

If you are using Brev, please refer to this; if you are using other cloud services, please consult your service provider. NVIDIA does not directly provide API services.

You need to provide a valid ChatGPT-4 API key in your .env file.

# .env file

NVIDIA_API_KEY=abc123***
NGC_API_KEY=def456***
DISABLE_CV_PIPELINE=true # Set to false to enable CV
INSTALL_PROPRIETARY_CODECS=false # Set to true to enable CV

Refer to this link.
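As a quick sanity check before starting the services, the .env file above can be read and checked for empty keys. This is only a minimal sketch: the helper names `parse_env` and `missing_keys` are made up for illustration, and treating both `NVIDIA_API_KEY` and `NGC_API_KEY` as required is an assumption, so adjust the list to whatever your deployment actually needs.

```python
# Hypothetical helpers, not part of any NVIDIA tooling.

def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks, comment lines, and trailing ' # ...' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop an inline comment such as "true # Set to false to enable CV".
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def missing_keys(values, required=("NVIDIA_API_KEY", "NGC_API_KEY")):
    """Return the required keys that are absent or empty (the required list is an assumption)."""
    return [key for key in required if not values.get(key)]


# Sample content copied from the reply above; the key values are placeholders.
sample = """#.env file
NVIDIA_API_KEY=abc123***
NGC_API_KEY=def456***
DISABLE_CV_PIPELINE=true # Set to false to enable CV
INSTALL_PROPRIETARY_CODECS=false # Set to true to enable CV
"""

config = parse_env(sample)
print("Missing keys:", missing_keys(config))
```

In practice you would pass the contents of your real .env file instead of `sample`. The check only confirms that the keys are present; it does not verify that a key is valid or has access to a given model.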

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.