The NIM endpoints for Llama 3.1 405B are unreliable sometimes

When I call the API, sometimes no data comes back. Latency is higher than normal and it keeps getting slower. Is there a reason for this? It happens with the other endpoints too from time to time. Are there too many calls from other users at the same time?

Hey dipen1,

Just to confirm: Do you mean the endpoints available on build.nvidia.com?

If so, that’s likely due to load, exactly as you mentioned!
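
If load is the cause, retrying on the client side with exponential backoff may help smooth over transient failures. Here is a minimal sketch, assuming your request is wrapped in a zero-argument function that raises an exception on a timeout or error response. The function name, parameters, and defaults are illustrative and not part of any NVIDIA SDK:

```python
import random
import time


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep):
    """Call `call()` until it succeeds or `max_attempts` is reached.

    `call` is a hypothetical zero-argument function wrapping your API
    request; it should raise an exception on a timeout or error response.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff capped at max_delay, with full jitter so
            # that many clients do not all retry at the same moment.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
```

You would use it as `retry_with_backoff(lambda: call_endpoint(prompt))`, where `call_endpoint` stands for whatever function issues your request. Retries only help with transient errors; they won’t fix an endpoint that is down entirely.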

Is there a reliable version of this available? I thought I had enterprise access. Today the endpoints are not working at all.

Hi @dipen1, unfortunately the 405B is currently only being offered as a preview API, and we don’t have enterprise support for it.