Connection Refused to ports 8000 and 9234 while running VSS blueprint

I am trying to run the VSS blueprint on a g6e.24xlarge EC2 instance via the docker deployment.

Since a bug was reported in downloading the vila-1.5 model from within the containers, the recommendation was to either switch to NVILA or download the vila-1.5 model manually and then mount it into the container.

I chose the first option and switched to the NVILA model. docker compose comes up, but it errors out saying it cannot connect to applications that are supposed to be running on ports 8000 and 9234.
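
The relevant errors can be surfaced from the compose output like this ( a broad grep, since I have not pinned down which service emits them ):

  # Surface the connection errors and any mention of the two ports
  docker compose logs --no-color 2>&1 | grep -iE "connection refused|:8000|:9234"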

I cannot determine whether this is caused by switching the VLM or whether I am doing something wrong.

System Information

  • Hardware Platform (GPU model and numbers) : g6e.24xlarge AWS EC2 instance ( 96 vCPUs; 768 GiB RAM; 4x NVIDIA L40S GPU )
  • System Memory : 768 GiB
  • Ubuntu Version : 24.04
  • NVIDIA GPU Driver Version : 535.183.01
  • Issue Type : Potential Bug
  • container logs

Steps to Reproduce

  1. Download the AI Blueprint repository
  2. cd docker/remote_llm_deployment
  3. Edit .env with the appropriate values
    • NGC_API_KEY
    • NVIDIA_API_KEY
  4. Change the following values in .env ( the full edit is sketched after this list )
    • VLM_MODEL_TO_USE=nvila
    • MODEL_PATH=git:https://huggingface.co/Efficient-Large-Model/NVILA-15B
  5. docker compose up
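
For completeness, a sketch of the .env edits from steps 3 and 4 ( key values redacted; everything else left at its stock value ):

  # The only lines I changed in docker/remote_llm_deployment/.env
  NGC_API_KEY=<redacted>
  NVIDIA_API_KEY=<redacted>

  # Switch the VLM from vila-1.5 to NVILA
  VLM_MODEL_TO_USE=nvila
  MODEL_PATH=git:https://huggingface.co/Efficient-Large-Model/NVILA-15B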

Hi @shinen , we will analyze this as soon as possible. Also, which reference did you follow for this deployment method?

The aim was to get the blueprint running within our own infrastructure. While exploring the AI blueprint repository, we thought the docker-compose-based deployment would work well for us, but it is not as straightforward as we expected.

OK. Could you first check your security groups, referring to post #9?

Does that post not refer to authentication to NIMs? Is that relevant to this issue?

I do not think Security Groups should be the problem, because the whole setup runs inside docker on a dedicated docker network. No container was exposing port 8000; I am not sure about port 9234. I will have to get back onto the instance and check whether any container exposes port 9234.
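
For reference, this is roughly how I intend to check ( the network name is a placeholder, not something from the blueprint docs ):

  # Which containers, if any, publish these ports on the host?
  docker ps --filter "publish=8000" --format "{{.Names}}  {{.Ports}}"
  docker ps --filter "publish=9234" --format "{{.Names}}  {{.Ports}}"

  # Inspect the compose network to see which containers are attached
  # ( take the actual name from `docker network ls` )
  docker network inspect <blueprint-network>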

I will also expose ports 8000 and 9234 on the Security Groups and see if that resolves the issue.

I will report my observations of this experiment within a few hours.

I followed the Deploy_VSS_docker_Crusoe.ipynb to figure out which step I was missing.

And the step I was missing is described in the notebook, right at the top of the Deployment section:

We will be using Cosmos Nemotron VLM, which is part of the main container. All other models need to be set up before proceeding with the blueprint container. These include:

Now, in the launchable, I only had to add one NGC_API_KEY at the top of the notebook and was hands-off for the rest of it ( apart from executing the cells ). That makes me assume the NIMs should be downloadable with the same NGC_API_KEY. However, when I tried to download the NIMs ( docker pull ) with the same NGC_API_KEY I used for the blueprint container, all 3 containers failed to download with the error:

Error response from daemon: pull access denied for nvcr.io/nim/<nim-image-uri>, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
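
For reference, this is the login + pull sequence I was running ( logging in to nvcr.io with the literal username $oauthtoken and the NGC API key as the password is the standard NGC procedure; the image URI is elided above, so it stays a placeholder here ):

  # nvcr.io uses the literal username "$oauthtoken" with the NGC API key as password
  echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin

  # Retry the pull ( placeholder URI )
  docker pull nvcr.io/nim/<nim-image-uri>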

So now I understand why you linked the other post about checking subscriptions. My Enterprise account lists only the VSS EA and not the Developer Program. Is that why I am unable to download the NIMs?
However, that does not explain why the launchable did not have the same problem.


For clarity, my initial problem was that the blueprint could not connect to applications expected on ports 8000 and 9234.

The application that is supposed to be running on port 8000 is the LLM NIM, and the one that is supposed to be running on port 9234 is the embedding NIM.
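
Once those NIMs are actually up, I would expect probes like the following to succeed ( the /v1/models and /v1/health/ready paths are my assumption, based on the NIM containers exposing an OpenAI-compatible API, and the host port mapping may differ in the compose file ):

  # LLM NIM expected on port 8000 ( OpenAI-compatible API )
  curl -s http://localhost:8000/v1/models

  # Embedding NIM expected on port 9234
  curl -sf http://localhost:9234/v1/health/ready && echo "embedding NIM ready"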


Now, the problem is that I cannot figure out how to download the LLM, Embedding and Re-ranker NIMs using the same NGC_API_KEY.

Yes, that might be the reason. Also, keep an eye on whether your VSS subscription has expired. Since your initial log shows a problem downloading the model, your initial issue may also be caused by missing permissions:

[03/07/2025-15:21:43] [TRT-LLM] [E] Failed to load tokenizer from /root/.via/ngc_model_cache/NVILA-15B

Could you please point me to how to get the Developer Program listed in my NGC subscriptions?

Like the user in Build a Video Search and Summarization Agent - #10 by rafael54 , my developer.nvidia.com account lists the Developer membership, but it's not reflected on org.ngc.nvidia.com.

My subscription is valid until 2025-05-31.

That was due to my NGC_API_KEY being from the wrong account. I was attempting to use the API key from my personal account rather than the Enterprise account. Resolving that allowed me to download the blueprint containers.

I had assumed that was because I switched to the NVILA VLM rather than vila-1.5.
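
For anyone hitting the same thing: a quick way to catch a wrong-account key, assuming the NGC CLI is installed ( double-check the subcommands against your CLI version ):

  # Point the NGC CLI at the key you intend to use; it prompts for org/team
  ngc config set

  # Show which org/team the CLI is now configured against
  ngc config current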

We will confirm this process ASAP.