VIA Summarization Workflow ERROR

While running the VIA Summarization model I am getting the below mentioned error:
ERROR We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like sentence-transformers/all-MiniLM-L6-v2 is not the path to a directory containing a file named config.json.
ERROR Failed to load VIA pipeline - CA-RAG setup failed. Check if NVIDIA_API_KEY set correctly and/or LLM configuration in CA-RAG config is valid.

Please note:
I have downloaded the VITA 2.0 model from the web, placed the files inside the folder VIA/nvidia_tao_vita_2.0.1_vila-llama-3-8b-lita/, and exported NGC_MODEL_CACHE to point to the same location.

If I download the Hugging Face model offline, where should I keep these files?

You can refer to our guide Using Locally Deployed LLM NIM instead of NVIDIA Hosted LLM NIM to deploy it locally.

My model is not able to find the Hugging Face config file. I have downloaded the Hugging Face model offline; where can I place these files?

By default, the model will be placed in the path that you configure below:

export NGC_MODEL_CACHE=</SOME/DIR/ON/HOST>
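
For example (the directory below is only an illustration; any writable directory on the host works):

export NGC_MODEL_CACHE=/home/ubuntu/via-ngc-cache   # illustrative path, not a required location
mkdir -p "$NGC_MODEL_CACHE"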

We have manually downloaded both the VITA 2.0 model and the Hugging Face model (our network blocks the automatic download). The Hugging Face model is kept at /home/VIA/all-MiniLM-L6-v2, the VITA model is at /home/VIA/nvidia_tao_vita_2.0.1_vila-llama-3-8b-lita/, and NGC_MODEL_CACHE is set to /home/VIA/.

What changes should we make? It is giving the error below:

2024-10-21 12:52:51,150 ERROR We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like sentence-transformers/all-MiniLM-L6-v2 is not the path to a directory containing a file named config.json.
2024-10-21 12:52:51,150 ERROR Failed to load VIA pipeline - CA-RAG setup failed. Check if NVIDIA_API_KEY set correctly and/or LLM configuration in CA-RAG config is valid.
Killed process with PID 56


Could you describe your procedure step by step from the beginning?

I tried running the VIA Summarization Warehouse use case with the VITA 2.0 model as the VLM, but our network was not allowing us to download the model, so we downloaded it manually and kept it inside the folder /home/VIA/nvidia_tao_vita_2.0.1_vila-llama-3-8b-lita/.
We then tried running again, and it failed with the error below:
2024-10-21 12:52:51,150 ERROR We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like sentence-transformers/all-MiniLM-L6-v2 is not the path to a directory containing a file named config.json.
2024-10-21 12:52:51,150 ERROR Failed to load VIA pipeline - CA-RAG setup failed. Check if NVIDIA_API_KEY set correctly and/or LLM configuration in CA-RAG config is valid.
Killed process with PID 56

From the above error, we suspect that our network is also blocking the Hugging Face model, so we downloaded that model as well and kept it inside /home/VIA/all-MiniLM-L6-v2, but it still gives the same error.

My NGC_MODEL_CACHE is /home/VIA.

Hi @rawnak.kumar,

When VIA launches, it expects to find the Hugging Face model at the following path inside the container:

/tmp/huggingface/hub/models--sentence-transformers--all-MiniLM-L6-v2

If you already have the model downloaded, then you can do the following:

  1. Find the location of the downloaded model and make a via-hf-cache folder to place it in, for example:
     /home/ubuntu/via-hf-cache/hub/models--sentence-transformers--all-MiniLM-L6-v2
  2. In the docker run command, mount this path to /tmp/huggingface by adding an additional volume:
     -v /home/ubuntu/via-hf-cache:/tmp/huggingface

When VIA is launched, it should now check the mounted folder for the model and skip the download.
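
For reference, here is a minimal sketch of what the launch looks like with the extra mount. $VIA_IMAGE is just a placeholder for the VIA container image/tag you already launch, and all of your existing -e and -v options should be kept as they are; the only addition is the /tmp/huggingface mount:

# Sketch only - substitute your actual VIA image and keep your other flags/mounts.
export VIA_HF_CACHE=/home/ubuntu/via-hf-cache
docker run -it --rm \
  --gpus all \
  -v "$VIA_HF_CACHE:/tmp/huggingface" \
  "$VIA_IMAGE"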

We tried the above folder structure but are still getting the same error:
2024-10-23 09:55:18,771 INFO Stopping VIA pipeline
2024-10-23 09:55:18,771 ERROR We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like sentence-transformers/all-MiniLM-L6-v2 is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
2024-10-23 09:55:18,771 ERROR Failed to load VIA pipeline - CA-RAG setup failed. Check if NVIDIA_API_KEY set correctly and/or LLM configuration in CA-RAG config is valid.
Killed process with PID 56

Do we need to download any other models? Can you share which Hugging Face model it is actually searching for?

Hi @rawnak.kumar,

It is trying to pull the following model:

sentence-transformers/all-MiniLM-L6-v2 · Hugging Face

Here is the tree output of the via-hf-cache folder. Your mounted via-hf-cache folder will need to look the same as this.

.
└── hub
    ├── models--sentence-transformers--all-MiniLM-L6-v2
    │   ├── blobs
    │   │   ├── 53aa51172d142c89d9012cce15ae4d6cc0ca6895895114379cacb4fab128d9db
    │   │   ├── 59d594003bf59880a884c574bf88ef7555bb0202
    │   │   ├── 72b987fd805cfa2b58c4c8c952b274a11bfd5a00
    │   │   ├── 8cfec92309f5626a223304af2423e332f6d31887
    │   │   ├── 952a9b81c0bfd99800fabf352f69c7ccd46c5e43
    │   │   ├── c79f2b6a0cea6f4b564fed1938984bace9d30ff0
    │   │   ├── cb202bfe2e3c98645018a6d12f182a434c9d3e02
    │   │   ├── d1514c3162bbe87b343f565fadc62e6c06f04f03
    │   │   ├── e7b0375001f109a6b8873d756ad4f7bbb15fbaa5
    │   │   ├── fb140275c155a9c7c5a3b3e0e77a9e839594a938
    │   │   └── fd1b291129c607e5d49799f87cb219b27f98acdf
    │   ├── refs
    │   │   └── main
    │   └── snapshots
    │       └── ea78891063587eb050ed4166b20062eaf978037c
    │           ├── 1_Pooling
    │           │   └── config.json -> ../../../blobs/d1514c3162bbe87b343f565fadc62e6c06f04f03
    │           ├── config.json -> ../../blobs/72b987fd805cfa2b58c4c8c952b274a11bfd5a00
    │           ├── config_sentence_transformers.json -> ../../blobs/fd1b291129c607e5d49799f87cb219b27f98acdf
    │           ├── model.safetensors -> ../../blobs/53aa51172d142c89d9012cce15ae4d6cc0ca6895895114379cacb4fab128d9db
    │           ├── modules.json -> ../../blobs/952a9b81c0bfd99800fabf352f69c7ccd46c5e43
    │           ├── README.md -> ../../blobs/8cfec92309f5626a223304af2423e332f6d31887
    │           ├── sentence_bert_config.json -> ../../blobs/59d594003bf59880a884c574bf88ef7555bb0202
    │           ├── special_tokens_map.json -> ../../blobs/e7b0375001f109a6b8873d756ad4f7bbb15fbaa5
    │           ├── tokenizer_config.json -> ../../blobs/c79f2b6a0cea6f4b564fed1938984bace9d30ff0
    │           ├── tokenizer.json -> ../../blobs/cb202bfe2e3c98645018a6d12f182a434c9d3e02
    │           └── vocab.txt -> ../../blobs/fb140275c155a9c7c5a3b3e0e77a9e839594a938
    └── version.txt

7 directories, 24 files
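
If it helps, one way to reproduce exactly this layout on a machine that does have internet access is to let the Hugging Face CLI populate the cache and then copy the folder to the offline host (this assumes the huggingface_hub package is installed; HF_HOME simply redirects the cache root):

# Run on a machine with internet access (pip install -U huggingface_hub)
export HF_HOME=/home/ubuntu/via-hf-cache    # the hub/ subfolder is created automatically
huggingface-cli download sentence-transformers/all-MiniLM-L6-v2
# Then copy /home/ubuntu/via-hf-cache to the offline host and mount it as described above.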

Could you share how you have your via-hf-cache folder structured?

Our file structure looks like this: we downloaded the files from Hugging Face and have kept them inside /home/VIA/via-hf-cache/hub/models--sentence-transformers--all-MiniLM-L6-v2.

Do we need any other files?

Hi @rawnak.kumar,

Can you run the tree command on your via-hf-cache folder and share the output? It needs to look the same as what I pasted in the previous comment.
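
For example, using the path from your earlier message:

tree /home/VIA/via-hf-cache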

We were able to solve the Hugging Face error. Now we are getting the error below; we also tried adding --privileged=true to the docker run command:

2024-10-25 09:48:53,051 INFO Stopping VIA pipeline
2024-10-25 09:48:53,052 ERROR Expecting value: line 1 column 1 (char 0)
2024-10-25 09:48:53,052 ERROR Failed to load VIA pipeline - CA-RAG setup failed. Check if NVIDIA_API_KEY set correctly and/or LLM configuration in CA-RAG config is valid.

As per our understanding, this is due to a network restriction at our organization, because of which the pipeline is not able to call the NVIDIA NIM API.

Is there any possibility of downloading the model offline and keeping it at the desired path? If so, please share the steps, the path, and the link to download the model.

You can refer to the link I attached before: Using Locally Deployed LLM NIM instead of NVIDIA Hosted LLM NIM. At the end of page 44, we have instructions on how to deploy locally.
All you need to do is deploy the NIM locally. The documentation explains how to configure the model before you start the docker command.
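
Once the local NIM container is up and has finished loading, a quick way to confirm it is serving before pointing VIA at it is to hit its OpenAI-compatible endpoints on port 8000 (the model name below assumes the llama3-8b-instruct NIM):

# List the models the local NIM is serving
curl -s http://0.0.0.0:8000/v1/models
# Send a minimal chat completion request
curl -s http://0.0.0.0:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "meta/llama3-8b-instruct", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 16}'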

We tried running the llama3-8b-instruct NIM container locally, but it is giving us the error below:

Exception: error sending request for url (https://authn.nvidia.com/token?scope=group/ngc:nvidia)

Can you please help us set up the llama3-8b-instruct NIM locally?

Did you follow our Guide launch-nvidia-nim-for-llms step by step?

Yes, we followed the procedure from the documentation step by step, but it gives us this error:
Exception: error sending request for url (https://authn.nvidia.com/token?scope=group/ngc:nvidia)
We are also not able to access the URL https://authn.nvidia.com/token?scope=group/ngc:nvidia from a personal network.

Can you please look into this on priority?

At which step specifically did the error occur?

When I am trying to load the LLM model locally using the docker command below:

export NGC_API_KEY=<PASTE_API_KEY_HERE>
export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p "$LOCAL_NIM_CACHE"
docker run -it --rm \
  --gpus all \
  --shm-size=16GB \
  -e NGC_API_KEY=$NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -u $(id -u) \
  -p 8000:8000 \
  nvcr.io/nim/meta/llama3-8b-instruct:1.0.3

This is most likely a problem with your NGC_API_KEY. Did you get this key by referring to the Guide in the link?
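
One quick way to check the key independently of the NIM container is to authenticate against nvcr.io with it (the username is literally $oauthtoken, and the key is passed as the password):

# If this login fails, the key itself (or the network path to NGC) is the problem.
echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin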