Hi, I am trying out VIA on my local machine. However, after a few retries on downloading the VITA model, the download keeps failing:
Getting files to download...
⠋ ━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━ • 7.3/16.6 GiB • Remaining: 0:13:30 • 12.3 MB/s • Elapsed: 0:55:44 • Total: 15 - Completed: 13 - Failed: 2
------------------------------------------------------------
Download status: FAILED
Downloaded local path model: /tmp/tmpd6oywkr0/vita_v2.0.1
Total files downloaded: 13
Total transferred: 7.34 GB
Started at: 2024-08-20 04:04:48
Completed at: 2024-08-20 05:00:32
Duration taken: 55m 44s
------------------------------------------------------------
2024-08-20 05:00:36,637 INFO Downloaded model to /root/.via/ngc_model_cache/nvidia_tao_vita_2.0.1_vila-llama-3-8b-lita
2024-08-20 05:00:36,639 INFO TRT-LLM Engine not found. Generating engines ...
Selecting FP16 mode
Converting Checkpoint ...
[TensorRT-LLM] TensorRT-LLM version: 0.10.0.dev2024043000
0.10.0.dev2024043000
Loading checkpoint shards: 0%| | 0/4 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/opt/nvidia/via/via-engine/models/vita20/trt_helper/convert_checkpoint.py", line 447, in <module>
main()
File "/opt/nvidia/via/via-engine/models/vita20/trt_helper/convert_checkpoint.py", line 439, in main
convert_and_save_hf(args)
File "/opt/nvidia/via/via-engine/models/vita20/trt_helper/convert_checkpoint.py", line 356, in convert_and_save_hf
hf_model = preload_model(model_dir) if not args.load_by_shard else None
File "/opt/nvidia/via/via-engine/models/vita20/trt_helper/convert_checkpoint.py", line 317, in preload_model
model = AutoModelForCausalLM.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 561, in from_pretrained
return model_class.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 3502, in from_pretrained
) = cls._load_pretrained_model(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 3903, in _load_pretrained_model
state_dict = load_state_dict(shard_file)
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 505, in load_state_dict
with safe_open(checkpoint_file, framework="pt") as f:
FileNotFoundError: No such file or directory: "/root/.via/ngc_model_cache/nvidia_tao_vita_2.0.1_vila-llama-3-8b-lita/model-00001-of-00004.safetensors"
ERROR: Failed to convert checkpoint
2024-08-20 05:00:40,831 ERROR Failed to load VIA pipeline - Failed to generate TRT-LLM engine
Killed process with PID 50
Besides, I am trying the OpenAI API method, but it requires GPT-4o. Do we need a subscription to GPT-4o to run VIA?
Could you describe your operating steps in detail? Have you run the export NGC_MODEL_CACHE=</SOME/DIR/ON/HOST> command? You need to replace </SOME/DIR/ON/HOST> with a real directory on your host.
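For example, if /home/User/via_model_cache is a real directory on your host (the path here is just an illustration), you would run something like:

    export NGC_MODEL_CACHE=/home/User/via_model_cache

before starting the container, so the downloaded model files persist across runs instead of being written to a temporary directory.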
I am running VIA with the VITA model. However, every time I run the application it starts downloading the VITA model and fails. So I went to NGC to download and extract the model myself from here using wget.
After downloading and extracting the model into the /home/User/Desktop/VIA path, may I know what command I should use to run the Docker container? Should I replace
export MODEL_PATH="ngc:nvidia/tao/vita:2.0.1"
to
export MODEL_PATH="/home/User/Desktop/VIA"
------------------------------------------------------------
Download status: FAILED
Downloaded local path model: /tmp/tmp95v9pgqw/vita_v2.0.1
Total files downloaded: 13
Total transferred: 7.42 GB
Started at: 2024-08-21 06:09:01
Completed at: 2024-08-21 07:11:14
Duration taken: 1h 2m 12s
------------------------------------------------------------
2024-08-21 07:11:18,176 INFO Downloaded model to /root/.via/ngc_model_cache/nvidia_tao_vita_2.0.1_vila-llama-3-8b-lita
2024-08-21 07:11:18,177 INFO TRT-LLM Engine not found. Generating engines ...
Selecting FP16 mode
Converting Checkpoint ...
[TensorRT-LLM] TensorRT-LLM version: 0.10.0.dev2024043000
0.10.0.dev2024043000
Loading checkpoint shards: 0%| | 0/4 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/opt/nvidia/via/via-engine/models/vita20/trt_helper/convert_checkpoint.py", line 447, in <module>
main()
File "/opt/nvidia/via/via-engine/models/vita20/trt_helper/convert_checkpoint.py", line 439, in main
convert_and_save_hf(args)
File "/opt/nvidia/via/via-engine/models/vita20/trt_helper/convert_checkpoint.py", line 356, in convert_and_save_hf
hf_model = preload_model(model_dir) if not args.load_by_shard else None
File "/opt/nvidia/via/via-engine/models/vita20/trt_helper/convert_checkpoint.py", line 317, in preload_model
model = AutoModelForCausalLM.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 561, in from_pretrained
return model_class.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 3502, in from_pretrained
) = cls._load_pretrained_model(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 3903, in _load_pretrained_model
state_dict = load_state_dict(shard_file)
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 505, in load_state_dict
with safe_open(checkpoint_file, framework="pt") as f:
FileNotFoundError: No such file or directory: "/root/.via/ngc_model_cache/nvidia_tao_vita_2.0.1_vila-llama-3-8b-lita/model-00001-of-00004.safetensors"
ERROR: Failed to convert checkpoint
2024-08-21 07:11:22,407 ERROR Failed to load VIA pipeline - Failed to generate TRT-LLM engine
Have you got the NVIDIA API key and the NGC API key by referring to our guide?
If you downloaded the model yourself, could you try creating a nvidia_tao_vita_2.0.1_vila-llama-3-8b-lita directory inside your /home/User/Desktop/VIA directory and putting the model files inside it? Then you can try running it again.
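As a rough sketch, assuming the archive was extracted directly into /home/User/Desktop/VIA (the paths and globs below are illustrative; the expected directory name comes from the "Downloaded model to ..." line in your log):

    # create the directory name the pipeline expects
    mkdir -p /home/User/Desktop/VIA/nvidia_tao_vita_2.0.1_vila-llama-3-8b-lita
    # move the extracted model files (the model-0000*-of-00004.safetensors shards
    # plus any accompanying config/tokenizer files) into the new directory
    mv /home/User/Desktop/VIA/model-0000*-of-00004.safetensors \
       /home/User/Desktop/VIA/nvidia_tao_vita_2.0.1_vila-llama-3-8b-lita/
    # then point the model cache at the parent directory when starting the container
    export NGC_MODEL_CACHE=/home/User/Desktop/VIA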
Nope, it is my first time running it on the early access. Do you need my API key to test, or does the API key on your side not work either? Let me know if you need it so I can email it to you.
Hey Jason,
Would you be able to download and locally deploy the Llama3-8b NIM? There is a section on downloading and self-hosting NIMs in the VIA DP user guide that details the steps to do this (pg. 44). Please let us know if you have any problems.
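In case it helps, a local NIM deployment generally looks roughly like the sketch below; the image name and tag are my assumptions here, so please follow the exact commands in the user guide:

    # authenticate against the NVIDIA container registry with your NGC API key
    echo "$NGC_API_KEY" | docker login nvcr.io -u '$oauthtoken' --password-stdin

    # pull and run the Llama3-8b NIM (image name/tag illustrative),
    # exposing its API on port 8000
    docker run -it --rm --gpus all \
      -e NGC_API_KEY="$NGC_API_KEY" \
      -p 8000:8000 \
      nvcr.io/nim/meta/llama3-8b-instruct:latest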
There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.