@dusty_nv
Thank you for your advice.
I ran the Optimized Multimodal Pipeline with local_llm, and the following error was output.
Could you please tell me how to resolve it?
I have a Hugging Face account and an access token, but I don’t know at which step I should set it…
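For reference, this is my rough understanding of where the token might go — a hedged sketch only. The variable names `HF_TOKEN`/`HUGGINGFACE_TOKEN` and the idea that the container picks them up are my assumptions, and `<HF_ACCESS_TOKEN>` is a placeholder, not a real token:

```shell
# Hypothetical sketch of supplying a Hugging Face token (placeholder value).
# "<HF_ACCESS_TOKEN>" stands in for an actual access token.

# Option A: export the token so tools inside the container can read it.
# (HF_TOKEN is read by recent huggingface_hub; HUGGINGFACE_TOKEN is the name
# I assume jetson-containers forwards -- both names are my assumption here.)
export HF_TOKEN="<HF_ACCESS_TOKEN>"
export HUGGINGFACE_TOKEN="$HF_TOKEN"

# Option B: log in interactively inside the container instead:
#   huggingface-cli login

echo "HF_TOKEN=${HF_TOKEN:+(set)} HUGGINGFACE_TOKEN=${HUGGINGFACE_TOKEN:+(set)}"
```

Is something like this the right approach, or does the token need to be passed to `run.sh` / `docker run` directly?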
echo@ubuntu:~/Desktop/work/jetson-containers$ ./run.sh $(./autotag local_llm) python3 -m local_llm --api=mlc --model liuhaotian/llava-v1.5-13b
Namespace(disable=[''], output='/tmp/autotag', packages=['local_llm'], prefer=['local', 'registry', 'build'], quiet=False, user='dustynv', verbose=False)
-- L4T_VERSION=35.3.1 JETPACK_VERSION=5.1.1 CUDA_VERSION=11.4.315
-- Finding compatible container image for ['local_llm']
Found compatible container dustynv/local_llm:r35.3.1 (2024-02-22, 8.8GB) - would you like to pull it? [Y/n] y
dustynv/local_llm:r35.3.1
+ sudo docker run --runtime nvidia -it --rm --network host --volume /tmp/argus_socket:/tmp/argus_socket --volume /etc/enctune.conf:/etc/enctune.conf --volume /etc/nv_tegra_release:/etc/nv_tegra_release --volume /tmp/nv_jetson_model:/tmp/nv_jetson_model --volume /var/run/dbus:/var/run/dbus --volume /var/run/avahi-daemon/socket:/var/run/avahi-daemon/socket --volume /home/echo/Desktop/work/jetson-containers/data:/data --device /dev/snd --device /dev/bus/usb dustynv/local_llm:r35.3.1 python3 -m local_llm --api=mlc --model liuhaotian/llava-v1.5-13b
Unable to find image 'dustynv/local_llm:r35.3.1' locally
r35.3.1: Pulling from dustynv/local_llm
Digest: sha256:b4de3266c45d2e4c69d122c91502dde0a185810711803bddc0b6c048c828a6f1
Status: Downloaded newer image for dustynv/local_llm:r35.3.1
.gitattributes: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.52k/1.52k [00:00<00:00, 636kB/s]
generation_config.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 154/154 [00:00<00:00, 67.7kB/s]
config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.16k/1.16k [00:00<00:00, 73.0kB/s]
README.md: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.36k/1.36k [00:00<00:00, 656kB/s]
pytorch_model.bin.index.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 33.7k/33.7k [00:00<00:00, 2.48MB/s]
special_tokens_map.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 438/438 [00:00<00:00, 435kB/s]
tokenizer_config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 749/749 [00:00<00:00, 812kB/s]
tokenizer.model: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 500k/500k [00:00<00:00, 888kB/s]
mm_projector.bin: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 62.9M/62.9M [00:51<00:00, 1.23MB/s]
pytorch_model-00003-of-00003.bin: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6.24G/6.24G [43:57<00:00, 2.37MB/s]
pytorch_model-00002-of-00003.bin: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9.90G/9.90G [46:32<00:00, 3.55MB/s]
pytorch_model-00001-of-00003.bin: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9.95G/9.95G [54:23<00:00, 3.05MB/s]
Fetching 12 files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [54:24<00:00, 272.06s/it]
05:01:43 | INFO | loading /data/models/huggingface/models--liuhaotian--llava-v1.5-13b/snapshots/d64eb781be6876a5facc160ab1899281f59ef684 with MLC | 6.29G/9.95G [46:30<25:06, 2.43MB/s]
globbing /data/models/mlc/dist/models/llava-v1.5-13b/*.safetensors██████████████████████████████████████████████████████████████████████████████████| 9.95G/9.95G [54:23<00:00, 8.19MB/s]
glob []
05:01:44 | INFO | running MLC quantization:
python3 -m mlc_llm.build --model /data/models/mlc/dist/models/llava-v1.5-13b --quantization q4f16_ft --target cuda --use-cuda-graph --use-flash-attn-mqa --sep-embed --max-seq-len 4096 --artifact-path /data/models/mlc/dist
Using path "/data/models/mlc/dist/models/llava-v1.5-13b" for model "llava-v1.5-13b"
Target configured: cuda -keys=cuda,gpu -arch=sm_87 -max_num_threads=1024 -max_shared_memory_per_block=49152 -max_threads_per_block=1024 -registers_per_block=65536 -thread_warp_size=32
Automatically using target for weight quantization: cuda -keys=cuda,gpu -arch=sm_87 -max_num_threads=1024 -max_shared_memory_per_block=49152 -max_threads_per_block=1024 -registers_per_block=65536 -thread_warp_size=32
Get old param: 0%| | 0/245 [00:00<?, ?tensors/sStart computing and quantizing weights... This may take a while. | 0/407 [00:00<?, ?tensors/s]
Get old param: 99%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍ | 242/245 [02:03<00:01, 2.19tensors/sFinish computing and quantizing weights.███████████████████████████████████████████████████████████████████████████████████████████████████████████▋| 406/407 [02:03<00:00, 7.24tensors/s]
Total param size: 6.085580825805664 GB
Start storing to cache /data/models/mlc/dist/llava-v1.5-13b-q4f16_ft/params
[0407/0407] saving param_406
All finished, 143 total shards committed, record saved to /data/models/mlc/dist/llava-v1.5-13b-q4f16_ft/params/ndarray-cache.json██████████████████| 407/407 [02:20<00:00, 7.24tensors/s]
Attempting to convert `tokenizer.model` to `tokenizer.json`.
Succesfully converted `tokenizer.model` to: /data/models/mlc/dist/llava-v1.5-13b-q4f16_ft/params/tokenizer.json
Finish exporting chat config to /data/models/mlc/dist/llava-v1.5-13b-q4f16_ft/params/mlc-chat-config.json
Save a cached module to /data/models/mlc/dist/llava-v1.5-13b-q4f16_ft/mod_cache_before_build.pkl.
Finish exporting to /data/models/mlc/dist/llava-v1.5-13b-q4f16_ft/llava-v1.5-13b-q4f16_ft-cuda.so
05:06:57 | INFO | device=cuda(0), name=Orin, compute=8.7, max_clocks=1300000, multiprocessors=16, max_thread_dims=[1024, 1024, 64], api_version=11040, driver_version=None
05:06:57 | INFO | loading llava-v1.5-13b from /data/models/mlc/dist/llava-v1.5-13b-q4f16_ft/llava-v1.5-13b-q4f16_ft-cuda.so
05:07:04 | INFO | loading openai/clip-vit-large-patch14-336
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 665, in urlopen
httplib_response = self._make_request(
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 421, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 416, in _make_request
httplib_response = conn.getresponse()
File "/usr/lib/python3.8/http/client.py", line 1348, in getresponse
response.begin()
File "/usr/lib/python3.8/http/client.py", line 316, in begin
version, status, reason = self._read_status()
File "/usr/lib/python3.8/http/client.py", line 277, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/usr/lib/python3.8/socket.py", line 669, in readinto
return self._sock.recv_into(b)
File "/usr/lib/python3.8/ssl.py", line 1241, in recv_into
return self.read(nbytes, buffer)
File "/usr/lib/python3.8/ssl.py", line 1099, in read
return self._sslobj.read(len, buffer)
ConnectionResetError: [Errno 104] Connection reset by peer
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 439, in send
resp = conn.urlopen(
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 719, in urlopen
retries = retries.increment(
File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 400, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/usr/lib/python3/dist-packages/six.py", line 702, in reraise
raise value.with_traceback(tb)
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 665, in urlopen
httplib_response = self._make_request(
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 421, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 416, in _make_request
httplib_response = conn.getresponse()
File "/usr/lib/python3.8/http/client.py", line 1348, in getresponse
response.begin()
File "/usr/lib/python3.8/http/client.py", line 316, in begin
version, status, reason = self._read_status()
File "/usr/lib/python3.8/http/client.py", line 277, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/usr/lib/python3.8/socket.py", line 669, in readinto
return self._sock.recv_into(b)
File "/usr/lib/python3.8/ssl.py", line 1241, in recv_into
return self.read(nbytes, buffer)
File "/usr/lib/python3.8/ssl.py", line 1099, in read
return self._sslobj.read(len, buffer)
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/huggingface_hub/file_download.py", line 1238, in hf_hub_download
metadata = get_hf_file_metadata(
File "/usr/local/lib/python3.8/dist-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/huggingface_hub/file_download.py", line 1631, in get_hf_file_metadata
r = _request_wrapper(
File "/usr/local/lib/python3.8/dist-packages/huggingface_hub/file_download.py", line 385, in _request_wrapper
response = _request_wrapper(
File "/usr/local/lib/python3.8/dist-packages/huggingface_hub/file_download.py", line 408, in _request_wrapper
response = get_session().request(method=method, url=url, **params)
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 535, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 648, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/huggingface_hub/utils/_http.py", line 67, in send
return super().send(request, *args, **kwargs)
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: (ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer')), '(Request ID: 6d08347a-59cb-44f3-bc49-1fb01ff3a0f6)')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/transformers/utils/hub.py", line 430, in cached_file
resolved_file = hf_hub_download(
File "/usr/local/lib/python3.8/dist-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/huggingface_hub/file_download.py", line 1371, in hf_hub_download
raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/opt/local_llm/local_llm/__main__.py", line 22, in <module>
model = LocalLM.from_pretrained(
File "/opt/local_llm/local_llm/local_llm.py", line 80, in from_pretrained
model.init_vision()
File "/opt/local_llm/local_llm/local_llm.py", line 181, in init_vision
self.vision = CLIPImageEmbedding.from_pretrained(
File "/opt/local_llm/local_llm/vision/clip_hf.py", line 24, in from_pretrained
inst = CLIPImageEmbedding(model, dtype=dtype, **kwargs)
File "/opt/local_llm/local_llm/vision/clip_hf.py", line 42, in __init__
self.preprocessor = CLIPImageProcessor.from_pretrained(model, torch_dtype=self.dtype)#.to(self.device)
File "/usr/local/lib/python3.8/dist-packages/transformers/image_processing_utils.py", line 203, in from_pretrained
image_processor_dict, kwargs = cls.get_image_processor_dict(pretrained_model_name_or_path, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/transformers/image_processing_utils.py", line 332, in get_image_processor_dict
resolved_image_processor_file = cached_file(
File "/usr/local/lib/python3.8/dist-packages/transformers/utils/hub.py", line 470, in cached_file
raise EnvironmentError(
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like openai/clip-vit-large-patch14-336 is not the path to a directory containing a file named preprocessor_config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
echo@ubuntu:~/Desktop/work/jetson-containers$