Hi,
I have downloaded the phi-2 model to local disk, and I tried to run NanoLLM chat using the local model path as follows:
python3 -m nano_llm.chat --api mlc \
--model /root/phi-2/ \
--quantization q4f16_ft
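For context, I would also like to load the same local folder from Python instead of the CLI. Below is a minimal sketch of what I assume the equivalent call would be, based on the NanoLLM.from_pretrained example in the docs; whether it accepts a local directory the same way as a HuggingFace repo name is exactly my question:

# Minimal sketch -- assumes NanoLLM.from_pretrained accepts a local directory
# path plus the same api/quantization options as the chat CLI (my assumption).
from nano_llm import NanoLLM

model = NanoLLM.from_pretrained(
    "/root/phi-2",             # local model directory instead of a HuggingFace repo name
    api="mlc",
    quantization="q4f16_ft",
)

response = model.generate("Hello, how are you?", max_new_tokens=32)

for token in response:         # generate() streams tokens by default
    print(token, end="", flush=True)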
I got the following error:
/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py:127: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
  warnings.warn(
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 187, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/lib/python3.10/runpy.py", line 110, in _get_module_details
    __import__(pkg_name)
  File "/opt/NanoLLM/nano_llm/__init__.py", line 2, in <module>
    from .nano_llm import NanoLLM
  File "/opt/NanoLLM/nano_llm/nano_llm.py", line 14, in <module>
    from .vision import CLIPVisionModel, MMProjector
  File "/opt/NanoLLM/nano_llm/vision/__init__.py", line 3, in <module>
    from .clip import CLIPVisionModel
  File "/opt/NanoLLM/nano_llm/vision/clip.py", line 2, in <module>
    from clip_trt import CLIPVisionModel
  File "/opt/clip_trt/clip_trt/__init__.py", line 2, in <module>
    from .text import CLIPTextModel
  File "/opt/clip_trt/clip_trt/text.py", line 10, in <module>
    import torch2trt
  File "/usr/local/lib/python3.10/dist-packages/torch2trt/__init__.py", line 1, in <module>
    from .torch2trt import *
  File "/usr/local/lib/python3.10/dist-packages/torch2trt/torch2trt.py", line 2, in <module>
    import tensorrt as trt
  File "/usr/lib/python3.10/dist-packages/tensorrt/__init__.py", line 67, in <module>
    from .tensorrt import *
ImportError: /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so: file too short
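In case it matters, the library named in the last line can be inspected like this to see whether it is actually truncated; the cause suggested in the comment is only my guess, not something I have confirmed:

# Quick sketch: print the size of the shared library the import complains about.
# A size of 0 bytes would explain the "file too short" error (e.g. if the host
# driver libraries were not mounted into the container -- my assumption).
import os

lib = "/usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so"
print(lib, os.path.getsize(lib), "bytes")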
So how do I use a local model path when running NanoLLM? Thanks!