How to run the default LLaVA demo?

Hi NVIDIA team,
I am using an AGX Orin with JetPack 5.1.1.
I want to run the LLaVA demo with text-generation-webui, but I always get the error
"No module named 'exllama'".

I have already checked that the model is correct.

How can I fix this issue?
Thanks.

The reference wiki is:

The command is:
./run.sh $(./autotag text-generation-webui) /bin/bash -c \
  "cd /opt/text-generation-webui && python3 server.py --listen \
    --model-dir=/data/models/text-generation-webui \
    --model=liuhaotian_llava-llama-2-13b-chat-lightning-gptq \
    --multimodal-pipeline=llava-llama-2-13b \
    --extensions=multimodal \
    --chat \
    --verbose"

The full error log is:
lcfc@lcfc-desktop:~/jetson-containers$ ./run.sh $(./autotag text-generation-webui) /bin/bash -c \
  "cd /opt/text-generation-webui && python3 server.py --listen \
    --model-dir=/data/models/text-generation-webui \
    --model=liuhaotian_llava-llama-2-13b-chat-lightning-gptq \
    --multimodal-pipeline=llava-llama-2-13b \
    --extensions=multimodal \
    --chat \
    --verbose"
Namespace(disable=[''], output='/tmp/autotag', packages=['text-generation-webui'], prefer=['local', 'registry', 'build'], quiet=False, user='dustynv', verbose=False)
-- L4T_VERSION=35.3.1 JETPACK_VERSION=5.1.1 CUDA_VERSION=11.4.315
-- Finding compatible container image for ['text-generation-webui']
dustynv/text-generation-webui:r35.2.1
localuser:root being added to access control list

+ sudo docker run --runtime nvidia -it --rm --network host --volume /tmp/argus_socket:/tmp/argus_socket --volume /etc/enctune.conf:/etc/enctune.conf --volume /etc/nv_tegra_release:/etc/nv_tegra_release --volume /tmp/nv_jetson_model:/tmp/nv_jetson_model --volume /home/lcfc/jetson-containers/data:/data --device /dev/snd --device /dev/bus/usb -e DISPLAY=:1 -v /tmp/.X11-unix/:/tmp/.X11-unix -v /tmp/.docker.xauth:/tmp/.docker.xauth -e XAUTHORITY=/tmp/.docker.xauth dustynv/text-generation-webui:r35.2.1 /bin/bash -c 'cd /opt/text-generation-webui && python3 server.py --listen --model-dir=/data/models/text-generation-webui --model=liuhaotian_llava-llama-2-13b-chat-lightning-gptq --multimodal-pipeline=llava-llama-2-13b --extensions=multimodal --chat --verbose'
2023-11-23 08:15:11 WARNING:The --chat flag has been deprecated and will be removed soon. Please remove that flag.
2023-11-23 08:15:11 WARNING:
You are potentially exposing the web UI to the entire internet without any access password.
You can create one with the "--gradio-auth" flag like this:

--gradio-auth username:password

Make sure to replace username:password with your own.
bin /usr/local/lib/python3.8/dist-packages/bitsandbytes/libbitsandbytes_cuda114.so
2023-11-23 08:15:14 INFO:Loading settings from settings.json...
2023-11-23 08:15:14 INFO:Loading liuhaotian_llava-llama-2-13b-chat-lightning-gptq...
2023-11-23 08:15:14 WARNING:Exllama module failed to load. Will attempt to load from repositories.
2023-11-23 08:15:14 ERROR:Could not find repositories/exllama/. Make sure that exllama is cloned inside repositories/ and is up to date.
Traceback (most recent call last):
  File "/opt/text-generation-webui/modules/exllama_hf.py", line 14, in <module>
    from exllama.model import ExLlama, ExLlamaCache, ExLlamaConfig
ModuleNotFoundError: No module named 'exllama'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "server.py", line 223, in <module>
    shared.model, shared.tokenizer = load_model(model_name)
  File "/opt/text-generation-webui/modules/models.py", line 85, in load_model
    output = load_func_map[loader](model_name)
  File "/opt/text-generation-webui/modules/models.py", line 348, in ExLlama_HF_loader
    from modules.exllama_hf import ExllamaHF
  File "/opt/text-generation-webui/modules/exllama_hf.py", line 21, in <module>
    from model import ExLlama, ExLlamaCache, ExLlamaConfig
ModuleNotFoundError: No module named 'model'

Hi @sam_yangli, sorry about that - I believe I have identified the cause of the issue. I will try it and rebuild the text-generation-webui containers here first.
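
In the meantime, the last ERROR line in the log hints at a possible manual workaround: server.py falls back to a local clone of exllama under repositories/. A rough, untested sketch of that workaround inside the container (assuming it has git and network access, and that turboderp's exllama v1 repo is the one the loader expects) would be:

# untested workaround sketch - run inside the container before launching server.py
cd /opt/text-generation-webui
mkdir -p repositories
git clone https://github.com/turboderp/exllama repositories/exllama

Rebuilding the containers is the cleaner fix, though.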

OK, thanks for your reply.
Please let me know once you have fixed this issue.

OK, I pushed these updated containers with exllama v1 included again.

dustynv/text-generation-webui:r35.2.1
dustynv/text-generation-webui:r35.3.1
dustynv/text-generation-webui:r35.4.1

Please run sudo docker pull dustynv/text-generation-webui:r35.2.1 and try running the LLaVA example again.
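
For reference, the full sequence on this setup would look roughly like the following (a sketch assuming the r35.2.1 tag that autotag picked above; substitute the tag matching your L4T version, e.g. r35.3.1 on JetPack 5.1.1):

# pull the rebuilt container image
sudo docker pull dustynv/text-generation-webui:r35.2.1

# re-run the LLaVA example (same command as above; --chat dropped since the
# log warns that flag is deprecated)
cd ~/jetson-containers
./run.sh $(./autotag text-generation-webui) /bin/bash -c \
  "cd /opt/text-generation-webui && python3 server.py --listen \
    --model-dir=/data/models/text-generation-webui \
    --model=liuhaotian_llava-llama-2-13b-chat-lightning-gptq \
    --multimodal-pipeline=llava-llama-2-13b \
    --extensions=multimodal \
    --verbose"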
