How to run the default LLaVA demo?

Hi NVIDIA team,
I am using an AGX Orin with JetPack 5.1.1.
I want to run the LLaVA demo with text-generation-webui, but I always get the error
"No module named 'exllama'".

I have already checked that the model is correct.

How can I fix this issue?
Thanks.

The reference wiki is:

The command is:
./run.sh $(./autotag text-generation-webui) /bin/bash -c \
  "cd /opt/text-generation-webui && python3 server.py --listen \
    --model-dir=/data/models/text-generation-webui \
    --model=liuhaotian_llava-llama-2-13b-chat-lightning-gptq \
    --multimodal-pipeline=llava-llama-2-13b \
    --extensions=multimodal \
    --chat \
    --verbose"

The full error log is:
lcfc@lcfc-desktop:~/jetson-containers$ ./run.sh $(./autotag text-generation-webui) /bin/bash -c \
  "cd /opt/text-generation-webui && python3 server.py --listen \
    --model-dir=/data/models/text-generation-webui \
    --model=liuhaotian_llava-llama-2-13b-chat-lightning-gptq \
    --multimodal-pipeline=llava-llama-2-13b \
    --extensions=multimodal \
    --chat \
    --verbose"
Namespace(disable=[''], output='/tmp/autotag', packages=['text-generation-webui'], prefer=['local', 'registry', 'build'], quiet=False, user='dustynv', verbose=False)
-- L4T_VERSION=35.3.1 JETPACK_VERSION=5.1.1 CUDA_VERSION=11.4.315
-- Finding compatible container image for ['text-generation-webui']
dustynv/text-generation-webui:r35.2.1
localuser:root being added to access control list

+ sudo docker run --runtime nvidia -it --rm --network host --volume /tmp/argus_socket:/tmp/argus_socket --volume /etc/enctune.conf:/etc/enctune.conf --volume /etc/nv_tegra_release:/etc/nv_tegra_release --volume /tmp/nv_jetson_model:/tmp/nv_jetson_model --volume /home/lcfc/jetson-containers/data:/data --device /dev/snd --device /dev/bus/usb -e DISPLAY=:1 -v /tmp/.X11-unix/:/tmp/.X11-unix -v /tmp/.docker.xauth:/tmp/.docker.xauth -e XAUTHORITY=/tmp/.docker.xauth dustynv/text-generation-webui:r35.2.1 /bin/bash -c 'cd /opt/text-generation-webui && python3 server.py --listen --model-dir=/data/models/text-generation-webui --model=liuhaotian_llava-llama-2-13b-chat-lightning-gptq --multimodal-pipeline=llava-llama-2-13b --extensions=multimodal --chat --verbose'
2023-11-23 08:15:11 WARNING:The --chat flag has been deprecated and will be removed soon. Please remove that flag.
2023-11-23 08:15:11 WARNING:
You are potentially exposing the web UI to the entire internet without any access password.
You can create one with the "--gradio-auth" flag like this:

--gradio-auth username:password

Make sure to replace username:password with your own.
bin /usr/local/lib/python3.8/dist-packages/bitsandbytes/libbitsandbytes_cuda114.so
2023-11-23 08:15:14 INFO:Loading settings from settings.json...
2023-11-23 08:15:14 INFO:Loading liuhaotian_llava-llama-2-13b-chat-lightning-gptq...
2023-11-23 08:15:14 WARNING:Exllama module failed to load. Will attempt to load from repositories.
2023-11-23 08:15:14 ERROR:Could not find repositories/exllama/. Make sure that exllama is cloned inside repositories/ and is up to date.
Traceback (most recent call last):
  File "/opt/text-generation-webui/modules/exllama_hf.py", line 14, in <module>
    from exllama.model import ExLlama, ExLlamaCache, ExLlamaConfig
ModuleNotFoundError: No module named 'exllama'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "server.py", line 223, in <module>
    shared.model, shared.tokenizer = load_model(model_name)
  File "/opt/text-generation-webui/modules/models.py", line 85, in load_model
    output = load_func_map[loader](model_name)
  File "/opt/text-generation-webui/modules/models.py", line 348, in ExLlama_HF_loader
    from modules.exllama_hf import ExllamaHF
  File "/opt/text-generation-webui/modules/exllama_hf.py", line 21, in <module>
    from model import ExLlama, ExLlamaCache, ExLlamaConfig
ModuleNotFoundError: No module named 'model'

Hi @sam_yangli, sorry about that - I believe I have identified the cause of the issue. I will try it and rebuild the text-generation-webui containers here first.
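
In the meantime, the last ERROR line in the log hints at a possible manual workaround: server.py falls back to a local clone of exllama under repositories/. A rough, untested sketch of that workaround inside the container (assuming it has git and network access, and that turboderp's exllama v1 repo is the one the loader expects) would be:

# untested workaround sketch - run inside the container before launching server.py
cd /opt/text-generation-webui
mkdir -p repositories
git clone https://github.com/turboderp/exllama repositories/exllama

Rebuilding the containers is the cleaner fix, though.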

OK, thanks for your reply.
Please let me know once you have fixed this issue.

OK, I pushed these updated containers with exllama v1 included again.

dustynv/text-generation-webui:r35.2.1
dustynv/text-generation-webui:r35.3.1
dustynv/text-generation-webui:r35.4.1

Please run sudo docker pull dustynv/text-generation-webui:r35.2.1 and try running the LLaVA example again.
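
For reference, the full sequence on this setup would look roughly like the following (a sketch assuming the r35.2.1 tag that autotag picked above; substitute the tag matching your L4T version, e.g. r35.3.1 on JetPack 5.1.1):

# pull the rebuilt container image
sudo docker pull dustynv/text-generation-webui:r35.2.1

# re-run the LLaVA example (same command as above; --chat dropped since the
# log warns that flag is deprecated)
cd ~/jetson-containers
./run.sh $(./autotag text-generation-webui) /bin/bash -c \
  "cd /opt/text-generation-webui && python3 server.py --listen \
    --model-dir=/data/models/text-generation-webui \
    --model=liuhaotian_llava-llama-2-13b-chat-lightning-gptq \
    --multimodal-pipeline=llava-llama-2-13b \
    --extensions=multimodal \
    --verbose"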
