Trouble running Llamaspeak on AGX Orin 64GB

System and Settings
Device: Jetson AGX Orin Developer Kit
SDK: JetPack 5.1.2 [L4T 35.4.1]

Steps to reproduce:

  1. Follow the llamaspeak tutorial (NVIDIA Jetson AI Lab)
  2. As part of the tutorial, run the following command:

jetson-containers run --env HUGGINGFACE_TOKEN= \
  $(autotag nano_llm) \
  python3 -m nano_llm.agents.web_chat --api=mlc \
    --model meta-llama/Meta-Llama-3-8B-Instruct \
    --asr=riva --tts=piper

Expected Outcome: “llamaspeak” enabled with text LLM and ASR/TTS

Actual Outcome:
The system gets stuck at the step below, and I have to press Ctrl+C to get back to the prompt:

docker run --runtime nvidia -it --rm --network host --volume /tmp/argus_socket:/tmp/argus_socket --volume /etc/enctune.conf:/etc/enctune.conf --volume /etc/nv_tegra_release:/etc/nv_tegra_release --volume /tmp/nv_jetson_model:/tmp/nv_jetson_model --volume /var/run/dbus:/var/run/dbus --volume /var/run/avahi-daemon/socket:/var/run/avahi-daemon/socket --volume /var/run/docker.sock:/var/run/docker.sock --volume /ssd/jetson-containers/data:/data --device /dev/snd --device /dev/bus/usb -e DISPLAY=:0 -v /tmp/.X11-unix/:/tmp/.X11-unix -v /tmp/.docker.xauth:/tmp/.docker.xauth -e XAUTHORITY=/tmp/.docker.xauth --env HUGGINGFACE_TOKEN=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 dustynv/nano_llm:r35.4.1 python3 -m nano_llm.agents.web_chat --api=mlc --model meta-llama/Meta-Llama-3-8B-Instruct --asr=riva --tts=piper
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.18) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
/usr/local/lib/python3.8/dist-packages/transformers/utils/hub.py:124: FutureWarning: Using TRANSFORMERS_CACHE is deprecated and will be removed in v5 of Transformers. Use HF_HOME instead.
warnings.warn(
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.18) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
/usr/local/lib/python3.8/dist-packages/transformers/utils/hub.py:124: FutureWarning: Using TRANSFORMERS_CACHE is deprecated and will be removed in v5 of Transformers. Use HF_HOME instead.
warnings.warn(
Process Process-1:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/opt/NanoLLM/nano_llm/plugins/process_proxy.py", line 103, in run_process
from cuda.cudart import (
ImportError: cannot import name 'cudaInitDevice' from 'cuda.cudart' (/usr/local/lib/python3.8/dist-packages/cuda/cudart.cpython-38-aarch64-linux-gnu.so)

Hi @augustmille, I just rebuilt the container for JetPack 5, can you try pulling dustynv/nano_llm:r35.4.1 again?
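
For reference, re-pulling the rebuilt image is just the standard Docker pull, shown here for the JetPack 5 tag from this thread:

docker pull dustynv/nano_llm:r35.4.1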

Hi @dusty_nv,
Thanks for your quick response.
I saw that most of the containers of interest required JetPack 6. Therefore, to make my life a bit easier, I updated from JetPack 5.1.2 [L4T 35.4.1] to JetPack 6.0-b52.
I'll post an update shortly on whether the llamaspeak tutorial (NVIDIA Jetson AI Lab) works.

Quick update (being verbose for anyone else encountering a similar issue):

  • Flashed JetPack 6 on my AGX Orin; however, I ran into too many compatibility issues that I simply did not have the bandwidth to resolve, so I went back to JetPack 5.1.2.
  • @dusty_nv With JetPack 5.1.2 and your newly rebuilt container, it now works. Thanks for your help.

My environment is also:

Device: Jetson AGX Orin Developer Kit
SDK: JetPack 5.1.2 [L4T 35.4.1]

and I used the newest Docker image dustynv/nano_llm:r35.4.1 (updated May 2, 2024 at 12:22 pm) to run the same command:

jetson-containers run --env HUGGINGFACE_TOKEN= \
  $(autotag nano_llm) \
  python3 -m nano_llm.agents.web_chat --api=mlc \
    --model meta-llama/Meta-Llama-3-8B-Instruct \
    --asr=riva --tts=piper

This works fine.

But when I use this container to run Live LLaVA (Live LLaVA - NVIDIA Jetson AI Lab), I run into trouble, and the error message looks the same:

-- L4T_VERSION=35.4.1 JETPACK_VERSION=5.1.2 CUDA_VERSION=11.4
-- Finding compatible container image for ['nano_llm']
dustynv/nano_llm:r35.4.1
localuser:root being added to access control list

docker run --runtime nvidia -it --rm --network host --volume /tmp/argus_socket:/tmp/argus_socket --volume /etc/enctune.conf:/etc/enctune.conf --volume /etc/nv_tegra_release:/etc/nv_tegra_release --volume /tmp/nv_jetson_model:/tmp/nv_jetson_model --volume /var/run/dbus:/var/run/dbus --volume /var/run/avahi-daemon/socket:/var/run/avahi-daemon/socket --volume /var/run/docker.sock:/var/run/docker.sock --volume /home/lq/ai/code/jetson-containers/data:/data --device /dev/snd --device /dev/bus/usb -e DISPLAY=:0 -v /tmp/.X11-unix/:/tmp/.X11-unix -v /tmp/.docker.xauth:/tmp/.docker.xauth -e XAUTHORITY=/tmp/.docker.xauth dustynv/nano_llm:r35.4.1 python3 -m nano_llm.agents.video_query --api=mlc --model Efficient-Large-Model/VILA1.5-3b --max-context-len 256 --max-new-tokens 32 --video-input /dev/video0 --video-output webrtc://@:8554/output
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.18) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
/usr/local/lib/python3.8/dist-packages/transformers/utils/hub.py:124: FutureWarning: Using TRANSFORMERS_CACHE is deprecated and will be removed in v5 of Transformers. Use HF_HOME instead.
warnings.warn(
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.18) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
/usr/local/lib/python3.8/dist-packages/transformers/utils/hub.py:124: FutureWarning: Using TRANSFORMERS_CACHE is deprecated and will be removed in v5 of Transformers. Use HF_HOME instead.
warnings.warn(
Process Process-1:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/opt/NanoLLM/nano_llm/plugins/process_proxy.py", line 103, in run_process
from cuda.cudart import (
ImportError: cannot import name 'cudaInitDevice' from 'cuda.cudart' (/usr/local/lib/python3.8/dist-packages/cuda/cudart.cpython-38-aarch64-linux-gnu.so)

@dusty_nv Can you help me figure out where the problem is? Do I need to open a new issue?

Hi @261142960, sorry for the delay - I would recommend using JetPack 6 for the latest (it's what the Live Llava page lists as the supported JetPack version for that tutorial).

On JetPack 5, you would need to disable the use of the ProcessProxy in the VideoQuery agent. If you want to edit the NanoLLM source, the easiest way is to clone it to your device and mount it into the container over the /opt/NanoLLM location - then the container will run your local source tree instead, and you can make edits from outside the container (see the sketch below).
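
A minimal sketch of that workflow - assuming the NanoLLM sources live at https://github.com/dusty-nv/NanoLLM and that jetson-containers forwards extra -v mount flags through to docker run; adjust paths to your setup:

# clone the NanoLLM sources to the device
git clone https://github.com/dusty-nv/NanoLLM

# mount the local tree over /opt/NanoLLM so the container runs it
# instead of the copy baked into the image
jetson-containers run \
  -v $(pwd)/NanoLLM:/opt/NanoLLM \
  $(autotag nano_llm) \
  python3 -m nano_llm.agents.video_query --api=mlc \
    --model Efficient-Large-Model/VILA1.5-3b \
    --video-input /dev/video0 --video-output webrtc://@:8554/output

Edits made in the local NanoLLM tree (for example, disabling the ProcessProxy in the VideoQuery agent, per the note above) then take effect on the next container run, without rebuilding the image.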

Thank you very much for your response. I would like to try it on JetPack 5 first, and if it doesn’t work, I will upgrade to JetPack 6.
