Trouble running Llamaspeak on AGX Orin 64GB

System and Settings
Device: Jetson AGX Orin Developer Kit
SDK: JetPack 5.1.2 [L4T 35.4.1]

Steps to reproduce:

  1. Follow the llamaspeak tutorial (NVIDIA Jetson AI Lab)
  2. As part of the tutorial, run the following command:

jetson-containers run --env HUGGINGFACE_TOKEN= \
  $(autotag nano_llm) \
  python3 -m nano_llm.agents.web_chat --api=mlc \
    --model meta-llama/Meta-Llama-3-8B-Instruct \
    --asr=riva --tts=piper

Expected Outcome: “llamaspeak” enabled with text LLM and ASR/TTS

Actual Outcome:
The system gets stuck at the step below, and I have to press Ctrl+C to get back to the prompt:

docker run --runtime nvidia -it --rm --network host --volume /tmp/argus_socket:/tmp/argus_socket --volume /etc/enctune.conf:/etc/enctune.conf --volume /etc/nv_tegra_release:/etc/nv_tegra_release --volume /tmp/nv_jetson_model:/tmp/nv_jetson_model --volume /var/run/dbus:/var/run/dbus --volume /var/run/avahi-daemon/socket:/var/run/avahi-daemon/socket --volume /var/run/docker.sock:/var/run/docker.sock --volume /ssd/jetson-containers/data:/data --device /dev/snd --device /dev/bus/usb -e DISPLAY=:0 -v /tmp/.X11-unix/:/tmp/.X11-unix -v /tmp/.docker.xauth:/tmp/.docker.xauth -e XAUTHORITY=/tmp/.docker.xauth --env HUGGINGFACE_TOKEN=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 dustynv/nano_llm:r35.4.1 python3 -m nano_llm.agents.web_chat --api=mlc --model meta-llama/Meta-Llama-3-8B-Instruct --asr=riva --tts=piper
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.18) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
/usr/local/lib/python3.8/dist-packages/transformers/utils/hub.py:124: FutureWarning: Using TRANSFORMERS_CACHE is deprecated and will be removed in v5 of Transformers. Use HF_HOME instead.
warnings.warn(
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.18) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
/usr/local/lib/python3.8/dist-packages/transformers/utils/hub.py:124: FutureWarning: Using TRANSFORMERS_CACHE is deprecated and will be removed in v5 of Transformers. Use HF_HOME instead.
warnings.warn(
Process Process-1:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/opt/NanoLLM/nano_llm/plugins/process_proxy.py", line 103, in run_process
from cuda.cudart import (
ImportError: cannot import name 'cudaInitDevice' from 'cuda.cudart' (/usr/local/lib/python3.8/dist-packages/cuda/cudart.cpython-38-aarch64-linux-gnu.so)

Hi @augustmille, I just rebuilt the container for JetPack 5, can you try pulling dustynv/nano_llm:r35.4.1 again?
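
For reference, re-pulling the rebuilt image is just the standard Docker pull, shown here for the JetPack 5 tag from this thread:

docker pull dustynv/nano_llm:r35.4.1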

Hi @dusty_nv,
Thanks for your quick response.
I saw that most of the containers of interest required JetPack 6. Therefore, to make my life a bit easier, I updated from JetPack 5.1.2 [L4T 35.4.1] to JetPack 6.0-b52.
I'll post an update shortly on whether the llamaspeak tutorial (NVIDIA Jetson AI Lab) works.

Quick update (being verbose for anyone else encountering a similar issue):

  • Flashed JetPack 6 on my AGX Orin; however, I ran into too many compatibility issues that I simply did not have the bandwidth to resolve, so I went back to JetPack 5.1.2.
  • @dusty_nv With JetPack 5.1.2 and your newly rebuilt container, it now works. Thanks for your help.

My environment is also:

Device: Jetson AGX Orin Developer Kit
SDK: JetPack 5.1.2 [L4T 35.4.1]

and I used the newest Docker image dustynv/nano_llm:r35.4.1 (updated May 2, 2024 at 12:22 pm) to run the same command:

jetson-containers run --env HUGGINGFACE_TOKEN= \
  $(autotag nano_llm) \
  python3 -m nano_llm.agents.web_chat --api=mlc \
    --model meta-llama/Meta-Llama-3-8B-Instruct \
    --asr=riva --tts=piper

This works fine.

But when I use this container to run Live LLaVA (Live LLaVA - NVIDIA Jetson AI Lab), I run into trouble, and the error message looks the same:

-- L4T_VERSION=35.4.1 JETPACK_VERSION=5.1.2 CUDA_VERSION=11.4
-- Finding compatible container image for ['nano_llm']
dustynv/nano_llm:r35.4.1
localuser:root being added to access control list

docker run --runtime nvidia -it --rm --network host --volume /tmp/argus_socket:/tmp/argus_socket --volume /etc/enctune.conf:/etc/enctune.conf --volume /etc/nv_tegra_release:/etc/nv_tegra_release --volume /tmp/nv_jetson_model:/tmp/nv_jetson_model --volume /var/run/dbus:/var/run/dbus --volume /var/run/avahi-daemon/socket:/var/run/avahi-daemon/socket --volume /var/run/docker.sock:/var/run/docker.sock --volume /home/lq/ai/code/jetson-containers/data:/data --device /dev/snd --device /dev/bus/usb -e DISPLAY=:0 -v /tmp/.X11-unix/:/tmp/.X11-unix -v /tmp/.docker.xauth:/tmp/.docker.xauth -e XAUTHORITY=/tmp/.docker.xauth dustynv/nano_llm:r35.4.1 python3 -m nano_llm.agents.video_query --api=mlc --model Efficient-Large-Model/VILA1.5-3b --max-context-len 256 --max-new-tokens 32 --video-input /dev/video0 --video-output webrtc://@:8554/output
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.18) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
/usr/local/lib/python3.8/dist-packages/transformers/utils/hub.py:124: FutureWarning: Using TRANSFORMERS_CACHE is deprecated and will be removed in v5 of Transformers. Use HF_HOME instead.
warnings.warn(
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.18) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
/usr/local/lib/python3.8/dist-packages/transformers/utils/hub.py:124: FutureWarning: Using TRANSFORMERS_CACHE is deprecated and will be removed in v5 of Transformers. Use HF_HOME instead.
warnings.warn(
Process Process-1:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/opt/NanoLLM/nano_llm/plugins/process_proxy.py", line 103, in run_process
from cuda.cudart import (
ImportError: cannot import name 'cudaInitDevice' from 'cuda.cudart' (/usr/local/lib/python3.8/dist-packages/cuda/cudart.cpython-38-aarch64-linux-gnu.so)

@dusty_nv Can you help me figure out where the problem is? Do I need to open a new issue?

Hi @261142960, sorry for the delay - I would recommend using JetPack 6 for the latest (it's what the Live Llava page lists as the supported JetPack version for that tutorial).

On JetPack 5, you would need to disable the use of the ProcessProxy in the VideoQuery agent. If you want to edit the NanoLLM source, the easiest way is to clone it to your device and mount it into the container over the /opt/NanoLLM location - then the container will run your local source tree instead, and you can make edits from outside the container (see the sketch below).
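
A minimal sketch of that workflow - assuming the NanoLLM sources live at https://github.com/dusty-nv/NanoLLM and that jetson-containers forwards extra -v mount flags through to docker run; adjust paths to your setup:

# clone the NanoLLM sources to the device
git clone https://github.com/dusty-nv/NanoLLM

# mount the local tree over /opt/NanoLLM so the container runs it
# instead of the copy baked into the image
jetson-containers run \
  -v $(pwd)/NanoLLM:/opt/NanoLLM \
  $(autotag nano_llm) \
  python3 -m nano_llm.agents.video_query --api=mlc \
    --model Efficient-Large-Model/VILA1.5-3b \
    --video-input /dev/video0 --video-output webrtc://@:8554/output

Edits made in the local NanoLLM tree (for example, disabling the ProcessProxy in the VideoQuery agent, per the note above) then take effect on the next container run, without rebuilding the image.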

Thank you very much for your response. I would like to try it on JetPack 5 first, and if it doesn’t work, I will upgrade to JetPack 6.
