Jetson Container `nano_llm` version 24.6-r36.2.0 error on JetPack 6.0 DP

Situation: Downloaded the nano_llm Docker image on a Jetson Orin Nano Developer Kit running JetPack 6.0 [L4T 36.2.0]:

dustynv/nano_llm Tags | Docker Hub

docker pull dustynv/nano_llm:24.6-r36.2.0

then executed:

jetson-containers run $(autotag nano_llm) python3 -m nano_llm.agents.web_chat --api=mlc --model meta-llama/Meta-Llama-3-8B-Instruct --asr=riva --tts=piper

Expectation: The container pulls any necessary images and starts llamaspeak with the text LLM and ASR/TTS enabled.

Actual: I get the error below and I'm kicked back out to the host prompt.

@dusty_nv
From the call stack, it seems this commit introduced the problem:
added whisper_trt and VAD

05:28:09 | INFO | using chat template 'llama-3' for model Meta-Llama-3-8B-Instruct
05:28:09 | INFO | model 'Meta-Llama-3-8B-Instruct', chat template 'llama-3' stop tokens:  ['<|end_of_text|>', '<|eot_id|>'] -> [128001, 128009]
05:28:09 | INFO | Warming up LLM with query 'What is 2+2?'
05:28:11 | INFO | Warmup response:  'Easy peasy!\n\nThe answer to 2+2 is... 4!<|eot_id|>'
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/NanoLLM/nano_llm/agents/web_chat.py", line 327, in <module>
    agent = WebChat(**vars(args))
  File "/opt/NanoLLM/nano_llm/agents/web_chat.py", line 32, in __init__
    super().__init__(**kwargs)
  File "/opt/NanoLLM/nano_llm/agents/voice_chat.py", line 42, in __init__
    self.vad = VADFilter(**kwargs).add(self.asr) if self.asr else None
  File "/opt/NanoLLM/nano_llm/plugins/audio/vad_filter.py", line 35, in __init__
    self.vad = load_vad()
  File "/opt/whisper_trt/whisper_trt/vad.py", line 134, in load_vad
    make_cache_dir()
  File "/opt/whisper_trt/whisper_trt/cache.py", line 36, in make_cache_dir
    os.makedirs(_CACHE_DIR)
  File "/usr/lib/python3.10/os.py", line 225, in makedirs
    mkdir(name, mode)
FileExistsError: [Errno 17] File exists: '/root/.cache/whisper_trt'

It seems the following symbolic link causes the problem:

/root/.cache/whisper_trt → /data/models/whisper

Removing the whisper_trt symbolic link works around the problem.

root@jetbot-nx:~/.cache# ls -al
total 8
drwxr-xr-x 2 root root 4096 Jun  9 03:39 .
drwx------ 1 root root 4096 Jun  9 03:39 ..
lrwxrwxrwx 1 root root   20 Jun  9 03:39 whisper -> /data/models/whisper
lrwxrwxrwx 1 root root   20 Jun  9 03:39 whisper_trt -> /data/models/whisper
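For anyone curious why `mkdir` trips over this: `os.makedirs()` without `exist_ok=True` raises `FileExistsError` when the target already exists, even when the target is a symlink pointing at a perfectly valid directory. A minimal reproduction (the paths here are stand-ins for `/root/.cache/whisper_trt` and `/data/models/whisper`):

```python
import os
import tempfile

# Build a symlinked "cache dir" like the one jetson-containers pre-creates.
base = tempfile.mkdtemp()
target = os.path.join(base, "models")       # stands in for /data/models/whisper
link = os.path.join(base, "whisper_trt")    # stands in for /root/.cache/whisper_trt
os.makedirs(target)
os.symlink(target, link)

try:
    os.makedirs(link)                       # what make_cache_dir() effectively did
    failed = False
except FileExistsError:
    failed = True

print(failed)                               # True: mkdir() refuses the existing symlink
os.makedirs(link, exist_ok=True)            # the tolerant variant succeeds, since
                                            # os.path.isdir() follows the symlink
```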

Sorry for the late response.
Is this still an issue you need support with? Are there any results you can share?

This issue reproduces on the most recent nano_llm r36.2 build. Below are the repro steps:

docker pull dustynv/nano_llm:r36.2.0

jetbot@jetbot-nx:~/dev_ws$ docker images
REPOSITORY                        TAG                  IMAGE ID       CREATED        SIZE
dustynv/nano_llm                  r36.2.0              b5df528a467a   39 hours ago   26.2GB
nvcr.io/nvidia/riva/riva-speech   2.15.1-l4t-aarch64   e783607160b9   7 weeks ago    14.5GB
dustynv/riva-client               python-r36.2.0       1c499de01dd3   3 months ago   847MB

  • start RIVA server
    (nvcr.io/nvidia/riva/riva-speech:2.15.1-l4t-aarch64)

  • start nano_llm container
    jetson-containers run dustynv/nano_llm:r36.2.0

  • run
    python3 -m nano_llm.agents.web_chat --api=mlc --model meta-llama/Meta-Llama-3-8B-Instruct --asr=riva --tts=piper

Result:
19:34:22 | INFO | Warming up LLM with query 'What is 2+2?'
19:34:24 | INFO | Warmup response:  'Easy peasy!\n\nThe answer to 2+2 is... 4!<|eot_id|>'
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/NanoLLM/nano_llm/agents/web_chat.py", line 327, in <module>
    agent = WebChat(**vars(args))
  File "/opt/NanoLLM/nano_llm/agents/web_chat.py", line 32, in __init__
    super().__init__(**kwargs)
  File "/opt/NanoLLM/nano_llm/agents/voice_chat.py", line 42, in __init__
    self.vad = VADFilter(**kwargs).add(self.asr) if self.asr else None
  File "/opt/NanoLLM/nano_llm/plugins/speech/vad_filter.py", line 46, in __init__
    self.vad = load_vad()
  File "/opt/whisper_trt/whisper_trt/vad.py", line 136, in load_vad
    make_cache_dir()
  File "/opt/whisper_trt/whisper_trt/cache.py", line 36, in make_cache_dir
    os.makedirs(_CACHE_DIR)
  File "/usr/lib/python3.10/os.py", line 225, in makedirs
    mkdir(name, mode)
FileExistsError: [Errno 17] File exists: '/root/.cache/whisper_trt'

root@jetbot-nx:~/.cache# ls -al
total 8
drwxr-xr-x 1 root root 4096 Jul  2 04:17 .
drwx------ 1 root root 4096 Jul  2 04:16 ..
lrwxrwxrwx 1 root root   17 Jul  2 04:16 clip_trt -> /data/models/clip
lrwxrwxrwx 1 root root   20 Jul  2 04:17 whisper -> /data/models/whisper
lrwxrwxrwx 1 root root   20 Jul  2 04:17 whisper_trt -> /data/models/whisper

Removing the whisper_trt symbolic link works around this issue.

Note: on a Jetson AGX Orin 64GB Developer Kit with the same setup, it runs into a different issue, and I can't find a workaround to unblock my work.

18:23:57 | INFO | Warming up LLM with query 'What is 2+2?'
18:23:58 | INFO | Warmup response:  'Easy peasy!\n\nThe answer to 2+2 is... 4!<|eot_id|>'
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/NanoLLM/nano_llm/agents/web_chat.py", line 327, in <module>
    agent = WebChat(**vars(args))
  File "/opt/NanoLLM/nano_llm/agents/web_chat.py", line 32, in __init__
    super().__init__(**kwargs)
  File "/opt/NanoLLM/nano_llm/agents/voice_chat.py", line 42, in __init__
    self.vad = VADFilter(**kwargs).add(self.asr) if self.asr else None
  File "/opt/NanoLLM/nano_llm/plugins/speech/vad_filter.py", line 46, in __init__
    self.vad = load_vad()
  File "/opt/whisper_trt/whisper_trt/vad.py", line 145, in load_vad
    raise RuntimeError("The MD5 Checksum for {path} does not match the expected value.")
RuntimeError: The MD5 Checksum for {path} does not match the expected value.

Thanks @jenhungho, just addressed this with NVIDIA-AI-IOT/whisper_trt@6414cd0 ("patch for symlinked cache dir") and am rebuilding the NanoLLM container now, so do a `docker pull dustynv/nano_llm:r36.2.0` later to update it.
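For reference, the usual fix for this mkdir-on-a-symlink pattern looks something like the sketch below (my guess at the shape of the patch, not the actual commit: `os.path.isdir()` follows symlinks, so a link to `/data/models/whisper` passes the check):

```python
import os
import tempfile

def make_cache_dir(cache_dir: str) -> None:
    # Tolerate a pre-existing directory *or* a symlink to one before
    # attempting to create it; plain os.makedirs() would raise
    # FileExistsError on the symlink that jetson-containers mounts in.
    if not os.path.isdir(cache_dir):
        os.makedirs(cache_dir)

# Exercise it against a symlinked cache dir like the one in the report.
base = tempfile.mkdtemp()
real = os.path.join(base, "models")
link = os.path.join(base, "whisper_trt")
os.makedirs(real)
os.symlink(real, link)

make_cache_dir(link)                          # no FileExistsError this time
make_cache_dir(os.path.join(base, "fresh"))   # still creates missing dirs
```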

This one I had previously patched with NVIDIA-AI-IOT/whisper_trt@0c40ee2 ("rolled back Silero model") and had already rebuilt the container to include it, so perhaps you weren't running the latest build. You can also fall back to previous versions of the container if you face issues in the future that appear after updates:

https://hub.docker.com/r/dustynv/nano_llm/tags


@dusty_nv Much appreciated. The problem is solved on the AGX Orin Developer Kit.
The "RuntimeError: The MD5 Checksum for {path} does not match the expected value." was caused by a stale cached file, silero_vad.onnx, under the ~/data/models/whisper folder. I deleted it manually and the problem was solved.

jetbot@ubuntu:~/dev_ws/jetson-containers/data/models/whisper$ ls -al
total 1776
drwxrwxr-x  2 jetbot jetbot    4096 Jul  3 18:41 .
drwxrwxr-x 13 jetbot jetbot    4096 Jun 28 16:56 ..
-rw-r--r--  1 root   root   1807522 Jul  3 18:41 silero_vad.onnx
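In case it helps anyone else hitting the checksum error: before deleting a cached model, you can hash it yourself and compare against the value the library expects (`md5sum` below is a hypothetical helper for illustration, not part of whisper_trt):

```python
import hashlib
import tempfile

def md5sum(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large models don't need to fit in RAM."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Demo on a throwaway file; in practice you would point this at e.g.
# data/models/whisper/silero_vad.onnx and, if the digest doesn't match
# the expected value, delete the file and let it re-download.
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(b"hello")
tmp.close()
checksum = md5sum(tmp.name)
```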