To bypass that WebRTC crash, I temporarily switched to the r36.3 container as suggested by another user, which allowed the UI to load, but now I get a new fatal runtime error (attached log run_r36.3.txt):
CUDA error: the provided PTX was compiled with an unsupported toolchain.
Fatal Python error: Aborted
...
File "/opt/NanoLLM/nano_llm/models/mlc.py", line 436 in prefill
...
When using the r36.3 container (CUDA 12.2 / TRT 8.6.2) on the r36.4.4 host (CUDA 12.6 / TRT 10.3.0), the TensorRT vision encoder runs fine, but the MLC/TVM JIT stage in mlc.py fails with the PTX toolchain error.
If I return to the r36.4.0 container, CUDA versions match (12.6), but WebRTC crashes again (“sinkpad should not be nullptr”).
The LD_LIBRARY_PATH already includes /usr/local/cuda-12.6/compat, but the error persists when mixing 12.6 host + 12.2 container.
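To make the mismatch explicit, the CUDA release reported by `nvcc --version` on the host and inside the container can be compared mechanically. A minimal sketch, where the sample `nvcc` output strings are illustrative stand-ins for the 12.6/12.2 values from this thread (on a real system, pipe the actual `nvcc --version` output into the parser):

```shell
#!/usr/bin/env bash
# Extract the "release X.Y" field from `nvcc --version` output so the
# host and container CUDA toolkits can be compared mechanically.
parse_cuda_release() {
    grep -o 'release [0-9]*\.[0-9]*' | head -n1 | cut -d' ' -f2
}

# Illustrative sample strings (substitute real `nvcc --version` output):
host_ver=$(echo "Cuda compilation tools, release 12.6" | parse_cuda_release)
container_ver=$(echo "Cuda compilation tools, release 12.2" | parse_cuda_release)

if [ "$host_ver" != "$container_ver" ]; then
    echo "CUDA mismatch: host $host_ver vs container $container_ver"
else
    echo "CUDA versions match ($host_ver)"
fi
```

With the versions in this thread, the script reports `CUDA mismatch: host 12.6 vs container 12.2`, which is exactly the combination that trips the PTX JIT.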
Questions:
Is this PTX error expected when mixing JetPack 6.2.1 (host r36.4 = CUDA 12.6) with an r36.3 container (CUDA 12.2)?
Does NVIDIA plan to release a fixed NanoLLM / Jetson AI Lab container for r36.4 (CUDA 12.6) that resolves the WebRTC “sinkpad nullptr” issue?
Would you recommend keeping the r36.4.x container (CUDA 12.6) and disabling WebRTC, or downgrading the host JetPack to r36.3 to match the container?
Any guidance on maintaining compatibility between JetPack 6.2.1 (r36.4.4) and the NanoLLM / OpenVLA container stack would be very helpful.
Just to clarify — I am already running with the NanoLLM r36.3.0 container, and that’s exactly where I’m hitting the PTX error. Here are the concrete details:
Host (JetPack 6.2.1 / r36.4.4): CUDA 12.6, TRT 10.3
Container (r36.3.0): CUDA 12.2, PyTorch cu122, TRT 8.6.2
Behavior: TRT-based vision encoder runs, but during MLC/TVM JIT I get: CUDA error: the provided PTX was compiled with an unsupported toolchain.
When I switch back to the r36.4.x container (so CUDA 12.6 matches the host), the PTX error goes away, but then WebRTC crashes with webrtcbin sinkpad should not be nullptr.
So it looks like:
r36.3.0 container → video OK, PTX crash (CUDA 12.2 vs 12.6 mismatch)
r36.4.x container → CUDA OK, WebRTC crash
I did update jetson-containers. If there’s a specific “validation” command/script you want me to run for the r36.3.0 OpenVLA setup, please share the exact command and I’ll post the output. Otherwise, could NVIDIA confirm the recommended host/container combination for OpenVLA on JetPack 6.2.1, and whether an updated r36.4.x container is planned to address the WebRTC issue?
So even with this exact configuration, the issue persists on my Jetson Orin.
Could someone from the NVIDIA engineering team please take a look or connect us to the right engineer who can help with this PTX toolchain error?
Do you mean you hit the PTX error with dustynv/openvla:r36.4.3-cu128-cp312-24.04 on a JetPack 6.2.1 environment?
If so, could you run a simple CUDA sample to verify the functionality first?
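Alongside a CUDA sample, it can help to check which libcuda the container actually resolves, since on Jetson the compat libraries under /usr/local/cuda-*/compat can shadow the host driver. A quick diagnostic sketch using only standard tooling (nothing container-specific is assumed):

```shell
# Show every libcuda the dynamic linker knows about; the first hit is
# the one a CUDA app will load, and the host driver's JIT must accept
# the toolchain that produced the PTX it is asked to compile.
ldconfig -p | grep libcuda || echo "libcuda not in linker cache"

# Libraries injected via LD_LIBRARY_PATH (e.g. the cuda-12.6/compat
# directory mentioned above) take precedence over the linker cache,
# so print that too.
echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-<unset>}"
```

Running this on both the host and inside each container, and comparing which libcuda wins, can narrow down whether the compat path is actually being picked up.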
I tried running a CUDA sample as you suggested.
The host CUDA (12.6) and the NanoLLM r36.4 container (CUDA 12.6) work correctly,
but CUDA inside the NanoLLM r36.3 container does not work properly.
In that thread, paulrrh suggested trying NanoLLM r36.3 instead.
However, when I run it with host L4T 36.4.4 / CUDA 12.6 and container NanoLLM r36.3 (CUDA 12.2),
I get a PTX error that appears to be caused by the CUDA version mismatch.
Could you please let me know how to resolve this issue?
I’d really appreciate any guidance.
It’s recommended to use an r36.4.3 image instead, as it is the latest one.
Have you tried to verify the WebRTC functionality in dustynv/openvla:r36.4.3-cu128-cp312-24.04?
My issue happens when launching Agent Studio from NanoLLM (not when running the OpenVLA image directly). I’m using:
jetson-containers run $(autotag nano_llm) \
python3 -m nano_llm.studio --load OpenVLA-MimicGen-INT4
This resolves to NanoLLM r36.4.0 on my system.
From inside that container, I checked and there is no openvla pip package nor /opt/openvla/VERSION, so it seems nano_llm.studio loads an OpenVLA-compatible backend via its own plugin/runtime rather than the dustynv/openvla Docker image.
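The check I ran inside the container was roughly the following (a sketch assuming the standard pip and filesystem layout; /opt/openvla/VERSION is the marker path mentioned above):

```shell
# Inside the nano_llm container: look for any OpenVLA installation marker.
if command -v pip3 >/dev/null 2>&1 && pip3 list 2>/dev/null | grep -qi openvla; then
    echo "openvla pip package found"
else
    echo "no openvla pip package"
fi

if [ -f /opt/openvla/VERSION ]; then
    echo "OpenVLA build: $(cat /opt/openvla/VERSION)"
else
    echo "no /opt/openvla/VERSION"
fi
```

Both checks came back negative for me, which is why I concluded the backend is loaded through NanoLLM's own plugin/runtime.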
For completeness, I also tried running:
jetson-containers run dustynv/openvla:r36.4.3-cu128-cp312-24.04
But as expected, I can’t run nano_llm.studio there because it’s an OpenVLA-only image (no NanoLLM/Studio).
Questions:
When I run nano_llm.studio (as above), which OpenVLA backend/build is actually used under the hood? Is there an official mapping for NanoLLM r36.4.0 → OpenVLA (e.g., r36.4.3-cu128-cp312-24.04), or does NanoLLM just load model weights from HuggingFace via its plugin?
Is there a supported way to pin or print the exact OpenVLA backend version/commit used by nano_llm.studio? (e.g., an env var/flag like --verbose, OPENVLA_IMAGE=..., or a recommended log command)
Given my host is JP 6.2.1 (L4T 36.4.4 / CUDA 12.6), which NanoLLM tag do you recommend to avoid the WebRTC crash while keeping Agent Studio working? A concrete command without $(autotag) would be helpful, for example:
jetson-containers run dustynv/nano_llm:<which-r36.4.x?> \
python3 -m nano_llm.studio --load OpenVLA-MimicGen-INT4
If you can point me to the exact tag/flags, I’ll run it and share back the logs.
We are double-checking your question; here are some initial replies.
We no longer have plans to update NanoLLM.
Given the fast pace of the LLM field, we prefer microservice-based frameworks, such as vLLM, instead.
However, the r36.4 NanoLLM image is expected to work.
Instead of downgrading, we prefer to fix the WebRTC issue in the latest container, since downgrading might trigger other compatibility issues.
We will check what is going on in the dustynv/nano_llm:r36.4.0 and share more info with you later.
Thanks.