Can't run gen AI model on AGX Orin with JetPack 6.0 GA

Hi,
I'm using JetPack 6.0 GA (36.3) to run NanoOWL on a Jetson AGX Orin 32GB. (NanoOWL - NVIDIA Jetson AI Lab)
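For reference, I launched the demo roughly like this (the container tag and paths follow the NanoOWL tutorial; adjust them if yours differ):

$ docker run --rm -it --runtime nvidia --network host dustynv/nanoowl:r36.2.0
# inside the container:
$ cd /opt/nanoowl/examples/tree_demo
$ python3 tree_demo.py ../../data/owl_image_encoder_patch32.engine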
When I ran it, I got the following error:

/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py:124: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/torch/functional.py:507: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/pytorch/aten/src/ATen/native/TensorShape.cpp:3549.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
Traceback (most recent call last):
  File "/opt/nanoowl/examples/tree_demo/tree_demo.py", line 48, in <module>
    owl_predictor=OwlPredictor(
  File "/opt/nanoowl/nanoowl/owl_predictor.py", line 171, in __init__
    image_encoder_engine = OwlPredictor.load_image_encoder_engine(image_encoder_engine, image_encoder_engine_max_batch_size)
  File "/opt/nanoowl/nanoowl/owl_predictor.py", line 382, in load_image_encoder_engine
    import tensorrt as trt
  File "/usr/lib/python3.10/dist-packages/tensorrt/__init__.py", line 67, in <module>
    from .tensorrt import *
ImportError: libnvdla_compiler.so: cannot open shared object file: No such file or directory

I had verified that NanoOWL could run on JetPack 6.0 DP (36.2).
BTW, I found that there is no 36.3 version of the L4T container. How can I run NanoOWL on JetPack 6.0 GA?

Hi @tim_lin1, I'm able to run dustynv/nanoowl:r36.2.0 on JetPack 6.0 GA / L4T R36.3.0 - the 36.2/36.3 containers are compatible. Regardless, I have kicked off a build of nanoowl:r36.3.0 and will push that image as well.
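For example, this is how I'm starting that image on R36.3.0 (display/camera options omitted for brevity):

docker run --rm -it --runtime nvidia --network host dustynv/nanoowl:r36.2.0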

Was docker working for you before with R36.2 and then you upgraded to R36.3 with apt? Have you tried using TensorRT in other containers? Can you try this -

docker run --rm --runtime nvidia \
  nvcr.io/nvidia/l4t-tensorrt:r8.6.2-runtime \
    python3 -c 'import tensorrt; print(tensorrt.__version__)'

libnvdla_compiler.so is part of the L4T drivers that get mounted into the container when --runtime nvidia is used. If you still face the issue above, can you check that it exists on the host -

$ cat /etc/nvidia-container-runtime/host-files-for-container.d/drivers.csv | grep libnvdla*
lib, /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so
lib, /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_runtime.so

$ ls -ll /usr/lib/aarch64-linux-gnu/nvidia/libnvdla*
-rw-r--r-- 1 root root 8159168 Apr 24 23:05 /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so
-rw-r--r-- 1 root root 6499168 Apr 24 23:05 /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_runtime.so
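You can also verify that those host libraries actually get mounted into a container by --runtime nvidia (on JetPack 6 they should show up at the same path that drivers.csv lists):

docker run --rm --runtime nvidia \
  nvcr.io/nvidia/l4t-tensorrt:r8.6.2-runtime \
    bash -c 'ls -l /usr/lib/aarch64-linux-gnu/nvidia/libnvdla*'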

Hi,
I tried the L4T TensorRT container, and I got the same error message.

==========
== CUDA ==
==========

CUDA Version 12.2.12

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.10/dist-packages/tensorrt/__init__.py", line 67, in <module>
    from .tensorrt import *
ImportError: libnvdla_compiler.so: cannot open shared object file: No such file or directory

I checked drivers.csv, and both entries are listed there.

$ cat /etc/nvidia-container-runtime/host-files-for-container.d/drivers.csv | grep libnvdla*
lib, /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so
lib, /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_runtime.so

But libnvdla_compiler.so itself is missing on the host.

$ ls -ll /usr/lib/aarch64-linux-gnu/nvidia/libnvdla*
-rw-r--r-- 1 root root 6499168 May  6 17:26 /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_runtime.so

How should I fix it?

Hi,

I copied libnvdla_compiler.so from a 36.2 installation, and the TensorRT container works now.
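Roughly what I did (the source path is just where I had a 36.2 copy extracted - adjust it to your own 36.2 rootfs):

$ sudo cp <36.2_rootfs>/usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so \
    /usr/lib/aarch64-linux-gnu/nvidia/
$ sudo ldconfig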

$ docker run --rm --runtime nvidia \
  nvcr.io/nvidia/l4t-tensorrt:r8.6.2-runtime \
    python3 -c 'import tensorrt; print(tensorrt.__version__)'

==========
== CUDA ==
==========

CUDA Version 12.2.12

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

8.6.2

I tried to re-flash 36.3, but the issue still exists.
How can I make sure libnvdla_compiler.so exists after I flash the image?

@tim_lin1, those drivers should be installed automatically by SDK Manager after you flash the device with JetPack 6 - did that process complete without errors for you?
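If it is still missing after a flash, you can check which Debian package is supposed to provide it and reinstall that package:

$ dpkg -S libnvdla_compiler.so
$ sudo apt-get install --reinstall <package reported by dpkg above>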

It would appear from your latest post that it is working as expected now, because you are able to import tensorrt in the container without issue.
