Trt_pose model in docker: ImportError: libnvmedia_tensor.so: cannot open shared object file: No such file or directory

Hello,

I am trying to make the trt_pose model (NVIDIA-AI-IOT/trt_pose: Real-time pose estimation accelerated with NVIDIA TensorRT (github.com)) work inside a Docker container on a Jetson Nano.

I build the image as described here: nvidia / container-images / l4t-jetpack · GitLab.

Then I use this command to get into the container:

sudo docker run -it --rm --net=host --runtime nvidia -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix nvcr.io/nvidia/l4t-jetpack:r35.1.0

It builds and runs as it should. Inside the Docker container, I clone the trt_pose repo and follow the instructions provided in the description. Then I try to install the dependencies using:

sudo python3 setup.py install --plugins

but it throws an error:

  File "setup.py", line 2, in <module>
    import tensorrt
  File "/usr/lib/python3.8/dist-packages/tensorrt/__init__.py", line 68, in <module>
    from .tensorrt import *
ImportError: libnvmedia_tensor.so: cannot open shared object file: No such file or directory

I know that I am supposed to have TensorRT installed on the host machine as well (the Jetson Nano in this case), and that my problem is related to this issue: Issue with tensorrt:r8.2.1 l4t container, Import error libnvmedia.so: cannot open shared object file: - Jetson & Embedded Systems / Jetson TX1 - NVIDIA Developer Forums. However, I was unable to get it to work. Do I need to map something else into the container?

Thank you in advance. Any help will be much appreciated!

Hi @user156593, this container is for JetPack 5, whereas you are running JetPack 4 on your Nano, so it won’t work correctly with GPU acceleration - instead, please use the l4t-base container that is compatible with the version of JetPack-L4T that you are running (you can check your L4T version with cat /etc/nv_tegra_release)

Yes, on JetPack 4 you need CUDA/cuDNN/TensorRT installed onto your device, because they get mounted into your container dynamically when --runtime nvidia is used (and hence l4t-base will have all the JetPack components present). On JetPack 5, it moved to having CUDA/cuDNN/TensorRT installed into the containers themselves (and hence a l4t-jetpack container was created for this purpose)

Hello and thank you for your quick answer!

I saw your reply two days ago already, but didn't want to answer yet so the topic wouldn't be closed. Indeed, as you said, the image was incorrect. Below I will describe, also for others, the steps I took to run the trt_pose model inside a container on a Jetson Nano (NOTE: still not perfect!).

I checked my L4T version using cat /etc/nv_tegra_release, which printed: R32 (release), REVISION: 7.1. Then I checked this page: NVIDIA L4T ML | NVIDIA NGC to see which version of the ML image I needed to use - in my case JetPack 4.6.1 (L4T R32.7.1).

Then I created a Dockerfile named Dockerfile.jetson:

ARG TAG
FROM nvcr.io/nvidia/l4t-ml:${TAG}

# Install any utils needed for execution
RUN apt-get update \
	&& apt-get install -y --no-install-recommends sudo \
    && rm -rf /var/lib/apt/lists/* \
    && apt-get clean

RUN sudo pip3 install tqdm cython pycocotools \
	&& sudo apt-get update \
	&& sudo apt-get install -y python3-matplotlib \
	&& sudo rm -rf /var/lib/apt/lists/*

WORKDIR /app


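# Clone and install torch2trt (with its TensorRT plugins)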
RUN if [ -d /app/torch2trt ]; then \
	echo "torch2trt already cloned."; \
	else \
	git clone --depth 1 https://github.com/NVIDIA-AI-IOT/torch2trt \
	&& cd torch2trt \
	&& sudo python3 setup.py install --plugins; \
	fi


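# Clone and install trt_pose itself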
RUN if [ -d /app/trt_pose ]; then \
	echo "trt_pose already cloned."; \
	else \
	git clone --depth 1 https://github.com/NVIDIA-AI-IOT/trt_pose \
	&& cd trt_pose \
	&& sudo python3 setup.py install; \
	fi

and a Makefile (TAG adjusted based on the L4T version):

TAG     ?= r32.7.1-py3
L4T_JETPACK_REGISTRY ?= "nvcr.io/nvidia/l4t-ml"

image:
	docker build -t $(L4T_JETPACK_REGISTRY):$(TAG) \
		--build-arg "TAG=$(TAG)" \
		-f ./Dockerfile.jetson .

run:
	docker run -it --rm --runtime nvidia --network host \
	-e DISPLAY=$(DISPLAY) \
	-v /tmp/.X11-unix/:/tmp/.X11-unix $(L4T_JETPACK_REGISTRY):$(TAG)

all:
	sudo make image ; sudo make run

Then I call sudo make all, which pulls the base image, builds the container image, and runs the container. Everything looks fine and I am able to run the model at 8 FPS (less than the stated 22 FPS) using the script included in the repo: trt_pose/tasks/human_pose/live_demo.ipynb.

I think something is still wrong, because during the execution of the line:

model_trt = torch2trt.torch2trt(model, [data], fp16_mode=True, max_workspace_size=1<<25)

it throws the error [E] 3: [builderConfig.cpp::canRunOnDLA::382] Error Code 3: API Usage Error (Parameter check failed at: optimizer/api/builderConfig.cpp::canRunOnDLA::382, condition: dlaEngineCount > 0) while the original torch model is being optimized.

Also, the line where the model should be saved:
torch.save(model_trt.state_dict(), OPTIMIZED_MODEL)

throws the error:

Traceback (most recent call last):
  File "trt.py", line 74, in <module>
    torch.save(model_trt.state_dict(), OPTIMIZED_MODEL)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 379, in save
    _save(obj, opened_zipfile, pickle_module, pickle_protocol)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 484, in _save
    pickler.dump(obj)
MemoryError
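
For context, the conversion-and-save part of my script (taken more or less directly from live_demo.ipynb, so the file names and the 224x224 input size below follow the human_pose example in the repo) looks roughly like this:

import json

import torch
import torch2trt
import trt_pose.models

# Load the human_pose description shipped with the repo
with open('human_pose.json', 'r') as f:
    human_pose = json.load(f)

num_parts = len(human_pose['keypoints'])
num_links = len(human_pose['skeleton'])

# Build the PyTorch model and load the pretrained weights
model = trt_pose.models.resnet18_baseline_att(num_parts, 2 * num_links).cuda().eval()
model.load_state_dict(torch.load('resnet18_baseline_att_224x224_A_epoch_249.pth'))

# Dummy input that torch2trt uses to trace and optimize the network
WIDTH, HEIGHT = 224, 224
data = torch.zeros((1, 3, HEIGHT, WIDTH)).cuda()

# The DLA error above is printed during this call
model_trt = torch2trt.torch2trt(model, [data], fp16_mode=True, max_workspace_size=1 << 25)

# The MemoryError above is raised while saving the optimized weights
OPTIMIZED_MODEL = 'resnet18_baseline_att_224x224_A_epoch_249_trt.pth'
torch.save(model_trt.state_dict(), OPTIMIZED_MODEL)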

Any idea how I should tackle this? Thank you in advance!

I’m not familiar with the trt_pose code itself (you may want to file an issue against the trt_pose GitHub), but is your Nano running in 5W or 10W mode? (you can check this with sudo nvpmodel -q and set it to 10W mode with sudo nvpmodel -m 0)

Also, these trt_pose models are integrated with jetson-inference so you could try that as well: https://github.com/dusty-nv/jetson-inference/blob/master/docs/posenet.md
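
For reference, the Python interface there looks roughly like this (a rough sketch adapted from the posenet.py example in that repo - check posenet.md for the exact arguments and camera/stream options):

from jetson_inference import poseNet
from jetson_utils import videoSource, videoOutput

# Pre-trained body-pose network (downloaded automatically on first use)
net = poseNet("resnet18-body")

camera = videoSource("/dev/video0")    # V4L2 camera, file, or RTSP stream
display = videoOutput("display://0")   # render to the attached display

while display.IsStreaming():
    img = camera.Capture()
    poses = net.Process(img, overlay="links,keypoints")
    display.Render(img)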

Hello,

thank you. I will definitely check out the jetson-inference library. It is a pity that I struggled with this one at all, since it seems obsolete.

Anyways, the output of sudo nvpmodel -q gives me:

NVPM WARN: fan mode is not set!
NV Power Mode: MAXN
0

I assume I am already over 9000 :P

Haha okay, yes you are already in maximum power mode.

GitHub thread for cross-reference:

Thank you! Answered on GitHub.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.