OCI Runtime error from starting GPU containers on Nvidia Jetson Xavier NX using Podman

Hello

I have to use Podman on the Nvidia Jetson Xavier NX, and I am trying to pass through the onboard GPU via Nvidia Container Runtime and Hook. However, I cannot start any containers, including Ubuntu arm64 and L4T images.

The run command and resulting error is shown below:

$ podman run -it --security-opt=label=disable --hooks-dir=/usr/share/containers/oci/hooks.d/ nvcr.io/nvidia/l4t-tensorrt:r8.5.2-runtime

Error: OCI runtime error: error executing hook `/usr/bin/nvidia-container-runtime-hook` (exit code: 1)

Sorry, the error does not provide any more detail.

I have installed all packages as shown below:

$ sudo dpkg --get-selections | grep nvidia

libnvidia-container-tools			install
libnvidia-container0:arm64			install
libnvidia-container1:arm64			install
nvidia-container-runtime			install
nvidia-container-toolkit			install
nvidia-docker2					install

Installing nvidia-container-toolkit and nvidia-container-runtime did not install nvidia-container-runtime-hook, as the list above shows. Has this package been archived, or is it meant to be installed? I manually created the oci-nvidia-hook.json file within the /oci/hooks.d/ directory.
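
Since the hook file was created by hand, one thing worth ruling out is that its "path" entry points at a binary that does not exist on this system. The following is only a rough sketch: `check_hook` is a hypothetical helper, it assumes the "path" key and its value sit on one line of the JSON, and the example file location is assumed from the --hooks-dir in the run command above.

```shell
#!/bin/sh
# check_hook is a hypothetical helper, not part of Podman or the NVIDIA packages.
# It reports whether the binary named by a hook file's "path" entry exists and is executable.
# Assumption: the "path" key and its value are on the same line of the JSON file.
check_hook() {
    hook_json=$1
    if [ ! -r "$hook_json" ]; then
        echo "missing hook file: $hook_json"
        return 1
    fi
    # Pull out the first "path" value without depending on jq.
    hook_path=$(sed -n 's/.*"path"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$hook_json" | head -n 1)
    if [ -z "$hook_path" ]; then
        echo "no path entry in $hook_json"
        return 1
    fi
    if [ -x "$hook_path" ]; then
        echo "ok: $hook_path"
    else
        echo "not executable or missing: $hook_path"
        return 1
    fi
}

# Example call (location assumed from the --hooks-dir used in the run command):
# check_hook /usr/share/containers/oci/hooks.d/oci-nvidia-hook.json
```

If this reports a missing binary, the exit code 1 from the hook may simply mean the file names a path that the installed packages do not provide.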

I have used Podman on a CentOS 8 Stream machine and managed to pass the GPU through into numerous containers. I also do not know how to create the nvidia runtime in the Podman configuration file (containers.conf); I have searched and found no implementation in Podman. I am not sure it is needed for the hook to work, as there was no nvidia runtime in containers.conf on the CentOS 8 machine.

Any ideas on how to resolve this issue?

Cheers,
Brayth

Hi,

We would like to try this ourselves to get more information about the issue, but there is no podman package available on apt.

Are you building it from the source with the instructions below?
https://podman.io/getting-started/installation#building-from-scratch

Thanks.

I did not build it from source. I installed Podman by following this Ubuntu 20.04 guide: How to Install and Use Podman on Ubuntu 20.04 | Atlantic.Net

I am using JetPack 5.1 (Ubuntu 20.04), and to my knowledge the apt install for Podman is only supported on 22.04. However, the guide successfully installs Podman on the Nvidia Jetson Xavier NX.

I have managed to fix the OCI runtime error and pass the Xavier GPU through into the Podman container. However, I now get a permission error when using the GPU if I start the container as a non-root user. When executing the deviceQuery sample I get the following error:

NvRmMemInitNvmap failed with Permission denied
549: Memory Manager Not supported
****NvRmMemInit failed**** error type: 196626
*** NvRmMemInit failed NvRmMemConstructor

I do not get permission denied if I run the Podman container as root, but I need to run the container as a user with non-root privileges. I have tried adding $USER to the video and i2c groups, but it did not solve the issue.

Are there files on the host linked to the GPU whose permissions I can change?
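
As a starting point for that question, the owner, group and mode of the GPU device nodes can be listed on the host and compared with the groups of the non-root user. This is a minimal sketch: `show_perms` is a hypothetical helper, and the node names /dev/nvmap and /dev/nvhost-* are assumptions about which device files matter on this board, not something confirmed in this thread.

```shell
#!/bin/sh
# show_perms is a hypothetical helper: it prints owner, group and mode for each
# path it is given, and notes any path that does not exist.
show_perms() {
    for node in "$@"; do
        if [ -e "$node" ]; then
            ls -ld "$node"
        else
            echo "absent: $node"
        fi
    done
}

# Example calls (device node names are assumptions; adjust to what exists on the host):
# show_perms /dev/nvmap /dev/nvhost-*
# id    # groups of the current non-root user, for comparison
```

Running the same two commands inside the rootless container would show whether the group membership seen on the host is still present there.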

Hi,

Which JetPack do you use?
There was a similar issue in JetPack 5.0DP, but it has already been fixed.

Thanks.

I am running JetPack 5.1.

I have tried this fix with podman but it did not work. Any other ideas?

Hi,

Sorry for the late update.

After adding the $USER to video group, have you run the restart command like below?

Thanks.

Yes, I did. It still didn’t work.

Hi,

Could you check whether the non-root user can work with Docker?
Thanks.

Hi,
My issue is that I am trying to run the Podman container as a non-root user on the host, not as a non-root user within the container, e.g. “podman run -it …”, not “sudo podman run -it …”.

I have run Podman as root using sudo, and it does have permission to use the GPU within the container, which is equivalent to Docker (I have tried Docker and it also worked). However, I am required to use Podman as a non-root user on the host.

Hi,

Thanks for the reply.
We would need to check with our internal team about this.

Just to confirm:
Does the error show up when the container is launched, or does it appear after running a CUDA app?
Thanks.

The error appears when I run a CUDA application within the container.

Hi,

Thanks.
Could you also check whether this command helps?

Thanks.

I have tried this and it did not work.

Hi,

Sorry for the late update.

It looks like there are some extra settings required by Podman.
Have you checked the tutorial below?

Thanks.
