NVIDIA runtime fails on JetPack 6 GA

I am using Auvidea’s JNX42 flashed with JetPack 6.0 GA and am running into this issue when trying to launch Docker with the NVIDIA runtime:

ubuntu@ubuntu:~$ docker run -it --privileged --runtime nvidia nvcr.io/nvidia/l4t-jetpack:r36.3.0
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #1: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
NvRmMemInitNvmap failed with Permission denied
356: Memory Manager Not supported



****NvRmMemMgrInit failed**** error type: 196626


libnvrm_gpu.so: NvRmGpuLibOpen failed, error=196626
NvRmMemInitNvmap failed with Permission denied
356: Memory Manager Not supported



****NvRmMemMgrInit failed**** error type: 196626


libnvrm_gpu.so: NvRmGpuLibOpen failed, error=196626
NvRmMemInitNvmap failed with Permission denied
356: Memory Manager Not supported



****NvRmMemMgrInit failed**** error type: 196626


libnvrm_gpu.so: NvRmGpuLibOpen failed, error=196626
nvidia-container-cli: initialization error: nvml error: driver not loaded: unknown.

I can confirm that /etc/nvidia-container-runtime/host-files-for-container.d/drivers.csv has the correct file paths, and the compiled libraries it lists exist on the device too.
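
For reference, here is a quick loop that flags any CSV entry whose path is missing on the host (a rough sketch; it assumes the usual "type, /path" field layout of the l4t CSV files):

$ awk -F', ' 'NF >= 2 {print $2}' /etc/nvidia-container-runtime/host-files-for-container.d/drivers.csv \
    | while read -r f; do [ -e "$f" ] || echo "missing: $f"; done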

Hi,

Do you need rootless mode?
Have you tried launching Docker with sudo to see if it works?
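
For reference, running Docker without sudo normally requires membership in the docker group (a generic sketch, unrelated to the GPU hook itself):

$ sudo usermod -aG docker $USER
$ newgrp docker   # or log out and back in
$ docker run --rm hello-world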

Thanks.

Even without privileged mode, the problem persists. Yes, I have also tried launching Docker with sudo and the problem still persists. Given that all the libraries exist, what should I do?

Hi,

We tested the command and it works well with sudo:

$ sudo docker run -it --privileged --runtime nvidia nvcr.io/nvidia/l4t-jetpack:r36.3.0
root@808b75f8eaf2:/# nvidia-smi
Mon Jun  3 07:47:30 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 540.3.0                Driver Version: N/A          CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Orin (nvgpu)                  N/A  | N/A              N/A |                  N/A |
| N/A   N/A  N/A               N/A /  N/A | Not Supported        |     N/A          N/A |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

Thanks.

Hi, I have the same problem as @kumarakshay. I’m on an AGX Orin Devkit 32GB with R36.3.0, and see the same error output that was posted in the original question.

I tried the suggested command and it did not solve the problem.

$ sudo docker run -it --privileged --runtime nvidia nvcr.io/nvidia/l4t-jetpack:r36.3.0
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #1: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
NvRmMemInitNvmap failed with Permission denied
356: Memory Manager Not supported



****NvRmMemMgrInit failed**** error type: 196626


libnvrm_gpu.so: NvRmGpuLibOpen failed, error=196626
NvRmMemInitNvmap failed with Permission denied
356: Memory Manager Not supported



****NvRmMemMgrInit failed**** error type: 196626


libnvrm_gpu.so: NvRmGpuLibOpen failed, error=196626
NvRmMemInitNvmap failed with Permission denied
356: Memory Manager Not supported



****NvRmMemMgrInit failed**** error type: 196626


libnvrm_gpu.so: NvRmGpuLibOpen failed, error=196626
nvidia-container-cli: detection error: nvml error: unknown error: unknown.
ERRO[0000] error waiting for container:  

Here is some additional information from the system where the problem occurs. This system was set up by SDK Manager.

$ groups
<username snipped> adm cdrom sudo audio dip video plugdev render i2c lpadmin sambashare gdm docker weston-launch gpio
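
Given the "Permission denied" lines from NvRmMemInitNvmap, the GPU device nodes may also be worth checking (a sketch; the node names are my assumption for Orin on r36 and can vary by platform):

$ ls -l /dev/nvmap
$ ls -lR /dev/nvgpu 2>/dev/null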


$ cat /etc/nv_tegra_release 
# R36 (release), REVISION: 3.0, GCID: 36191598, BOARD: generic, EABI: aarch64, DATE: Mon May  6 17:34:21 UTC 2024
# KERNEL_VARIANT: oot
TARGET_USERSPACE_LIB_DIR=nvidia
TARGET_USERSPACE_LIB_DIR_PATH=usr/lib/aarch64-linux-gnu/nvidia
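
Note the KERNEL_VARIANT: oot line above: on r36 the GPU driver is built as an out-of-tree module, and nvidia-container-cli reports "driver not loaded", so it may be worth confirming the module is actually loaded (a sketch; nvgpu as the module name is an assumption):

$ lsmod | grep nvgpu
$ sudo dmesg | grep -i nvgpu | tail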


$ apt list --installed |grep nvidia-container
libnvidia-container-tools/stable,now 1.14.2-1 arm64 [installed,automatic]
libnvidia-container1/stable,now 1.14.2-1 arm64 [installed,automatic]
nvidia-container-toolkit-base/stable,now 1.14.2-1 arm64 [installed,automatic]
nvidia-container-toolkit/stable,now 1.14.2-1 arm64 [installed,automatic]
nvidia-container/stable,now 6.0+b106 arm64 [installed]


$ apt list --installed |grep docker
docker.io/jammy-updates,now 24.0.7-0ubuntu2~22.04.1 arm64 [installed]
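
For completeness, the runtime registration can also be regenerated and the daemon restarted (a generic sketch using the nvidia-ctk tool from nvidia-container-toolkit-base; not a confirmed fix for this error):

$ sudo nvidia-ctk runtime configure --runtime=docker
$ sudo systemctl restart docker
$ docker info | grep -i runtimes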

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.