Nvidia Runtime Components Not Detected in Docker Container on Jetson Orin NX (JetPack 6.2)

Issue:
When running a TensorRT test container, I get a CUDA driver error that prevents GPU functionality. I see this issue with other samples as well, and I’d like to know if I’m missing any steps or configurations.


Command and Error Output

Run Command:

sudo docker run -it --rm --net=host --runtime nvidia -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix nvcr.io/nvidia/l4t-tensorrt:r10.3.0-devel
/usr/src/tensorrt/bin/trtexec --onnx=/usr/src/tensorrt/data/mnist/mnist.onnx

Output:

==========  
== CUDA ==  
==========  

CUDA Version 12.6.11

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
...
WARNING: The NVIDIA Driver was not detected.  GPU functionality will not be available.

... (TensorRT initialization logs) ...

[02/25/2025-00:18:29] [I] === Device Information ===
Cuda failure: CUDA driver version is insufficient for CUDA runtime version

Environment Details

Jetson System Info:

sudo jetson_release -v
Software part of jetson-stats 4.3.1 - (c) 2024, Raffaello Bonghi
Model: NVIDIA Jetson Orin NX Engineering Reference Developer Kit - Jetpack 6.2 [L4T 36.4.3]
NV Power Mode[3]: 25W
...
Platform:
 - Machine: aarch64
 - System: Linux
 - Distribution: Ubuntu 22.04 Jammy Jellyfish
 - Release: 5.15.148-tegra
...
Libraries:
 - CUDA: 12.6.68
 - cuDNN: 9.3.0.75
 - TensorRT: 10.3.0.30
 - VPI: 3.2.4
 - Vulkan: 1.3.204
 - OpenCV: 4.5.4 - with CUDA: NO

Docker Version:

sudo docker --version
Docker version 27.5.1, build 9f9e405

Docker Info:

sudo docker info
Client: Docker Engine - Community
 Version:    27.5.1
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.20.0
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.32.4
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 6
 Server Version: 27.5.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 nvidia runc
 Default Runtime: nvidia
 Init Binary: docker-init
 containerd version: bcc810d6b9066471b0b6fa75f557a15a1cbf31bb
 runc version: v1.2.4-0-g6c52b3f
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.15.148-tegra
 Operating System: Ubuntu 22.04.5 LTS
 OSType: linux
 Architecture: aarch64
 CPUs: 8
 Total Memory: 15.29GiB
 Name: tegra
 ID: 51e2f920-f0e6-444c-9d28-e68d18ad6e36
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Docker Daemon Configuration:

cat /etc/docker/daemon.json
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "args": [],
      "path": "nvidia-container-runtime"
    }
  }
}
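
For reference, a quick sanity check that the runtime binary referenced above actually resolves on the host:

$ which nvidia-container-runtime
$ nvidia-container-runtime --version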

NVIDIA Container Toolkit:

dpkg -l | grep nvidia-container-toolkit
ii  nvidia-container-toolkit               1.16.2-1   arm64   NVIDIA Container toolkit
ii  nvidia-container-toolkit-base          1.16.2-1   arm64   NVIDIA Container Toolkit Base
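
If it's relevant: my understanding is that on Jetson the toolkit injects the host driver components in CSV mode, which can be checked with something like the following (paths assumed from the default toolkit install):

$ grep mode /etc/nvidia-container-runtime/config.toml
$ ls /etc/nvidia-container-runtime/host-files-for-container.d/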

Summary of Steps Taken

  • Docker Downgrade:
    Used the jetsonhacks/install-docker script to downgrade Docker to v27.5.1.

  • NVIDIA Container Toolkit:
    Version 1.16.2-1 is installed, and containers are launched with the runtime flags (--runtime=nvidia --gpus all).

  • TensorRT Container:
    Running the container: nvcr.io/nvidia/l4t-tensorrt:r10.3.0-devel.

  • Error Encountered:
    The container logs warn that the NVIDIA driver was not detected and TensorRT fails with a CUDA driver version error.


Questions & Request for Help

  1. Driver Compatibility:
    My host shows NVIDIA driver version 540.4.0 (via nvidia-smi and /proc/driver/nvidia/version), which should support CUDA 12.6. Is there a known compatibility issue with JetPack 6.2 that could cause the container to not detect the driver?

  2. Container Toolkit Configuration:
    Are there additional configurations or troubleshooting steps to ensure that the NVIDIA Container Toolkit properly binds the host driver into the container?

  3. Additional Diagnostics:
    What further logs or tests should I check to diagnose why the container cannot access the NVIDIA driver even though the host driver appears correct?
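
On that last point, I'm happy to enable verbose logging. My understanding from the comments in the default config.toml is that uncommenting its debug entries should produce runtime logs, roughly:

$ sudo nano /etc/nvidia-container-runtime/config.toml   # uncomment the 'debug = ...' lines
$ sudo systemctl restart docker
$ cat /var/log/nvidia-container-runtime.log             # log path assumed from the config comments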

I want to run CUDA, TensorRT, VPI, etc., inside containers without GPU functionality issues. Any suggestions or additional steps I might be missing would be greatly appreciated.

Thanks in advance for your help!

Hi,

Please try the commands below and see if they help.
These are the commands included in NV_L4T_DOCKER_TARGET_POST_INSTALL_COMP.sh:

$ sudo nvidia-ctk runtime configure --runtime=docker
$ sudo systemctl restart docker
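
If those settings are already in place, you can also confirm what the daemon reports, for example:

$ sudo docker info | grep -i runtime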

Thanks.

I believe those commands just write the “/etc/docker/daemon.json” file, which I included above. It was already set up that way, so I didn’t see any change in behavior.

Hi,

Thanks for the feedback.

Could you try running a CUDA sample outside of the container?
This will help us verify whether the issue is GPU-driver or Docker related.

$ git clone https://github.com/NVIDIA/cuda-samples.git
$ cd cuda-samples/
$ git checkout -b 12.5
$ cd Samples/0_Introduction/vectorAdd
$ make
$ ./vectorAdd 

Thanks.

I was able to build and run the example. Note that I had to modify the CMake file; otherwise I got the error below.

nvcc fatal : Unsupported gpu architecture 'compute_100'

I changed

set(CMAKE_CUDA_ARCHITECTURES 50 52 60 61 70 72 75 80 86 87 89 90 100 101 120)

to

set(CMAKE_CUDA_ARCHITECTURES 50 52 60 61 70 72 75 80 86)
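
Since the Orin NX iGPU is compute capability 8.7 (sm_87), I assume trimming the list down to just that entry would also have worked:

set(CMAKE_CUDA_ARCHITECTURES 87)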

Build

~/cuda-samples/Samples/0_Introduction/vectorAdd$ mkdir build; cd build; cmake ..; make;
-- The C compiler identification is GNU 11.4.0
-- The CXX compiler identification is GNU 11.4.0
-- The CUDA compiler identification is NVIDIA 12.6.68
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found CUDAToolkit: /usr/local/cuda/include (found version "12.6.68")
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Configuring done
-- Generating done
-- Build files have been written to: /home/sysop/cuda-samples/Samples/0_Introduction/vectorAdd/build
[ 33%] Building CUDA object CMakeFiles/vectorAdd.dir/vectorAdd.cu.o

[ 66%] Linking CUDA device code CMakeFiles/vectorAdd.dir/cmake_device_link.o
[100%] Linking CUDA executable vectorAdd
[100%] Built target vectorAdd

Output:

/cuda-samples/Samples/0_Introduction/vectorAdd/build$ ./vectorAdd
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done

Hi,

It looks like CUDA and GPU drivers work natively.
Are you able to run nvidia-smi inside the container?

Thanks.

Running outside the container:

:~$ nvidia-smi
Thu Feb 27 19:58:24 2025
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 540.4.0                Driver Version: 540.4.0      CUDA Version: 12.6     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Orin (nvgpu)                  N/A  | N/A              N/A |                  N/A |
| N/A   N/A  N/A               N/A /  N/A | Not Supported        |     N/A          N/A |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

Running inside the container:

sudo docker run -it --rm --net=host --runtime nvidia -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix nvcr.io/nvidia/l4t-tensorrt:r10.3.0-devel

root@tegra:/# nvidia-smi
bash: nvidia-smi: command not found
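
In case it's relevant: my understanding is that on Jetson, binaries like nvidia-smi are mounted into the container from the host via the toolkit's CSV files, so "command not found" suggests those mounts aren't happening at all. One way to check whether nvidia-smi is even listed (path assumed from the default install):

$ grep -r nvidia-smi /etc/nvidia-container-runtime/host-files-for-container.d/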

Hi,

We tested the container with Docker 28.0.1 on JetPack 6.2 + Orin NX, and it works correctly.
Would you mind reflashing the system and trying again?
Now that the Docker issue has been fixed, SDK Manager can set up the device without errors.

$ sudo docker run -t --rm --net=host --runtime nvidia -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix nvcr.io/nvidia/l4t-tensorrt:r10.3.0-devel nvidia-smi

==========
== CUDA ==
==========

CUDA Version 12.6.11

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

Mon Mar  3 06:22:08 2025       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 540.4.0                Driver Version: 540.4.0      CUDA Version: 12.6     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Orin (nvgpu)                  N/A  | N/A              N/A |                  N/A |
| N/A   N/A  N/A               N/A /  N/A | Not Supported        |     N/A          N/A |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

Thanks.