Setup info
• Hardware Platform (Jetson / GPU) Jetson AGX Orin
• DeepStream Version 6.2 (inside Docker)
• JetPack Version (valid for Jetson only) 5.1.1 (L4T 35.3.1)
Problem
I am trying to build a Docker image with OpenCV compiled with CUDA and GStreamer support, but the build fails: during compilation I get the following message:
CMake Warning at cmake/OpenCVFindLibsPerf.cmake:45 (message):
OpenCV is not able to find/configure CUDA SDK (required by WITH_CUDA).
CUDA support will be disabled in OpenCV build.
To eliminate this warning remove WITH_CUDA=ON CMake configuration option.
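As far as I understand, this warning comes from OpenCV's CUDA detection, which needs the full CUDA toolkit (nvcc and headers), not just the runtime libraries. One hedged workaround I am considering is to re-run the configure step with the toolkit location passed explicitly; the sketch below is untested, and the source/build paths are my assumption about the install script's workspace layout:
```sh
# Re-run CMake with the toolkit root given explicitly. CUDA_TOOLKIT_ROOT_DIR is
# the variable used by OpenCV 4.x's FindCUDA; /usr/local/cuda-11.4 is the
# toolkit path on JetPack 5.1.1.
cd ~/workspace/opencv-4.6.0/release
cmake -D WITH_CUDA=ON \
      -D CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-11.4 \
      .. 2>&1 | grep -i cuda
```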
Setup and reproduction
1. Create Dockerfile:
FROM nvcr.io/nvidia/deepstream-l4t:6.2-samples
# Useful tools, apps and libraries
RUN apt-get update && apt-get install -y \
g++ gcc git automake \
ffmpeg wget sudo htop x11-apps nano xonsh \
libopenmpt-dev python3-pip python3-gi
RUN useradd -ms /bin/bash -G sudo,audio,video,render myuser && echo "myuser ALL=(ALL) NOPASSWD: ALL" > /etc/sudoers.d/ubuntu
ENV TERM=xterm-256color
USER myuser
WORKDIR /home/myuser
RUN wget https://forums.developer.nvidia.com/uploads/short-url/3kLERQgB4ZR0q0wgUdO9qY6lxBq.sh -O install_opencv4.6.0_Jetson.sh
RUN sudo chmod +x install_opencv4.6.0_Jetson.sh
# To check whether CUDA libs are available
RUN ls -1 /usr/local/ | grep cuda-
RUN echo yes | ./install_opencv4.6.0_Jetson.sh
RUN python3 -c "import cv2; print(cv2.getBuildInformation())" | grep CUDA
It uses the OpenCV install script provided here: Compiling OpenCV on Jetpack 5 - #5 by AastaLLL
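One thing I may add before the install script runs (a sketch only, not verified to change anything): put the CUDA toolkit directories on the default search paths, in case OpenCV's CMake simply cannot locate nvcc:
```dockerfile
# Assumption: the NVIDIA runtime exposes the toolkit under /usr/local/cuda-11.4
# during the build. Add it to the search paths before building OpenCV.
ENV PATH=/usr/local/cuda-11.4/bin:${PATH}
ENV LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64:${LD_LIBRARY_PATH}
```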
2. Edit /etc/docker/daemon.json
to make sure the Docker builder uses the NVIDIA runtime:
{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}
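After editing the file I also verify on the host that the setting took effect (standard Docker commands, shown here just for completeness):
```sh
# Reload the daemon configuration and confirm the default runtime.
sudo systemctl restart docker
docker info | grep -i 'default runtime'   # expected: Default Runtime: nvidia
```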
This step is based on:
• a Developer Forums thread about compiling CUDA-enabled software (Ceres-Solver 2.1.0) during a Docker build on a Jetson TX2 flashed with JetPack 4.6.2, and
• a GitHub issue (opened 23 Sep 2021) reporting that the NVIDIA runtime is not applied during BuildKit builds, so /dev/nvidia* and nvidia-smi are unavailable with DOCKER_BUILDKIT=1, even when using docker buildx with RUN --security=insecure.
3. Reboot machine
4. Build the Docker image with:
DOCKER_BUILDKIT=0 docker build .
5. Observe the output
CUDA libraries check
DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
BuildKit is currently disabled; enable it by removing the DOCKER_BUILDKIT=0
environment-variable.
Sending build context to Docker daemon 54.73MB
////Not important lines
---> bb4a6c3df745
Step 10/12 : RUN ls /usr/local/cuda-11.4/lib64
---> Running in 676c9518f1ea
libcublas.so libnppidei.so
libcublas.so.11 libnppidei.so.11
libcublas.so.11.6.6.84 libnppidei.so.11.4.0.287
libcublasLt.so.11 libnppif.so
libcublasLt.so.11.6.6.84 libnppif.so.11
libcudart.so libnppif.so.11.4.0.287
////CUDA libs are available
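The runtime libraries are clearly present, but as far as I understand OpenCV's CMake also needs the toolkit itself (nvcc plus the CUDA headers). A check I could add to the Dockerfile to confirm that (paths assume the JetPack 5 / CUDA 11.4 layout):
```dockerfile
# Check for the CUDA *toolkit* (compiler and headers), not only the runtime
# libraries listed above; the build continues either way.
RUN ls -l /usr/local/cuda-11.4/bin/nvcc || echo "nvcc is missing"
RUN ls -l /usr/local/cuda-11.4/include/cuda_runtime.h || echo "CUDA headers are missing"
```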
Compilation check
------------------------------------
** Build opencv 4.6.0 (3/4)
------------------------------------
-- The CXX compiler identification is GNU 9.4.0
-- The C compiler identification is GNU 9.4.0
////Not important lines
-- Performing Test HAVE_CXX_WNO_CLASS_MEMACCESS - Success
CMake Warning at cmake/OpenCVFindLibsPerf.cmake:45 (message):
OpenCV is not able to find/configure CUDA SDK (required by WITH_CUDA).
CUDA support will be disabled in OpenCV build.
To eliminate this warning remove WITH_CUDA=ON CMake configuration option.
Call Stack (most recent call first):
CMakeLists.txt:733 (include)
-- Could not find OpenBLAS include. Turning OpenBLAS_FOUND off
-- Could not find OpenBLAS lib. Turning OpenBLAS_FOUND off
////Not important lines
-- General configuration for OpenCV 4.6.0 =====================================
-- Version control: unknown
--
-- Extra modules:
-- Location (extra): /home/myuser/workspace/opencv_contrib-4.6.0/modules
-- Version control (extra): unknown
--
-- Platform:
-- Timestamp: 2024-01-10T15:56:47Z
-- Host: Linux 5.10.104-tegra aarch64
-- CMake: 3.16.3
-- CMake generator: Unix Makefiles
-- CMake build tool: /usr/bin/make
-- Configuration: RELEASE
....................
-- GStreamer: YES (1.16.3)
-- v4l/v4l2: YES (linux/videodev2.h)
--
-- Parallel framework: pthreads
--
-- Trace: YES (with Intel ITT)
--
-- Other third-party libraries:
-- Lapack: NO
-- Eigen: NO
-- Custom HAL: YES (carotene (ver 0.0.1))
-- Protobuf: build (3.19.1)
--
-- NVIDIA CUDA: NO
--
-- cuDNN: NO
--
-- OpenCL: YES (no extra features)
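The summary confirms that GStreamer was picked up but CUDA was not. To see what the configure step actually recorded for CUDA, I could append two diagnostic RUN lines right after the install step (assuming the script leaves its build tree in place; the directory below is my guess at the workspace layout):
```dockerfile
# Dump the CUDA-related cache entries and the tail of CMake's error log from
# the OpenCV build directory (path is an assumption).
RUN grep -i "cuda" /home/myuser/workspace/opencv-4.6.0/release/CMakeCache.txt || true
RUN tail -n 40 /home/myuser/workspace/opencv-4.6.0/release/CMakeFiles/CMakeError.log || true
```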
The last build step (and the failure):
** Install opencv 4.6.0 successfully
** Bye :)
Removing intermediate container 0b34e1f472f2
---> 5ea8722e2133
Step 12/12 : RUN python3 -c "import cv2; print(cv2.getBuildInformation())" | grep CUDA
---> Running in 49b928daaae4
Traceback (most recent call last):
File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'cv2'
The command '/bin/sh -c python3 -c "import cv2; print(cv2.getBuildInformation())" | grep CUDA' returned a non-zero code: 1
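So the script reports a successful install, yet cv2 cannot be imported in the next layer. As a final diagnostic (my own addition, not part of the reproduction above) I could check where, if anywhere, the Python bindings were installed:
```dockerfile
# Diagnostics only: locate the compiled cv2 extension and print python3's
# module search path.
RUN find /usr/lib /usr/local /home/myuser -name "cv2*.so" 2>/dev/null || true
RUN python3 -c "import sys; print(sys.path)"
```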