cudaErrorInsufficientDriver: CUDA driver version is insufficient for CUDA runtime version in docker container

Hi, I am building a container for AGX Orin but see the error below.
You can reproduce it using the test case on GitHub.

Thanks!

1: unknown file: Failure
1: C++ exception with description “std::bad_alloc: cudaErrorInsufficientDriver: CUDA driver version is insufficient for CUDA runtime version” thrown in the test body.
1:
1: [ FAILED ] CUDAfunction.test_cuMath_vec (0 ms)
1: [----------] 1 test from CUDAfunction (0 ms total)
1:
1: [----------] Global test environment tear-down
1: [==========] 1 test from 1 test suite ran. (0 ms total)
1: [ PASSED ] 0 tests.
1: [ FAILED ] 1 test, listed below:
1: [ FAILED ] CUDAfunction.test_cuMath_vec

My environment and Dockerfile:

NVRM version: NVIDIA UNIX Open Kernel Module for aarch64 540.3.0 Release Build (buildbrain@mobile-u64-6367-d8000) Mon May 6 10:21:04 PDT 2024
GCC version: collect2: error: ld returned 1 exit status

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Wed_Apr_17_19:34:47_PDT_2024
Cuda compilation tools, release 12.5, V12.5.40
Build cuda_12.5.r12.5/compiler.34177558_0

FROM nvcr.io/nvidia/l4t-cuda:12.2.12-runtime

# Install nvidia-l4t-core
RUN \
    echo "deb https://repo.download.nvidia.com/jetson/common r36.3 main" >> /etc/apt/sources.list && \
    echo "deb https://repo.download.nvidia.com/jetson/t234 r36.3 main" >> /etc/apt/sources.list && \
    apt-key adv --fetch-key http://repo.download.nvidia.com/jetson/jetson-ota-public.asc && \
    mkdir -p /opt/nvidia/l4t-packages/ && \
    touch /opt/nvidia/l4t-packages/.nv-l4t-disable-boot-fw-update-in-preinstall

RUN apt-get update \
    && echo "Y" | apt-get install -y --no-install-recommends nvidia-l4t-core

ENV UDEV=1

RUN wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/arm64/cuda-keyring_1.1-1_all.deb \
  && dpkg -i cuda-keyring_1.1-1_all.deb \
  && apt-get update \
  && apt-get -y install cuda-toolkit-12-5

# Install necessary dependencies including gcc
RUN apt-get update \
    && apt-get install -y wget gdb build-essential git cmake libzmq3-dev pkg-config curl vim python3 python3-pip docker-compose ninja-build \
    && rm -rf /var/lib/apt/lists/*

# Install jtop 
RUN pip3 install jetson-stats

WORKDIR /

# Install GCC 12 and G++ 12
RUN apt-get update \
    && apt-get install -y software-properties-common \
    && add-apt-repository ppa:ubuntu-toolchain-r/test \
    && apt-get update \
    && apt-get install -y gcc-12 g++-12 \
    && update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 100 \
    && update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-12 100

##### Install necessary packages
COPY ./requirements.txt /
RUN pip3 install -r requirements.txt && rm -rf requirements.txt

# Install Google Test
RUN git clone https://github.com/google/googletest.git \
  && cd googletest \
  && mkdir build \
  && cd build \
  && cmake .. \
  && make -j12 \
  && make -j12 install \
  && cd ../.. \
  && rm -rf googletest

# Add lines to ~/.bashrc
RUN echo 'export PATH=/usr/local/cuda-12.5/bin:$PATH' >> ~/.bashrc \
  && echo 'export LD_LIBRARY_PATH=/usr/local/cuda-12.5/compat:$LD_LIBRARY_PATH' >> ~/.bashrc

Hi,

Could you share the file below so we can check it in our environment?

...
 => ERROR [ 8/12] COPY ./requirements.txt /                                                                                                                                             0.0s
------
 > [ 8/12] COPY ./requirements.txt /:
------
Dockerfile:41
--------------------
  39 |
  40 |     ##### Install necessary packages
  41 | >>> COPY ./requirements.txt /
  42 |     RUN pip3 install -r requirements.txt && rm -rf requirements.txt
  43 |
--------------------
ERROR: failed to solve: failed to compute cache key: failed to calculate checksum of ref 9c8f014e-a210-490c-957f-51b73f9a9929::n6r6totovdwk863t0o154y9ud: "/requirements.txt": not found

Thanks.

Hi @AastaLLL,

This is the file content. Thanks!

attrs==21.4.0
bitarray==2.4.1
crcmod==1.7
cycler==0.11.0
fonttools==4.38.0
kiwisolver==1.4.4
matplotlib==3.5.1
numpy==1.22.3
packaging==23.0
Pillow==9.4.0
plotly==5.18.0
pyparsing==3.0.9
python-dateutil==2.8.2
scipy==1.8.0
six==1.16.0
tenacity==8.2.3

Hi @AastaLLL, is there any update on this?

Hi,

We are still testing this.
We will share more information later.

Thanks.

Hi,

Sorry for the late update.

Please also add the cuda-compat-12-5 installation command to your Dockerfile.
Running a newer CUDA library on an older driver requires the compat library.

The compat library is also listed in the installation command on our CUDA website:

$ …
$ sudo apt-get -y install cuda-toolkit-12-5 cuda-compat-12-5
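
The rule above can be sketched as a small shell check: the compat package is needed when the CUDA version supported by the installed driver is older than the toolkit's runtime version. The function name `version_lt` and the default values 12.2 and 12.5 are illustrative assumptions taken from the base image tag and the toolkit version in this thread; they are not queried from the driver.

```shell
#!/bin/sh
# Return 0 when version $1 is strictly older than version $2.
# Relies on `sort -V` (version sort), as provided by GNU coreutils.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Assumed values: CUDA version supported by the driver, and the runtime version.
DRIVER_CUDA="${DRIVER_CUDA:-12.2}"
RUNTIME_CUDA="${RUNTIME_CUDA:-12.5}"

if version_lt "$DRIVER_CUDA" "$RUNTIME_CUDA"; then
    echo "compat needed: driver supports $DRIVER_CUDA, runtime is $RUNTIME_CUDA"
else
    echo "compat not needed"
fi
```

Version sort is used rather than a plain string comparison so that a version such as 12.10 is ordered after 12.5.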

Thanks.

Hi @AastaLLL,

I added cuda-compat-12-5 in the section marked # Install CUDA Compat 12.5, but I still see the error.

You can reproduce it with the same test code on GitHub.
Thanks!

FROM nvcr.io/nvidia/l4t-cuda:12.2.12-runtime

# Install nvidia-l4t-core
RUN \
    echo "deb https://repo.download.nvidia.com/jetson/common r36.3 main" >> /etc/apt/sources.list && \
    echo "deb https://repo.download.nvidia.com/jetson/t234 r36.3 main" >> /etc/apt/sources.list && \
    apt-key adv --fetch-key http://repo.download.nvidia.com/jetson/jetson-ota-public.asc && \
    mkdir -p /opt/nvidia/l4t-packages/ && \
    touch /opt/nvidia/l4t-packages/.nv-l4t-disable-boot-fw-update-in-preinstall

RUN apt-get update \
    && echo "Y" | apt-get install -y --no-install-recommends nvidia-l4t-core

ENV UDEV=1

# Install CUDA toolkit 12.5
RUN wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/arm64/cuda-keyring_1.1-1_all.deb \
  && dpkg -i cuda-keyring_1.1-1_all.deb \
  && apt-get update \
  && apt-get -y install cuda-toolkit-12-5

# Install CUDA Compat 12.5
RUN apt-get update \
  && apt-get -y install cuda-compat-12-5

# Install necessary dependencies including gcc
RUN apt-get update \
    && apt-get install -y wget gdb build-essential git cmake libzmq3-dev pkg-config curl vim python3 python3-pip docker-compose ninja-build \
    && rm -rf /var/lib/apt/lists/*

# Install jtop 
RUN pip3 install jetson-stats

WORKDIR /

# Install GCC 12 and G++ 12
RUN apt-get update \
    && apt-get install -y software-properties-common \
    && add-apt-repository ppa:ubuntu-toolchain-r/test \
    && apt-get update \
    && apt-get install -y gcc-12 g++-12 \
    && update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 100 \
    && update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-12 100

##### Install necessary packages
COPY ./requirements.txt /
RUN pip3 install -r requirements.txt && rm -rf requirements.txt

# Pull FFTW tarball
RUN wget http://www.fftw.org/fftw-3.3.10.tar.gz \
  && tar -xvzf fftw-3.3.10.tar.gz \
  && rm -rf fftw-3.3.10.tar.gz \
  && cd fftw-3.3.10 \
  && ./configure \
  && make -j12 \
  && make install

# Pull cppzmq repo
RUN git clone https://github.com/zeromq/cppzmq.git \
  && cd cppzmq \
  && mkdir build \
  && cd build \
  && cmake -DENABLE_DRAFTS=off .. \
  && make -j12 install

# Install Google Test
RUN git clone https://github.com/google/googletest.git \
  && cd googletest \
  && mkdir build \
  && cd build \
  && cmake .. \
  && make -j12 \
  && make -j12 install \
  && cd ../.. \
  && rm -rf googletest

# Add lines to ~/.bashrc
RUN echo 'export PATH=/usr/local/cuda-12.5/bin:$PATH' >> ~/.bashrc \
  && echo 'export LD_LIBRARY_PATH=/usr/local/cuda-12.5/compat:$LD_LIBRARY_PATH' >> ~/.bashrc
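
Because the Dockerfile above exports LD_LIBRARY_PATH through ~/.bashrc, which bash reads only for interactive shells, it may be worth confirming inside the running container that the compat directory exists and is actually on the loader path. The following is a minimal sketch under stated assumptions: the function names `path_contains` and `check_compat` and the variable `COMPAT_DIR` are hypothetical, and the default directory is simply the path exported in the Dockerfile, assumed to be where cuda-compat-12-5 places libcuda.so.

```shell
#!/bin/sh
# Return 0 if directory $1 is an entry of the colon-separated list $2.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print a one-line status for compat directory $1 against search path $2.
check_compat() {
    if ! ls "$1"/libcuda.so* >/dev/null 2>&1; then
        echo "missing: no libcuda.so* in $1"
    elif ! path_contains "$1" "$2"; then
        echo "not-on-path: $1 is not in LD_LIBRARY_PATH"
    else
        echo "ok: $1"
    fi
}

# Assumed location: the compat path exported by the Dockerfile above.
check_compat "${COMPAT_DIR:-/usr/local/cuda-12.5/compat}" "${LD_LIBRARY_PATH:-}"
```

Running the script from the same kind of shell that launches the tests (interactive or not) shows whether that shell sees the compat path.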

Hi,

Thanks for the update.

We will test this issue in our environment.
Have you checked whether the sample works correctly with CUDA 12.2?

Thanks.

Yes, the test case works well in the official image, which uses CUDA 12.2:

nvcr.io/nvidia/l4t-cuda:12.2.12-runtime

So I would like to know whether there is an official image with CUDA 12.5 already installed that I can download.

Thanks!

Hi,

We tested the Dockerfile you shared on June 11.
It works without any issues.

Please double-check it on your side.

$ sudo docker build . -t tmp
$ sudo docker run -it --rm --runtime nvidia --network host tmp
# git clone https://github.com/weimin023/testcuda.git
# cd testcuda/
# cmake . && make
-- The C compiler identification is GNU 12.3.0
-- The CXX compiler identification is GNU 12.3.0
-- The CUDA compiler identification is NVIDIA 12.5.40
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Found CUDA: /usr/local/cuda-12.5 (found suitable version "12.5", minimum required is "12.3")
-- Found GTest: /usr/local/lib/cmake/GTest/GTestConfig.cmake (found version "1.14.0")
-- Configuring done
-- Generating done
-- Build files have been written to: /testcuda
[ 50%] Building CUDA object CMakeFiles/unittest.dir/unittest.cu.o
[100%] Linking CUDA executable unittest
[100%] Built target unittest
# ./unittest
[==========] Running 2 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 1 test from CUDAfunction
[ RUN      ] CUDAfunction.test_cuMath_vec
[       OK ] CUDAfunction.test_cuMath_vec (278 ms)
[----------] 1 test from CUDAfunction (278 ms total)

[----------] 1 test from opencv
[ RUN      ] opencv.open
[       OK ] opencv.open (0 ms)
[----------] 1 test from opencv (0 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 2 test suites ran. (278 ms total)
[  PASSED  ] 2 tests.

Thanks.
