Build a Docker image and compile CUDA-dependent software on a TX2

Hello!

I am writing because I am trying to do the following and cannot make it work.

First, I have a Jetson TX2 board flashed with JetPack 4.6.2. We want to run a Docker container with software that uses CUDA.

I have everything installed and running on the board, and the Docker image I want to build downloads and compiles the source code. The code I want to compile during the build is Ceres-Solver (2.1.0) with CUDA enabled, plus some software of my own. As the base image I am using nvcr.io/nvidia/l4t-ml:r32.7.1-py3.
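
For context, the relevant build step boils down to roughly the following (a simplified sketch, not my exact Dockerfile; USE_CUDA is the Ceres 2.1 CMake option that enables its CUDA-based solvers):

# simplified sketch of the Ceres build step that runs inside "docker build"
git clone --branch 2.1.0 https://github.com/ceres-solver/ceres-solver.git
mkdir -p ceres-solver/build && cd ceres-solver/build
cmake .. -DUSE_CUDA=ON
make -j$(nproc) && make install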

However, I cannot make it build. The error happens when building Ceres-Solver: CMake finds the CUDA installation on the system, but it cannot find cuBLAS, cuSOLVER, or cuSPARSE:

#38 21.81 CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
#38 21.81 Please set them or make sure they are set and tested correctly in the CMake files:
#38 21.81 CUDA_cublas_LIBRARY (ADVANCED)
#38 21.81     linked by target "ceres" in directory /dolomiti/cartographer/scripts/ceres-solver/internal/ceres
#38 21.81 CUDA_cusolver_LIBRARY (ADVANCED)
#38 21.81     linked by target "ceres" in directory /dolomiti/cartographer/scripts/ceres-solver/internal/ceres
#38 21.81 CUDA_cusparse_LIBRARY (ADVANCED)
#38 21.81     linked by target "ceres" in directory /dolomiti/cartographer/scripts/ceres-solver/internal/ceres

I found the thread “CUDA driver version is insufficient for CUDA runtime version inside docker on Jetson TX2”, so I tried running the base image and compiling everything from inside the container. That works wonderfully…
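
Concretely, the manual check looked roughly like this (a sketch, not the exact commands I used):

# start the base image interactively with the NVIDIA runtime
docker run -it --rm --runtime=nvidia nvcr.io/nvidia/l4t-ml:r32.7.1-py3 /bin/bash
# inside the container, the same cmake/make steps for Ceres finish without errors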

In case it is relevant: I am running everything, including the docker build, on the TX2 itself.

Does anyone know how to build the image so that the software can be compiled against those libraries at build time?

I hope someone can give me a hint on how to solve this issue.

Thanks in advance.

Hi,

Could you try updating /etc/docker/daemon.json and see if that helps in your use case?

Thanks.

Hi,

Currently the content of daemon.json is the following:

{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "features": {
    	"buildkit" : true
    },
    "default-runtime": "nvidia",
    "insecure-registries": ["172.25.5.212:5000", "127.0.0.1:5000"],
    "data-root": "/home/nvidia/sd/docker-data"
}

Also, during the build (or if I run the container with --runtime=runc), listing the files in /usr/local/cuda-10.2/lib64/ gives:

drwxr-xr-x 1 root root   4096 Dec 15  2021 ./
drwxr-xr-x 1 root root   4096 Dec 15  2021 ../
-rw-r--r-- 1 root root 679636 Dec 15  2021 libcudadevrt.a
-rw-r--r-- 1 root root 888074 Dec 15  2021 libcudart_static.a
drwxr-xr-x 2 root root   4096 Dec 15  2021 stubs/

However, if I run with --runtime=nvidia, or with no runtime flag (which is the same thing, since nvidia is the default), listing that folder shows many more files. I guess the nvidia runtime simply mounts the host’s CUDA folders into the container, something that apparently does not happen at build time.
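
For completeness, the comparison above can be reproduced with something like this (sketch):

# only the static CUDA pieces are visible under runc...
docker run --rm --runtime=runc nvcr.io/nvidia/l4t-ml:r32.7.1-py3 ls -la /usr/local/cuda-10.2/lib64/
# ...while the nvidia runtime mounts the full library set from the host
docker run --rm --runtime=nvidia nvcr.io/nvidia/l4t-ml:r32.7.1-py3 ls -la /usr/local/cuda-10.2/lib64/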

I thought about using a multi-stage Docker build, but there is no CUDA 10.2 devel image for the arm64 architecture, or at least I couldn’t find one…

Hi @diego.avila, can you try it without BuildKit? Setting the default-runtime to nvidia in your daemon.json, as you have done, means that those NVIDIA files get mounted during docker build operations (and you will be able to use the CUDA Toolkit, etc., while building the container). However, I’m not sure that the BuildKit extensions support this.
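
For example (a sketch; my-image is just a placeholder tag), you can disable BuildKit for a single build with an environment variable, or drop the feature flag from daemon.json and restart the daemon:

# per-build override: use the legacy builder instead of BuildKit
DOCKER_BUILDKIT=0 docker build -t my-image .

# or set "buildkit": false under "features" in /etc/docker/daemon.json, then
sudo systemctl restart docker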

Hi @dusty_nv, turning off buildkit seems to solve the issue.

Thank you very much.
