Docker container cannot find CUDA libraries (libcurand.so.10)

camiel1 · March 22, 2023, 8:40am

Hello all,

I am trying to set up the Jetson Nano using Docker and the existing containers. I have reviewed several pages on this forum but have not been able to fix the issues I am having. I assume the Docker container cannot reach the CUDA libraries.

Setup:

Jetson Nano Development Kit 4 GB
Jetpack 4.6.1 [L4T 32.7.1]
NVIDIA (R) Cuda compiler driver Cuda compilation tools, release 10.2, V10.2.300

First attempt
Based on the Dockerfile from dusty-nv/jetson-containers (Machine Learning Containers for NVIDIA Jetson and JetPack-L4T), I am trying to build a container from l4t-base:r32.7.1 with torch and torchvision.

FROM nvcr.io/nvidia/l4t-base:r32.7.1

ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        python3-pip \
        python3-dev \
        libopenblas-base \
        libopenblas-dev \
        libopenmpi-dev \
        openmpi-bin \
        openmpi-common \
        gfortran \
        libomp-dev \
        git \
        libjpeg-dev \
        zlib1g-dev \
        libpython3-dev \
        libavcodec-dev \
        libavformat-dev \
        libswscale-dev \
        build-essential \
    && rm -rf /var/lib/apt/lists/* \
    && apt-get clean

RUN pip3 install --upgrade pip
RUN pip3 install --no-cache-dir setuptools Cython wheel
RUN pip3 install --no-cache-dir -U jetson-stats
RUN pip3 install --no-cache-dir --verbose numpy

# PyTorch (for JetPack 4.6 DP)
ARG PYTORCH_URL=https://nvidia.box.com/shared/static/fjtbno0vpo676a25cgvuqc1wty0fkkg6.whl
ARG PYTORCH_WHL=torch-1.10.0-cp36-cp36m-linux_aarch64.whl

RUN wget --quiet --show-progress --progress=bar:force:noscroll --no-check-certificate ${PYTORCH_URL} -O ${PYTORCH_WHL} && \
    pip3 install --no-cache-dir --verbose ${PYTORCH_WHL} && \
    rm ${PYTORCH_WHL}

# torchvision 0.11.1
ARG TORCHVISION_VERSION=v0.10.0
ARG TORCH_CUDA_ARCH_LIST="5.3;6.2;7.2;8.7;10.2"
RUN printenv && echo "torchvision version = $TORCHVISION_VERSION" && echo "TORCH_CUDA_ARCH_LIST = $TORCH_CUDA_ARCH_LIST"

RUN git clone https://github.com/pytorch/vision torchvision && \
    cd torchvision && \
    git checkout ${TORCHVISION_VERSION} && \
    python3 setup.py install && \
    cd ../ && \
    rm -rf torchvision

The build fails at the torchvision install step because it cannot find libcurand.so.10.

Second attempt
I use the existing torch container provided by NVIDIA:

nvcr.io/nvidia/l4t-pytorch:r32.7.1-pth1.10-py3

When I import torch there, it also cannot find libcurand.so.10.

I also noticed that a GPG public key is missing in this torch container, so no other packages can be installed.

Looking forward to your reply.

AastaLLL · March 22, 2023, 8:57am

Hi,

PyTorch 1.10 should be paired with TorchVision v0.11.1 rather than v0.10.0.
Could you update the below setting and try it again?

ARG TORCHVISION_VERSION=v0.11.1

Thanks.

camiel1 · March 22, 2023, 9:33am

Ah yes, but I get the same error. I think it has something to do with Docker not being able to find the CUDA libraries. Is there a specific way I should run my Docker container?
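One way to test that suspicion is to look for the library from inside the container. The sketch below assumes the JetPack 4.x layout, where (as far as I know) the CUDA libraries are not part of the l4t-base image and are mounted from the host when the container is started with the NVIDIA runtime. The helper name find_lib and the two search directories are illustrative assumptions, not something taken from this thread.

```shell
#!/bin/sh
# find_lib NAME DIR...
# Print the first file whose name starts with NAME under any of the given
# directories. Returns 0 if a match is found, 1 otherwise.
# (find_lib is a hypothetical helper written for this example.)
find_lib() {
    name="$1"
    shift
    for dir in "$@"; do
        # Skip directories that do not exist in this container.
        [ -d "$dir" ] || continue
        # -L follows symlinks, since CUDA directories are often symlinked.
        match=$(find -L "$dir" -name "${name}*" 2>/dev/null | head -n 1)
        if [ -n "$match" ]; then
            printf '%s\n' "$match"
            return 0
        fi
    done
    return 1
}

# Example use inside the container (the directories are assumed locations):
#   find_lib libcurand.so.10 /usr/local/cuda/lib64 /usr/lib/aarch64-linux-gnu \
#       || echo "libcurand.so.10 not visible in this container"
```

If the library shows up with `docker run --runtime nvidia` but not without it, the runtime is the likely cause rather than the PyTorch or torchvision versions.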

camiel1 · March 22, 2023, 10:27am

OK, I made some progress. I somehow have to add the nvidia runtime.

So I made this Dockerfile:

FROM nvcr.io/nvidia/l4t-pytorch:r32.7.1-pth1.10-py3

ENV DEBIAN_FRONTEND=noninteractive

RUN pip3 install --upgrade pip
RUN pip3 install --no-cache-dir -U jetson-stats

sudo docker build -t container-test .
sudo docker run -it --runtime nvidia container-test

Now I can open python3 in the docker container and import torch.

So the issue is Docker-related. How can I add the nvidia runtime during a build?

dusty_nv · March 22, 2023, 2:21pm

Hi @camiel1, if you set your default docker runtime to nvidia, then it will be used during build operations as well: https://github.com/dusty-nv/jetson-containers#docker-default-runtime
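For reference, a minimal sketch of that setting, assuming the usual location /etc/docker/daemon.json and the runtime binary name nvidia-container-runtime as installed by JetPack. The helper names write_daemon_json and has_nvidia_default are hypothetical and exist only for this example. The first writes the config to a path of your choice so it can be reviewed before it is copied into place; the second checks whether a given file already sets the default runtime.

```shell
#!/bin/sh
# write_daemon_json PATH
# Write a Docker daemon config that makes "nvidia" the default runtime.
# Note: this overwrites PATH. If the existing daemon.json has other
# settings, merge the "default-runtime" key by hand instead.
write_daemon_json() {
    cat > "$1" <<'EOF'
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}
EOF
}

# has_nvidia_default PATH
# Succeed if PATH sets "default-runtime" to "nvidia".
has_nvidia_default() {
    grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' "$1"
}

# Example use on the Jetson host (assumed paths):
#   has_nvidia_default /etc/docker/daemon.json || write_daemon_json /tmp/daemon.json
#   sudo cp /tmp/daemon.json /etc/docker/daemon.json
#   sudo systemctl restart docker
#   sudo docker info | grep 'Default Runtime'
```

After the Docker service restarts, `docker build` should run its RUN steps under the nvidia runtime, so the host CUDA libraries are visible during steps such as the torchvision build.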

camiel1 · March 22, 2023, 3:25pm

Thanks for the input and your repository has been really helpful so far! I get an error now when installing torchvision. I will post more details at a later stage.

For the record, I was able to complete all the steps without Dockerizing it.

system · April 26, 2023, 1:29am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
