-- Could NOT find CUDA (missing: CUDA_CUDART_LIBRARY) (found version "10.2")
CMake Warning at /usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/public/cuda.cmake:31 (message):
Caffe2: CUDA cannot be found. Depending on whether you are building Caffe2
or a Caffe2 dependent library, the next warning / error will give you more
info.
Call Stack (most recent call first):
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:88 (include)
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:18 (find_package)
CMake Error at /usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:90 (message):
Your installed Caffe2 version uses CUDA but I cannot find the CUDA
libraries. Please set the proper CUDA prefixes and / or install CUDA.
Call Stack (most recent call first):
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:18 (find_package)
-- Configuring incomplete, errors occurred!
The command '/bin/sh -c git clone https://github.com/pytorch/vision.git --branch v0.15.2 && cd vision && mkdir build && cd build && cmake .. && make -j${BUILD_CORES} && make install && cd .. && rm -rf build' returned a non-zero code: 1
I am trying to build a Docker container. The full log is below:
docker build -t jetson-build:latest -f Jetson.Dockerfile --build-arg BUILD_CORES=2 .
Sending build context to Docker daemon 78.25MB
Step 1/18 : ARG BASE_IMAGE=nvcr.io/nvidia/l4t-pytorch:r32.6.1-pth1.9-py3
Step 2/18 : FROM $BASE_IMAGE as base
---> 9a936a937ee0
Step 3/18 : RUN apt-get update && apt-get upgrade -yy && apt-get install -yy wget git build-essential git clang-3.8 libpng-dev && apt remove cmake -y && pip3 install --upgrade pip && pip3 install cmake --upgrade
---> Using cache
---> 6b4b9c1e48fe
Step 4/18 : ENV GOLANG_VERSION=1.21.0 GOROOT=/usr/local/go PATH=$PATH:/usr/local/go/bin
---> Using cache
---> 94016a7375bc
Step 5/18 : RUN wget https://golang.org/dl/go${GOLANG_VERSION}.linux-arm64.tar.gz && tar -C /usr/local -xzf go${GOLANG_VERSION}.linux-arm64.tar.gz && rm go${GOLANG_VERSION}.linux-arm64.tar.gz
---> Using cache
---> 35bced8d7b56
Step 6/18 : ARG PYTHON_VERSION=3.6
---> Using cache
---> 04c2e56572e8
Step 7/18 : ENV LIBTORCH_PATH=/usr/local/lib/python${PYTHON_VERSION}/dist-packages/torch
---> Using cache
---> 98639e576c6d
Step 8/18 : ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$LIBTORCH_PATH/lib:/usr/local/lib:/app/src/libs Torch_DIR=$LIBTORCH_PATH
---> Using cache
---> c73573681888
Step 9/18 : RUN ln -s ${LIBTORCH_PATH}/include/* /usr/local/include/ && ln -s ${LIBTORCH_PATH}/lib/* /usr/local/lib/
---> Using cache
---> 19d49516a9ee
Step 10/18 : WORKDIR /app
---> Using cache
---> 7c48a054db6d
Step 11/18 : COPY go.mod go.sum ./
---> Using cache
---> 1f99a86eced9
Step 12/18 : RUN go mod download
---> Using cache
---> 755acffaff1f
Step 13/18 : ARG BUILD_CORES=2
---> Using cache
---> dff8f2c603fb
Step 14/18 : RUN cd /root/go/pkg/mod/github.com/mishabeliy15/gotorch*/cgotorch && chmod +x build.sh && ./build.sh -j${BUILD_CORES}
---> Using cache
---> 6698c92ad13f
Step 15/18 : RUN git clone https://github.com/pytorch/vision.git --branch v0.15.2 && cd vision && mkdir build && cd build && cmake .. && make -j${BUILD_CORES} && make install && cd .. && rm -rf build
---> Running in 9900e3a02d7d
Cloning into 'vision'...
Note: checking out 'fa99a5360fbcd1683311d57a76fcc0e7323a4c1e'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:
git checkout -b <new-branch-name>
-- The C compiler identification is GNU 7.5.0
-- The CXX compiler identification is GNU 7.5.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Could NOT find CUDA (missing: CUDA_CUDART_LIBRARY) (found version "10.2")
CMake Warning at /usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/public/cuda.cmake:31 (message):
Caffe2: CUDA cannot be found. Depending on whether you are building Caffe2
or a Caffe2 dependent library, the next warning / error will give you more
info.
Call Stack (most recent call first):
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:88 (include)
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:18 (find_package)
CMake Error at /usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:90 (message):
Your installed Caffe2 version uses CUDA but I cannot find the CUDA
libraries. Please set the proper CUDA prefixes and / or install CUDA.
Call Stack (most recent call first):
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:18 (find_package)
-- Configuring incomplete, errors occurred!
The command '/bin/sh -c git clone https://github.com/pytorch/vision.git --branch v0.15.2 && cd vision && mkdir build && cd build && cmake .. && make -j${BUILD_CORES} && make install && cd .. && rm -rf build' returned a non-zero code: 1
Full Dockerfile:
ARG BASE_IMAGE=nvcr.io/nvidia/l4t-pytorch:r32.6.1-pth1.9-py3
FROM $BASE_IMAGE as base
RUN apt-get update && apt-get upgrade -yy && apt-get install -yy \
wget \
git \
build-essential \
git \
clang-3.8 \
libpng-dev && \
apt remove cmake -y && \
pip3 install --upgrade pip && \
pip3 install cmake --upgrade
# ToDo: install later:
# libgeos-dev
# libproj-dev
# Download and install Go
ENV GOLANG_VERSION=1.21.0 \
GOROOT=/usr/local/go \
PATH=$PATH:/usr/local/go/bin
RUN wget https://golang.org/dl/go${GOLANG_VERSION}.linux-arm64.tar.gz && \
tar -C /usr/local -xzf go${GOLANG_VERSION}.linux-arm64.tar.gz && \
rm go${GOLANG_VERSION}.linux-arm64.tar.gz
# Set ENV var for libtorch
ARG PYTHON_VERSION=3.6
ENV LIBTORCH_PATH=/usr/local/lib/python${PYTHON_VERSION}/dist-packages/torch \
CUDA_HOME=/usr/local/cuda-10.2
ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_HOME/lib64:$LIBTORCH_PATH/lib:/usr/local/lib:/app/src/libs \
Torch_DIR=$LIBTORCH_PATH \
PATH=/usr/local/cuda-10.2/bin:$PATH \
CUDA_BIN_PATH=/usr/local/cuda-10.2/bin
# Create symlinks libtorch to system
RUN ln -s ${LIBTORCH_PATH}/include/* /usr/local/include/ && \
ln -s ${LIBTORCH_PATH}/lib/* /usr/local/lib/ && \
ln -s /usr/local/cuda-10.2 /usr/local/cuda
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
ARG BUILD_CORES=2
# Build gotorch
RUN cd /root/go/pkg/mod/github.com/mishabeliy15/gotorch*/cgotorch && \
chmod +x build.sh && \
./build.sh -j${BUILD_CORES}
# Download and build libtorchvision
RUN git clone https://github.com/pytorch/vision.git --branch v0.15.2 && \
cd vision && \
mkdir build && \
cd build && \
cmake .. && \
make -j${BUILD_CORES} && \
make install && \
cd .. && rm -rf build
FROM base as dev
WORKDIR /app/src
CMD ["bash"]
Torch version:
pip3 show torch
Name: torch
Version: 1.13.0a0+git7c98e70
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: /usr/local/lib/python3.8/dist-packages
Requires: typing-extensions
Required-by: torchaudio, torchvision
CUDA version:
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_28_22:34:44_PST_2021
Cuda compilation tools, release 10.2, V10.2.300
Build cuda_10.2_r440.TC440_70.29663091_0
If you're on JetPack 4 and need the CUDA/cuDNN/TensorRT headers/libraries while building the container, you need to set the NVIDIA Container Runtime as Docker's default runtime so that it gets used during docker build operations. That way, CUDA/cuDNN/TensorRT will be mounted into the containers used during the build (on JetPack 4, those CUDA libraries get mounted from the host).
sudo docker info | grep 'Default Runtime'
Default Runtime: nvidia
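If it does not say nvidia, the default runtime is selected in /etc/docker/daemon.json; a minimal sketch of checking and restarting, assuming the stock JetPack layout (your file may differ):
sudo cat /etc/docker/daemon.json
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}
# after editing the file, restart Docker and re-check:
sudo systemctl restart docker
sudo docker info | grep 'Default Runtime'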
But I still get the same error:
make build-jetson-docker
docker build -t jetson-build:latest -f Jetson.Dockerfile .
Sending build context to Docker daemon 78.25MB
Step 1/18 : ARG BASE_IMAGE=nvcr.io/nvidia/l4t-pytorch:r32.6.1-pth1.9-py3
Step 2/18 : FROM $BASE_IMAGE as base
---> 9a936a937ee0
Step 3/18 : RUN apt-get update && apt-get upgrade -yy && apt-get install -yy wget git build-essential git clang-3.8 libpng-dev && apt remove cmake -y && pip3 install --upgrade pip && pip3 install cmake --upgrade
---> Using cache
---> 6b4b9c1e48fe
Step 4/18 : ENV GOLANG_VERSION=1.21.0 GOROOT=/usr/local/go PATH=$PATH:/usr/local/go/bin
---> Using cache
---> 94016a7375bc
Step 5/18 : RUN wget https://golang.org/dl/go${GOLANG_VERSION}.linux-arm64.tar.gz && tar -C /usr/local -xzf go${GOLANG_VERSION}.linux-arm64.tar.gz && rm go${GOLANG_VERSION}.linux-arm64.tar.gz
---> Using cache
---> 35bced8d7b56
Step 6/18 : ARG PYTHON_VERSION=3.6
---> Using cache
---> 04c2e56572e8
Step 7/18 : ENV LIBTORCH_PATH=/usr/local/lib/python${PYTHON_VERSION}/dist-packages/torch
---> Using cache
---> 98639e576c6d
Step 8/18 : ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$LIBTORCH_PATH/lib:/usr/local/lib:/app/src/libs Torch_DIR=$LIBTORCH_PATH
---> Using cache
---> c73573681888
Step 9/18 : RUN ln -s ${LIBTORCH_PATH}/include/* /usr/local/include/ && ln -s ${LIBTORCH_PATH}/lib/* /usr/local/lib/
---> Using cache
---> 19d49516a9ee
Step 10/18 : WORKDIR /app
---> Using cache
---> 7c48a054db6d
Step 11/18 : COPY go.mod go.sum ./
---> Using cache
---> 1f99a86eced9
Step 12/18 : RUN go mod download
---> Using cache
---> 755acffaff1f
Step 13/18 : ARG BUILD_CORES=2
---> Using cache
---> dff8f2c603fb
Step 14/18 : RUN cd /root/go/pkg/mod/github.com/mishabeliy15/gotorch*/cgotorch && chmod +x build.sh && ./build.sh -j${BUILD_CORES}
---> Using cache
---> 6698c92ad13f
Step 15/18 : RUN git clone https://github.com/pytorch/vision.git --branch v0.15.2 && cd vision && mkdir build && cd build && cmake .. && make -j${BUILD_CORES} && make install && cd .. && rm -rf build
---> Running in a00bf4f36a00
Cloning into 'vision'...
Note: checking out 'fa99a5360fbcd1683311d57a76fcc0e7323a4c1e'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:
git checkout -b <new-branch-name>
-- The C compiler identification is GNU 7.5.0
-- The CXX compiler identification is GNU 7.5.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Could NOT find CUDA (missing: CUDA_CUDART_LIBRARY) (found version "10.2")
CMake Warning at /usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/public/cuda.cmake:31 (message):
Caffe2: CUDA cannot be found. Depending on whether you are building Caffe2
or a Caffe2 dependent library, the next warning / error will give you more
info.
Call Stack (most recent call first):
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:88 (include)
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:18 (find_package)
CMake Error at /usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:90 (message):
Your installed Caffe2 version uses CUDA but I cannot find the CUDA
libraries. Please set the proper CUDA prefixes and / or install CUDA.
Call Stack (most recent call first):
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:18 (find_package)
-- Configuring incomplete, errors occurred!
The command '/bin/sh -c git clone https://github.com/pytorch/vision.git --branch v0.15.2 && cd vision && mkdir build && cd build && cmake .. && make -j${BUILD_CORES} && make install && cd .. && rm -rf build' returned a non-zero code: 1
make: *** [Makefile:111: build-jetson-docker] Error 1
@s.pinchuk do you actually have the full CUDA Toolkit installed on your device, outside of the container?
# Set ENV var for libtorch
ARG PYTHON_VERSION=3.6
ENV LIBTORCH_PATH=/usr/local/lib/python${PYTHON_VERSION}/dist-packages/torch \
CUDA_HOME=/usr/local/cuda-10.2
ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_HOME/lib64:$LIBTORCH_PATH/lib:/usr/local/lib:/app/src/libs \
Torch_DIR=$LIBTORCH_PATH \
PATH=/usr/local/cuda-10.2/bin:$PATH \
CUDA_BIN_PATH=/usr/local/cuda-10.2/bin
# Create symlinks libtorch to system
RUN ln -s ${LIBTORCH_PATH}/include/* /usr/local/include/ && \
ln -s ${LIBTORCH_PATH}/lib/* /usr/local/lib/ && \
ln -s /usr/local/cuda-10.2 /usr/local/cuda
Try removing ln -s /usr/local/cuda-10.2 /usr/local/cuda → you are overwriting the symlink to /usr/local/cuda that the NVIDIA Container Runtime mounts. Also, try referencing /usr/local/cuda in your paths instead of /usr/local/cuda-10.2. There are lots of custom steps you are performing here that I would try removing until it works.
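To see what the runtime actually provides during the build, a throwaway diagnostic you could run in a temporary RUN step or an interactive container (a sketch; remove it once things work):
ls -l /usr/local/cuda                    # should be the symlink to /usr/local/cuda-10.2 provided by the runtime mounts
ls /usr/local/cuda/lib64/libcudart.so*   # this is the file behind CMake's CUDA_CUDART_LIBRARY
ls /usr/local/cuda/bin/nvcc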
cat Jetson.Dockerfile
ARG BASE_IMAGE=nvcr.io/nvidia/l4t-pytorch:r32.6.1-pth1.9-py3
FROM $BASE_IMAGE as base
RUN apt-get update && apt-get upgrade -yy && apt-get install -yy \
wget \
git \
build-essential \
git \
clang-3.8 \
libpng-dev && \
apt remove cmake -y && \
pip3 install --upgrade pip && \
pip3 install cmake --upgrade
# ToDo: install later:
# libgeos-dev
# libproj-dev
# Download and install Go
ENV GOLANG_VERSION=1.21.0 \
GOROOT=/usr/local/go \
PATH=$PATH:/usr/local/go/bin
RUN wget https://golang.org/dl/go${GOLANG_VERSION}.linux-arm64.tar.gz && \
tar -C /usr/local -xzf go${GOLANG_VERSION}.linux-arm64.tar.gz && \
rm go${GOLANG_VERSION}.linux-arm64.tar.gz
# Set ENV var for libtorch
ARG PYTHON_VERSION=3.6
ENV LIBTORCH_PATH=/usr/local/lib/python${PYTHON_VERSION}/dist-packages/torch \
CUDA_HOME=/usr/local/cuda-10.2
ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_HOME/lib64:$LIBTORCH_PATH/lib:/usr/local/lib:/app/src/libs \
Torch_DIR=$LIBTORCH_PATH \
PATH=/usr/local/cuda-10.2/bin:$PATH \
CUDA_BIN_PATH=/usr/local/cuda-10.2/bin
# Create symlinks libtorch to system
RUN ln -s ${LIBTORCH_PATH}/include/* /usr/local/include/ && \
ln -s ${LIBTORCH_PATH}/lib/* /usr/local/lib/
#&& \
# ln -s /usr/local/cuda-10.2 /usr/local/cuda
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
ARG BUILD_CORES=2
# Build gotorch
RUN cd /root/go/pkg/mod/github.com/mishabeliy15/gotorch*/cgotorch && \
chmod +x build.sh
#&& \
# ./build.sh -j${BUILD_CORES}
# Download and build libtorchvision
RUN git clone https://github.com/pytorch/vision.git --branch v0.15.2 && \
cd vision && \
mkdir build
# && \
# cd build && \
# cmake .. && \
# make -j${BUILD_CORES} && \
# make install && \
# cd .. && rm -rf build
FROM base as dev
WORKDIR /app/src
CMD ["bash"]
But I get the same error when building gotorch (cgotorch) inside the Docker container:
root@a0d109cc52a5:~/go/pkg/mod/github.com/mishabeliy15/gotorch@v0.0.0-20230822142026-b89c6923d2d0# cd cgotorch/
root@a0d109cc52a5:~/go/pkg/mod/github.com/mishabeliy15/gotorch@v0.0.0-20230822142026-b89c6923d2d0/cgotorch# ./build.sh
+ set -e
++ uname
+ OS=Linux
+++ dirname ./build.sh
++ realpath .
+ BASE_DIR=/root/go/pkg/mod/github.com/mishabeliy15/gotorch@v0.0.0-20230822142026-b89c6923d2d0/cgotorch
+ BUILD_DIR=/root/go/pkg/mod/github.com/mishabeliy15/gotorch@v0.0.0-20230822142026-b89c6923d2d0/cgotorch/build
+ rm -rf /root/go/pkg/mod/github.com/mishabeliy15/gotorch@v0.0.0-20230822142026-b89c6923d2d0/cgotorch/build
+ mkdir -p /root/go/pkg/mod/github.com/mishabeliy15/gotorch@v0.0.0-20230822142026-b89c6923d2d0/cgotorch/build
+ cd /root/go/pkg/mod/github.com/mishabeliy15/gotorch@v0.0.0-20230822142026-b89c6923d2d0/cgotorch/build
+ cmake ..
-- The C compiler identification is GNU 7.5.0
-- The CXX compiler identification is GNU 7.5.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMake Warning (dev) at CMakeLists.txt:8 (find_package):
Policy CMP0146 is not set: The FindCUDA module is removed. Run "cmake
--help-policy CMP0146" for policy details. Use the cmake_policy command to
set the policy and suppress this warning.
This warning is for project developers. Use -Wno-dev to suppress it.
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Could NOT find CUDA (missing: CUDA_CUDART_LIBRARY) (found version "10.2")
CMake Warning at /usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/public/cuda.cmake:31 (message):
Caffe2: CUDA cannot be found. Depending on whether you are building Caffe2
or a Caffe2 dependent library, the next warning / error will give you more
info.
Call Stack (most recent call first):
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:88 (include)
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:15 (find_package)
CMake Error at /usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:90 (message):
Your installed Caffe2 version uses CUDA but I cannot find the CUDA
libraries. Please set the proper CUDA prefixes and / or install CUDA.
Call Stack (most recent call first):
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:15 (find_package)
-- Configuring incomplete, errors occurred!
I also tried removing the build directory and recreating it cleanly, but the result is the same.
The commands I ran:
rm -rf build
mkdir build
cd build
cmake -DCUDA_DIR=/usr/local/cuda ..
Logs in this case:
root@a0d109cc52a5:~/go/pkg/mod/github.com/mishabeliy15/gotorch@v0.0.0-20230822142026-b89c6923d2d0/cgotorch# ls
CMakeLists.txt cgotorch.h device.cc functional.h ivalue.cc memory.h pickle.cc tensor.h torchdef.h
build cuda.cc device.h init.cc ivalue.h optim.cc pickle.h torch.cc torchscript.cc
build.sh cuda.h functional.cc init.h memory.cc optim.h tensor.cc torch.h torchscript.h
root@a0d109cc52a5:~/go/pkg/mod/github.com/mishabeliy15/gotorch@v0.0.0-20230822142026-b89c6923d2d0/cgotorch# rm -rf build
root@a0d109cc52a5:~/go/pkg/mod/github.com/mishabeliy15/gotorch@v0.0.0-20230822142026-b89c6923d2d0/cgotorch# mkdir build
root@a0d109cc52a5:~/go/pkg/mod/github.com/mishabeliy15/gotorch@v0.0.0-20230822142026-b89c6923d2d0/cgotorch# cd build
root@a0d109cc52a5:~/go/pkg/mod/github.com/mishabeliy15/gotorch@v0.0.0-20230822142026-b89c6923d2d0/cgotorch/build# cmake -DCUDA_DIR=/usr/local/cuda ..
-- The C compiler identification is GNU 7.5.0
-- The CXX compiler identification is GNU 7.5.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMake Warning (dev) at CMakeLists.txt:8 (find_package):
Policy CMP0146 is not set: The FindCUDA module is removed. Run "cmake
--help-policy CMP0146" for policy details. Use the cmake_policy command to
set the policy and suppress this warning.
This warning is for project developers. Use -Wno-dev to suppress it.
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Could NOT find CUDA (missing: CUDA_CUDART_LIBRARY) (found version "10.2")
CMake Warning at /usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/public/cuda.cmake:31 (message):
Caffe2: CUDA cannot be found. Depending on whether you are building Caffe2
or a Caffe2 dependent library, the next warning / error will give you more
info.
Call Stack (most recent call first):
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:88 (include)
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:15 (find_package)
CMake Error at /usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:90 (message):
Your installed Caffe2 version uses CUDA but I cannot find the CUDA
libraries. Please set the proper CUDA prefixes and / or install CUDA.
Call Stack (most recent call first):
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:15 (find_package)
-- Configuring incomplete, errors occurred!
root@a0d109cc52a5:~/go/pkg/mod/github.com/mishabeliy15/gotorch@v0.0.0-20230822142026-b89c6923d2d0/cgotorch/build# cmake -DCUDA_DIR=/usr/local/cuda-10-2 ..
CMake Warning (dev) at CMakeLists.txt:8 (find_package):
Policy CMP0146 is not set: The FindCUDA module is removed. Run "cmake
--help-policy CMP0146" for policy details. Use the cmake_policy command to
set the policy and suppress this warning.
This warning is for project developers. Use -Wno-dev to suppress it.
-- Could NOT find CUDA (missing: CUDA_CUDART_LIBRARY) (found version "10.2")
CMake Warning at /usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/public/cuda.cmake:31 (message):
Caffe2: CUDA cannot be found. Depending on whether you are building Caffe2
or a Caffe2 dependent library, the next warning / error will give you more
info.
Call Stack (most recent call first):
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:88 (include)
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:15 (find_package)
CMake Error at /usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:90 (message):
Your installed Caffe2 version uses CUDA but I cannot find the CUDA
libraries. Please set the proper CUDA prefixes and / or install CUDA.
Call Stack (most recent call first):
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:15 (find_package)
-- Configuring incomplete, errors occurred!
root@a0d109cc52a5:~/go/pkg/mod/github.com/mishabeliy15/gotorch@v0.0.0-20230822142026-b89c6923d2d0/cgotorch/build# cmake -DCUDA_DIR=/usr/local/cuda-11-4 ..
CMake Warning (dev) at CMakeLists.txt:8 (find_package):
Policy CMP0146 is not set: The FindCUDA module is removed. Run "cmake
--help-policy CMP0146" for policy details. Use the cmake_policy command to
set the policy and suppress this warning.
This warning is for project developers. Use -Wno-dev to suppress it.
-- Could NOT find CUDA (missing: CUDA_CUDART_LIBRARY) (found version "10.2")
CMake Warning at /usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/public/cuda.cmake:31 (message):
Caffe2: CUDA cannot be found. Depending on whether you are building Caffe2
or a Caffe2 dependent library, the next warning / error will give you more
info.
Call Stack (most recent call first):
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:88 (include)
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:15 (find_package)
CMake Error at /usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:90 (message):
Your installed Caffe2 version uses CUDA but I cannot find the CUDA
libraries. Please set the proper CUDA prefixes and / or install CUDA.
Call Stack (most recent call first):
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:15 (find_package)
-- Configuring incomplete, errors occurred!
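For what it's worth, CUDA_DIR is not a variable the legacy FindCUDA module reads; it honors CUDA_TOOLKIT_ROOT_DIR (or the nvcc found on PATH), and the missing CUDA_CUDART_LIBRARY is simply libcudart.so. A sketch of what could be tried, assuming the toolkit really is mounted at /usr/local/cuda:
ls /usr/local/cuda/lib64/libcudart.so*             # the library CUDA_CUDART_LIBRARY should resolve to
which nvcc && nvcc --version                       # FindCUDA can also locate the toolkit through nvcc
cmake -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda ..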
root@e580222041cd:/# cp -r /usr/local/cuda/samples /cuda_samples
cp: cannot stat '/usr/local/cuda/samples': No such file or directory
root@e580222041cd:/# cd /cuda_samples/1_Utilities/deviceQuery
bash: cd: /cuda_samples/1_Utilities/deviceQuery: No such file or directory
@s.pinchuk outside of the container, do you have the CUDA Toolkit under /usr/local/cuda and the samples in /usr/local/cuda/samples on your device?
I think your issues may be related to either not having the CUDA Toolkit installed on your device or having the NVIDIA Container Runtime misconfigured. The SD card image comes with all of this installed and already working, so before sinking a lot of time into debugging it, I would recommend flashing a fresh SD card and seeing if it works for you then.
jetson@nano:/usr/local/cuda$ ls -a
. .. bin doc EULA.txt extras include lib64 nvml nvvm nvvmx samples share targets tools version.json version.txt
jetson@nano:/usr/local/cuda$ cd samples/
jetson@nano:/usr/local/cuda/samples$ ls -a
. .. 0_Simple 1_Utilities 2_Graphics 3_Imaging 4_Finance 5_Simulations 6_Advanced 7_CUDALibraries common EULA.txt Makefile
jetson@nano:/usr/local/cuda/samples$
Perhaps it is then related to your re-installations of CUDA. What I was recommending was re-flashing your SD card, because these Docker<->CUDA mounting issues on JetPack 4 can be tricky to debug.
STEP_1. Check whether PyTorch works inside the Docker container:
docker run -it --rm --runtime nvidia nvcr.io/nvidia/l4t-pytorch:r32.6.1-pth1.9-py3
Unable to find image 'nvcr.io/nvidia/l4t-pytorch:r32.6.1-pth1.9-py3' locally
r32.6.1-pth1.9-py3: Pulling from nvidia/l4t-pytorch
e47a8a86d66c: Pull complete
bdce3430dad6: Pull complete
c26a6b81c746: Pull complete
a70635e646de: Pull complete
24cbe60e3161: Pull complete
c7f64cc97a39: Pull complete
8777adb92eda: Pull complete
542a24b3572f: Pull complete
3d5ade2b8849: Pull complete
3e865584f789: Pull complete
4811af6cacf1: Pull complete
db645757aac7: Pull complete
0105dc23dc93: Pull complete
c8aff3baf097: Pull complete
5f50b3e38480: Pull complete
b5418d85b074: Pull complete
78d4b308041f: Pull complete
44f9a409dddf: Pull complete
dc19965f7530: Pull complete
73f02f5e17e7: Pull complete
de4b3c1acb9b: Pull complete
6c2b9910111b: Pull complete
1cb44dee771f: Pull complete
bf00de567184: Pull complete
b73a0bf964ae: Pull complete
579ce47ae238: Pull complete
2262c28018a2: Pull complete
177925b6dcf7: Pull complete
Digest: sha256:247550bf8554d2f4669aa9fe0e9d319ff2c3aaf67275e6259346d6e34186a623
Status: Downloaded newer image for nvcr.io/nvidia/l4t-pytorch:r32.6.1-pth1.9-py3
root@229a1ace600d:/# python3
Python 3.6.9 (default, Jan 26 2021, 15:33:00)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 196, in <module>
_load_global_deps()
File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 149, in _load_global_deps
ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
self._handle = _dlopen(self._name, mode)
OSError: libcurand.so.10: cannot open shared object file: No such file or directory
>>>
So it doesn't work in the container.
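A quick way to narrow this down is to check, on the host, whether the library exists at all and whether the runtime's mount lists are in place (on JetPack 4 these are usually CSV files under /etc/nvidia-container-runtime/host-files-for-container.d/); a sketch:
# on the Jetson host, outside the container
ls -l /usr/local/cuda/lib64/libcurand.so.10
ls /etc/nvidia-container-runtime/host-files-for-container.d/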
STEP_2. Check whether PyTorch works on the Jetson Nano 4GB host:
jetson@nano:~$ python3
Python 3.8.10 (default, May 26 2023, 14:05:08)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True
>>>
I also have one more problem: CUDA doesn't work from C++ (libtorch), but it works from Python:
jetson@nano:~$ python3
Python 3.8.10 (default, May 26 2023, 14:05:08)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True
>>>
STEP_1. Set CUDA architecture 53 in the CMakeLists.txt for the Jetson Nano 4GB:
cat CMakeLists.txt
cmake_minimum_required(VERSION 3.12)
project(tcuda)
set(CMAKE_CXX_STANDARD 14)
set(CMAKE_CXX_STANDARD_REQUIRED True)
# Check for CUDA if necessary
# find_package(CUDA QUIET)
if(CMAKE_CUDA_COMPILER)
    enable_language(CUDA)
    add_definitions(-DUSE_CUDA)
    set(CUDA_FLAGS ${CUDA_LIBRARIES})
    set(CMAKE_CUDA_ARCHITECTURES 53)
endif()
# Locate LibTorch
find_package(Torch REQUIRED)
include_directories(${TORCH_INCLUDE_DIRS})
# Collect all source files
file(GLOB SOURCES "*.cpp")
add_executable(tcuda ${SOURCES})
# Link against LibTorch
target_link_libraries(tcuda ${TORCH_LIBRARIES})
if(CUDA_FOUND)
    target_link_libraries(tcuda ${CUDA_LIBRARIES})
endif()
STEP_3. Run command sudo CUDACXX=/usr/local/cuda/bin/nvcc cmake -DCMAKE_PREFIX_PATH=/usr/local/lib/python3.8/dist-packages/torch ..
sudo CUDACXX=/usr/local/cuda/bin/nvcc cmake -DCMAKE_PREFIX_PATH=/usr/local/lib/python3.8/dist-packages/torch ..
-- The CUDA compiler identification is NVIDIA 10.2.300
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found CUDA: /usr/local/cuda (found version "10.2")
-- Caffe2: CUDA detected: 10.2
-- Caffe2: CUDA nvcc is: /usr/local/cuda/bin/nvcc
-- Caffe2: CUDA toolkit directory: /usr/local/cuda
-- Caffe2: Header version is: 10.2
-- Found CUDNN: /usr/lib/aarch64-linux-gnu/libcudnn.so
-- Found cuDNN: v8.2.1 (include: /usr/include, library: /usr/lib/aarch64-linux-gnu/libcudnn.so)
-- /usr/local/cuda/lib64/libnvrtc.so shorthash is 7d272a04
-- Autodetected CUDA architecture(s): 5.3
-- Added CUDA NVCC flags for: -gencode;arch=compute_53,code=sm_53
CMake Warning at /usr/local/lib/python3.8/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
static library kineto_LIBRARY-NOTFOUND not found.
Call Stack (most recent call first):
/usr/local/lib/python3.8/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:127 (append_torchlib_if_found)
CMakeLists.txt:17 (find_package)
-- Found Torch: /usr/local/lib/python3.8/dist-packages/torch/lib/libtorch.so
-- Configuring done (9.3s)
-- Generating done (0.0s)
-- Build files have been written to: /home/jetson/go/pkg/mod/github.com/mishabeliy15/gotorch@v0.0.0-20230825120526-26c4d596dbac/test_cuda/build
STEP_4. Build
sudo make -j2
[ 50%] Building CXX object CMakeFiles/tcuda.dir/main.cpp.o
[100%] Linking CXX executable tcuda
[100%] Built target tcuda
STEP_5. Run tcuda:
./tcuda
Is Cuda Available: 0
0.5559 0.4669 0.0445
0.7814 0.0697 0.8526
[ CPUFloatType{2,3} ]
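One thing worth checking here (a diagnostic sketch, not a definitive fix): whether the executable was actually linked against the CUDA parts of libtorch, since the linker can drop libraries that nothing in the binary references, in which case torch::cuda::is_available() reports false:
ldd ./tcuda | grep -Ei 'torch|cuda'                                    # which torch/CUDA libraries the binary really loads
ls /usr/local/lib/python3.8/dist-packages/torch/lib/ | grep -i cuda    # what the installed wheel ships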
Outside of the container, do you have /usr/local/cuda/lib64/libcurand.so.10?
$ ls -ll /usr/local/cuda/lib64/libcurand*
lrwxrwxrwx 1 root root 15 Mar 1 2021 /usr/local/cuda/lib64/libcurand.so -> libcurand.so.10
lrwxrwxrwx 1 root root 23 Mar 1 2021 /usr/local/cuda/lib64/libcurand.so.10 -> libcurand.so.10.1.2.300
-rw-r--r-- 1 root root 62698584 Mar 1 2021 /usr/local/cuda/lib64/libcurand.so.10.1.2.300
-rw-r--r-- 1 root root 62767380 Mar 1 2021 /usr/local/cuda/lib64/libcurand_static.a
We've established earlier in this thread that something's funky with your NVIDIA Container Runtime: it doesn't seem to be mounting the CUDA drivers/libraries into your containers correctly. Given your unknown environment and changes, my recommendation was to reflash the SD card and then confirm that the GPU works in the l4t-base container with the CUDA deviceQuery/vectorAdd samples. Have you done that?
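For reference, the check I have in mind looks roughly like this (a sketch, assuming a stock JetPack 4 install where the nvidia runtime mounts the host's CUDA toolkit, samples included, into l4t-base):
sudo docker run -it --rm --runtime nvidia nvcr.io/nvidia/l4t-base:r32.6.1
# inside the container:
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
make
./deviceQuery    # should list the Nano's GPU and end with "Result = PASS"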
STEP_3. Run command sudo CUDACXX=/usr/local/cuda/bin/nvcc cmake -DCMAKE_PREFIX_PATH=/usr/local/lib/python3.8/dist-packages/torch ..
You are on Jetson Nano and JetPack 4 (which uses Python 3.6), but these paths/commands show Python 3.8. I’m unsure which PyTorch wheel that is using or about the tests with libtorch, sorry.