Libtorch CUDA Problem

I am using JetPack 4.4 and installed the PyTorch wheel from here: PyTorch for Jetson.

But torch::cuda::cudnn_is_available() returns false.

Hi,

For the JetPack 4.4 production release, please install PyTorch v1.6.0.

JetPack 4.4 production release (L4T R32.4.3)

  • Python 3.6 - torch-1.6.0-cp36-cp36m-linux_aarch64.whl
  • The JetPack 4.4 production release (L4T R32.4.3) only supports PyTorch 1.6.0 or newer, due to updates in cuDNN.
  • This wheel of the PyTorch 1.6.0 final release replaces the previous wheel of PyTorch 1.6.0-rc2.

Thanks.

After installing this wheel, the problem is still there:
torch::cuda::cudnn_is_available() still returns false inside my C++ code.

Strangely, when I import torch in Python3, torch.cuda.is_available() returns true.

My Makefile uses Python3 to get the location of Libtorch.

Hi,

We are going to try to reproduce this issue and will update here with more information later.
Thanks.

Hi,

We tested the torch-1.6.0-cp36-cp36m-linux_aarch64.whl file and the l4t-pytorch container.
Both work fine.

$ python3
Python 3.6.9 (default, Jul 17 2020, 12:50:27) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True

Would you mind trying it again?
Please also check whether you are using JetPack 4.4 GA, whose BSP version is r32.4.3:

$ dpkg-query --show nvidia-l4t-core
nvidia-l4t-core	32.4.3-20200625213407

Thanks.

On Python3 it says true, but I am using C++ Libtorch.
Using C++ on a dGPU machine, torch::cuda::cudnn_is_available() returns true, but on the Jetson Xavier NX it returns false.
Here is my Makefile:

APP:= back-to-back-detectors
CC:= g++
TARGET_DEVICE = $(shell gcc -dumpmachine | cut -f1 -d -)
PYTHON_HEADER_DIR := $(shell python3 -c 'from distutils.sysconfig import get_python_inc; print(get_python_inc())')
PYTORCH_INCLUDES := $(shell python3 -c 'from torch.utils.cpp_extension import include_paths; [print(p) for p in include_paths()]')
PYTORCH_LIBRARIES := $(shell python3 -c 'from torch.utils.cpp_extension import library_paths; [print(p) for p in library_paths()]')
INCLUDE_DIRS += $(PYTHON_HEADER_DIR)
INCLUDE_DIRS += $(PYTORCH_INCLUDES)
COMMON_FLAGS += $(foreach includedir,$(INCLUDE_DIRS),-I$(includedir)) -DTORCH_API_INCLUDE_EXTENSION_H -D_GLIBCXX_USE_CXX11_ABI=0
CFLAGS= -O3 -fopenmp -march=native -fpermissive $(COMMON_FLAGS)
CUDA_VER=10.2
NVDS_VERSION:=5.0

LIB_INSTALL_DIR?=/opt/nvidia/deepstream/deepstream-$(NVDS_VERSION)/lib

ifeq ($(TARGET_DEVICE),aarch64)
  CFLAGS+= -DPLATFORM_TEGRA
endif

SRCS:= $(wildcard *.cpp)

INCS:= $(wildcard *.h)

PKGS:= gstreamer-1.0 gstreamer-base-1.0 gstreamer-video-1.0 x11 opencv

OBJS:= $(SRCS:.cpp=.o)

CFLAGS+= -fPIC -DDS_VERSION=\"5.0.0\" \
-I /usr/local/cuda-$(CUDA_VER)/include \
-I$(DS_SDK_ROOT)/sources/includes 
 
CFLAGS+= `pkg-config --cflags $(PKGS)`

LIBS:= `pkg-config --libs $(PKGS)`


LIBS+= -Wl,-no-undefined \
       -L$(LIB_INSTALL_DIR) -lnvdsgst_meta -lnvds_meta -lnvdsgst_helper -lnvdsgst_meta -lnvds_meta -lnvbufsurface -lnvbufsurftransform\
	   -Wl,-rpath,$(LIB_INSTALL_DIR)\
	   -L/usr/local/cuda-$(CUDA_VER)/lib64/ -lcudart -ldl \
	   -lnppc -lnppig -lnpps -lnppicc -lnppidei


LIBS+= -Wl,-no-undefined \
       -L$(PYTORCH_LIBRARIES) -ltorch -lc10 -lgomp -lnvToolsExt -lc10_cuda -ltorch_cuda -ltorch_cpu\
	   -Wl,-rpath,$(PYTORCH_LIBRARIES)
	  


all: $(APP)

%.o: %.cpp $(INCS) Makefile
	$(CC) -c -o $@ $(CFLAGS) $<

$(APP): $(OBJS) Makefile
	$(CC) -o $(APP) $(OBJS) $(LIBS)

clean:
	rm -rf $(OBJS) $(APP)

Can I know the torch.backends.cudnn.version() of this wheel? Maybe my cuDNN is the issue, even though I did not change it from the default JetPack 4.4 install.
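
For reference, here is a minimal sketch of how I am checking it from C++ (assuming at::detail::getCUDAHooks().versionCuDNN() is the right hook for this libtorch build; it throws if the CUDA backend is not loaded):

    #include <torch/torch.h>
    #include <ATen/detail/CUDAHooksInterface.h>
    #include <iostream>

    int main()
    {
        try {
            // versionCuDNN() reports the cuDNN version this libtorch build
            // was compiled against; it throws if the ATen CUDA library is
            // not loaded.
            std::cout << at::detail::getCUDAHooks().versionCuDNN() << std::endl;
        }
        catch (std::exception& ex) {
            std::cout << ex.what() << std::endl;
        }
    }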

Hi,

The package is built with cuDNN v8.0.0 from JetPack 4.4.

Would you mind sharing a simple reproducible app with us?
If this can be reproduced in our environment, we can pass it to our internal team for comment.

Thanks.

nvSample.zip (17.8 KB)

Just run ./sampleLibtorch

Source code is this:

    #include <torch/torch.h>
    #include <iostream>

    int main()
    {
        try {
            std::cout << torch::cuda::is_available() << std::endl;
            torch::Tensor tensor = at::tensor({ -1, 1 }, at::kCUDA);
        }
        catch (std::exception& ex) {
            std::cout << ex.what() << std::endl;
        }
    }

Even torch::cuda::is_available() returns false in the C++ app, while on Python3 it returns true.

Hi,

Sorry for the late update.

We can reproduce this issue in the l4t-pytorch:r32.4.3-pth1.6-py3 container.
We will update here once we have further findings.

Thanks.

Hi,

This issue can be fixed by updating the Makefile with "-Wl,--no-as-needed".

The root cause of cuda::is_available() returning false is that the CUDA context cannot be created correctly.
You can find a more detailed message by calling at::detail::getCUDAHooks().showConfig().
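
For reference, a minimal sketch of such a check (the includes are an assumption; catching the exception matters because showConfig() throws when the ATen CUDA backend was never loaded):

    #include <torch/torch.h>
    #include <ATen/detail/CUDAHooksInterface.h>
    #include <iostream>

    int main()
    {
        try {
            // Prints libtorch's build configuration when the CUDA backend
            // is loaded; otherwise it throws the message quoted below.
            std::cout << at::detail::getCUDAHooks().showConfig() << std::endl;
        }
        catch (std::exception& ex) {
            std::cout << ex.what() << std::endl;
        }
    }

With your current link line, it prints: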

Cannot query detailed CUDA version without ATen_cuda library. PyTorch splits its backend into two shared libraries: a CPU library and a CUDA library; this error has occurred because you are trying to use some CUDA functionality, but the CUDA library has not been loaded by the dynamic linker for some reason. The CUDA library MUST be loaded, EVEN IF you don't directly use any symbols from the CUDA library! One common culprit is a lack of -Wl,--no-as-needed in your link arguments; many dynamic linkers will delete dynamic library dependencies if you don't depend on any of their symbols. You can check if this has occurred by using ldd on your binary to see if there is a dependency on *_cuda.so library.

As the message indicates, the CUDA library is not loaded because the dynamic linker removes the dependency; you can confirm this by running ldd on the binary and checking for the *_cuda.so libraries.
After adding the -Wl,--no-as-needed flag, cuda::is_available() returns true on the Xavier NX:

diff --git a/Makefile b/Makefile
index 0f69c2c..6939e75 100644
--- a/Makefile
+++ b/Makefile
@@ -48,7 +48,7 @@ LIBS+= -Wl,-no-undefined \
        -lnppc -lnppig -lnpps -lnppicc -lnppidei


-LIBS+= -Wl,-no-undefined \
+LIBS+= -Wl,-no-undefined -Wl,--no-as-needed\
        -L$(PYTORCH_LIBRARIES) -ltorch -lc10 -lgomp -lnvToolsExt -lc10_cuda -ltorch_cuda -ltorch_cpu\
        -Wl,-rpath,$(PYTORCH_LIBRARIES)

diff --git a/main.cpp b/main.cpp

Thanks.

Yes, that fixed it. Thank you.