Hi,
I’m trying to run cudnn_samples_v8 (libcudnn8-samples-8.1.0.77-1.cuda10.2.x86_64.rpm).
conv_sample fails in cudnnCreate() with CUDNN_STATUS_NOT_INITIALIZED when run against CUDA 10.2.89 (driver 440.33.01):
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
$
$ modinfo nvidia | grep -i version
version: 440.33.01
rhelversion: 7.5
srcversion: A5E9226CB2A7B16B12DA2CA
vermagic: 3.10.0-862.el7.x86_64 SMP mod_unload modversions
$
$ nvidia-smi -q
==============NVSMI LOG==============
Timestamp : Mon Feb 1 18:00:12 2021
Driver Version : 440.33.01
CUDA Version : 10.2
Attached GPUs : 1
GPU 00000000:B2:00.0
Product Name : Tesla V100-SXM2-16GB
:
:
$ make clean
rm -rf *o
rm -rf conv_sample
$
$ export CUDA_PATH=~/tmp/cuda-10.2.89
$ export CPATH=~/tmp/cudnn-8.1.0/cuda10.2/include
$ export LIBRARY_PATH=~/tmp/cudnn-8.1.0/cuda10.2/lib64
$ make
CUDA_VERSION is 10020
Linking agains cublasLt = true
CUDA VERSION: 10020
TARGET ARCH: x86_64
TARGET OS: linux
SMS: 35 50 53 60 61 62 70 72 75
:
:
$
$ LD_LIBRARY_PATH=~/tmp/cuda-10.2.89/lib64:~/tmp/cudnn-8.1.0/cuda10.2/lib64 make run
CUDA_VERSION is 10020
Linking agains cublasLt = true
CUDA VERSION: 10020
TARGET ARCH: x86_64
TARGET OS: linux
SMS: 35 50 53 60 61 62 70 72 75
./conv_sample
Executing: conv_sample
Using format CUDNN_TENSOR_NCHW (for INT8x4 and INT8x32 tests use CUDNN_TENSOR_NCHW_VECT_C)
Testing single precision
====USER DIMENSIONS====
input dims are 1, 32, 4, 4
filter dims are 32, 32, 1, 1
output dims are 1, 32, 4, 4
====PADDING DIMENSIONS====
padded input dims are 1, 32, 4, 4
padded filter dims are 32, 32, 1, 1
padded output dims are 1, 32, 4, 4
CUDNN error at conv_sample.cpp:1313, code=1 (CUDNN_STATUS_NOT_INITIALIZED) in 'cudnnCreate(&handle_)'
*** Error in `./conv_sample': munmap_chunk(): invalid pointer: 0x00002af9b737d829 ***
:
:
However, if LD_LIBRARY_PATH also includes the path to the forward-compatible user-mode driver
from CUDA 11.0.3 (the compat directory), it works:
$ LD_LIBRARY_PATH=~/tmp/cuda-11.0.3/compat:~/tmp/cuda-10.2.89/lib64:~/tmp/cudnn-8.1.0/cuda10.2/lib64 make run
CUDA_VERSION is 10020
Linking agains cublasLt = true
CUDA VERSION: 10020
TARGET ARCH: x86_64
TARGET OS: linux
SMS: 35 50 53 60 61 62 70 72 75
./conv_sample
Executing: conv_sample
Using format CUDNN_TENSOR_NCHW (for INT8x4 and INT8x32 tests use CUDNN_TENSOR_NCHW_VECT_C)
Testing single precision
====USER DIMENSIONS====
input dims are 1, 32, 4, 4
filter dims are 32, 32, 1, 1
output dims are 1, 32, 4, 4
====PADDING DIMENSIONS====
padded input dims are 1, 32, 4, 4
padded filter dims are 32, 32, 1, 1
padded output dims are 1, 32, 4, 4
Testing conv
^^^^ CUDA : elapsed = 0.650839 sec,
Test PASSED
Testing half precision (math in single precision)
====USER DIMENSIONS====
input dims are 1, 32, 4, 4
filter dims are 32, 32, 1, 1
output dims are 1, 32, 4, 4
====PADDING DIMENSIONS====
padded input dims are 1, 32, 4, 4
padded filter dims are 32, 32, 1, 1
padded output dims are 1, 32, 4, 4
Testing conv
^^^^ CUDA : elapsed = 0.000592947 sec,
Test PASSED
$
Is there any other solution?
The sample code is identical to the one in libcudnn8-samples_8.0.5.39-1+cuda10.2_amd64.deb,
so I guess the cuDNN 8.1.0 libraries themselves depend on CUDA 11.
Is there any way to avoid this dependency?
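
In case it helps to narrow this down, here is a rough sketch of how the two runs above could be compared. The helper names cuda_version_decode and show_libcuda_lookup are my own (hypothetical), and the second helper assumes a glibc loader that honors LD_DEBUG=libs. It only reports which libcuda/libcudart/libcudnn files the loader tries; it does not prove where the dependency comes from.

```shell
#!/bin/sh
# Decode a CUDA version integer as printed by the samples' Makefile
# ("CUDA_VERSION is 10020"). The encoding is major*1000 + minor*10.
cuda_version_decode() {
    v=$1
    echo "$((v / 1000)).$(( (v % 1000) / 10 ))"
}

# Hypothetical helper: run a command with glibc's LD_DEBUG=libs and keep only
# the loader messages that mention the CUDA driver, runtime, or cuDNN libraries.
# Assumption: the loader writes its debug output to stderr (the glibc default).
show_libcuda_lookup() {
    LD_DEBUG=libs "$@" 2>&1 | grep -E 'libcuda\.so|libcudart|libcudnn'
}
```

For example, cuda_version_decode 10020 prints 10.2. Running show_libcuda_lookup ./conv_sample once with each of the two LD_LIBRARY_PATH settings above should show whether libcuda.so.1 is taken from the installed 440.33.01 driver or from ~/tmp/cuda-11.0.3/compat.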