cudnnCreate is taking way too long (>4 mins) - Titan V


I just upgraded from a 1080 to a Titan V. After that, I experienced a substantial delay when running the cudnn samples.

Cuda version: 8.0
cudnn version: 7.0.5
Cuda driver: 387.34
OS: Ubuntu 14.04

To demonstrate the delay, I modified the cudnn “conv_sample” code of the official cudnn v7 samples.
I timed the call to cudnnCreate as follows:

printf("Creating cudnn handle\n");
double start = second();
checkCudnnErr(cudnnCreate(&handle_));  // the call being timed
double stop = second();
printPerf( stop - start, 0, 0,
           0, 0, 0, 0);

When running it I get the following output:

Testing single precision
Creating cudnn handle
^^^^ CUDA : elapsed = 264.361 sec,  
Testing conv
^^^^ CUDA : elapsed = 8.4877e-05 sec,  
Testing half precision (math in single precision)
Creating cudnn handle
^^^^ CUDA : elapsed = 0.000301838 sec,  
Testing conv
^^^^ CUDA : elapsed = 5.00679e-05 sec,  

Does anyone have ideas on how to debug or narrow down this problem? FYI: I also tried running it without checkCudnnErr and saw the same delay. The problem also occurs with all the other cudnn v7 samples and with other high-level libraries that use cudnn.


Switch to CUDA 9.1 and the latest cudnn.

CUDA 8 is not recommended for use on sm_70 devices. You may be running into extremely long JIT compilation delays as the underlying libraries that cudnn depends on (e.g. CUBLAS) are being translated for use on your sm_70 Titan V.
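
To catch this mismatch early, an application could compare the toolkit version against the device's compute capability at startup. The following is only a rough sketch: both function names are hypothetical helpers, not part of CUDA or cudnn, and it assumes you have already obtained the version from cudaRuntimeGetVersion() and the major compute capability from cudaGetDeviceProperties(). It only covers the CUDA 8 on Volta-or-newer case described above, not every toolkit/architecture combination.

```cpp
#include <string>

// Hypothetical helper, not part of CUDA or cudnn.
// runtime_version uses the encoding returned by cudaRuntimeGetVersion():
// 1000 * major + 10 * minor, so CUDA 8.0 is 8000 and CUDA 9.1 is 9010.
// cc_major is the device's compute capability major number
// (cudaDeviceProp::major); a Titan V (sm_70) reports 7.
// Returns true when the toolkit is older than CUDA 9.0, the first release
// with sm_70 support, while the device is sm_70 or newer. In that case the
// libraries have to be JIT-compiled for the device.
bool toolkit_predates_volta(int runtime_version, int cc_major) {
    const int kFirstVoltaToolkit = 9000;  // CUDA 9.0
    return cc_major >= 7 && runtime_version < kFirstVoltaToolkit;
}

// Hypothetical helper: formats the encoded version as "major.minor"
// so it can be printed in a warning message.
std::string format_cuda_version(int runtime_version) {
    return std::to_string(runtime_version / 1000) + "." +
           std::to_string((runtime_version % 1000) / 10);
}
```

With the setup from the question (runtime version 8000, compute capability major 7), toolkit_predates_volta returns true, which would be a hint to expect a long one-time delay in the first library call such as cudnnCreate.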

Thanks! That solved the problem or at least shifted it. I was using cuda 8 due to restrictions of other libraries. So I will try to make them run with cuda 9.1.
