Hello,
I am trying to run the same TensorFlow and cuDNN benchmarks on our three NVIDIA cards: M40, K80 and P100.
We have cuDNN 5, CUDA 8 and TensorFlow 1.0.0 for Python 3.
The benchmarks I am trying to run can be found here:
https://github.com/soumith/convnet-benchmarks
While all the benchmarks run fine on the M40 and K80 cards, they all fail when I try to run them on the P100.
Here is the error message I get:
$ python3 benchmark_alexnet.py
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn’t compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn’t compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: Tesla P100-PCIE-16GB
major: 6 minor: 0 memoryClockRate (GHz) 0.405
pciBusID 0000:04:00.0
Total memory: 15.89GiB
Free memory: 15.61GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:04:00.0)
F tensorflow/stream_executor/cuda/cuda_dnn.cc:2001] failed to enqueue convolution on stream: CUDNN_STATUS_EXECUTION_FAILED
I have tested the cuDNN5 installation with the code samples provided here:
https://developer.nvidia.com/rdp/cudnn-archive
The test ran successfully.
Both my LD_LIBRARY_PATH and my PATH variable are set correctly.
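For reference, a minimal sketch of how the library path can be double-checked from Python. It only lists which directories on LD_LIBRARY_PATH contain files matching a given name. The helper name find_libs and the chosen file patterns are illustrative assumptions, not part of TensorFlow or CUDA, and it does not prove which copy the dynamic loader actually picks up:

```python
import glob
import os


def find_libs(pattern, env_var="LD_LIBRARY_PATH", environ=None):
    """Return files matching `pattern` in each directory listed in `env_var`.

    `find_libs` is a hypothetical helper written for this check only.
    Directories are searched in the order they appear in the variable.
    """
    environ = os.environ if environ is None else environ
    hits = []
    for directory in environ.get(env_var, "").split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing ':'
        hits.extend(sorted(glob.glob(os.path.join(directory, pattern))))
    return hits


if __name__ == "__main__":
    # The patterns below are assumptions based on the libraries named in the log.
    for pattern in ("libcudnn.so*", "libcublas.so*", "libcurand.so*"):
        found = find_libs(pattern)
        print(pattern, found if found else "not found on LD_LIBRARY_PATH")
```

If more than one directory shows a libcudnn.so* file, an older copy earlier in the path could be shadowing the intended one, which would be worth ruling out.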
What do you think could be the cause of this?
It may be something very basic…
I thank you in advance for any hint,
Regards,
Véronique