kernel version 440.31.0 does not match DSO version 440.33.1 -- cannot find working devices in this configuration

I am using a GeForce GTX 1660 Ti graphics card on Ubuntu 18.04, and I am getting the error below.

2019-11-24 21:36:48.694996: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1

2019-11-24 21:36:48.695789: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: CUDA_ERROR_SYSTEM_DRIVER_MISMATCH: system has unsupported display driver / cuda driver combination

2019-11-24 21:36:48.695814: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: -Z390-M

2019-11-24 21:36:48.695820: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: -Z390-M

2019-11-24 21:36:48.695858: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:200] libcuda reported version is: 440.33.1

2019-11-24 21:36:48.695875: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:204] kernel reported version is: 440.31.0

2019-11-24 21:36:48.695880: E tensorflow/stream_executor/cuda/cuda_diagnostics.cc:313] kernel version 440.31.0 does not match DSO version 440.33.1 -- cannot find working devices in this configuration

I upgraded the NVIDIA driver to 440.31.0 from the 435 series that ships with the Ubuntu distribution by default, but I still get this version mismatch.

Please let me know which versions of the NVIDIA driver and the CUDA library work together. I am not able to use my graphics card for TensorFlow computation.
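
The two diagnostic lines in the log above can be compared mechanically. In that log, "libcuda" is the user-space driver library and "kernel" is the loaded kernel module; both need to come from the same driver release. Below is a minimal sketch that pulls both numbers out of a TensorFlow log and reports whether they agree. The function names `driver_versions` and `versions_match` are hypothetical helpers, not part of TensorFlow.

```python
import re

# Patterns for the two diagnostic lines TensorFlow prints (wording copied from the log above).
LIBCUDA_RE = re.compile(r"libcuda reported version is:\s*([\d.]+)")
KERNEL_RE = re.compile(r"kernel reported version is:\s*([\d.]+)")


def driver_versions(log_text):
    """Return (libcuda_version, kernel_version); an entry is None if its line is absent."""
    libcuda = LIBCUDA_RE.search(log_text)
    kernel = KERNEL_RE.search(log_text)
    return (libcuda.group(1) if libcuda else None,
            kernel.group(1) if kernel else None)


def versions_match(log_text):
    """True only when both versions were found and are identical."""
    libcuda, kernel = driver_versions(log_text)
    return libcuda is not None and libcuda == kernel


if __name__ == "__main__":
    sample = (
        "cuda_diagnostics.cc:200] libcuda reported version is: 440.33.1\n"
        "cuda_diagnostics.cc:204] kernel reported version is: 440.31.0\n"
    )
    print(driver_versions(sample))
    print(versions_match(sample))
```

A mismatch like this one usually means the driver packages were changed while an older kernel module was still loaded, or that two driver installations are mixed on the same system.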

Please try this:

  • purge anything nvidia/cuda
  • add the ubuntu graphics ppa https://launchpad.net/~graphics-drivers/+archive/ubuntu/ppa
  • install the driver from that (sudo apt install nvidia-driver-440)
  • download the cuda .deb
  • add the repo to your system (first three steps from install instructions on download page)
  • don’t install the cuda meta-package (it pulls in its own driver through cuda-drivers)
  • instead, run sudo apt install cuda-toolkit-10-2
  • set PATH variable if necessary
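
For the last step, here is a minimal sketch of how one might check whether the toolkit's bin directory is already on PATH. The directory `/usr/local/cuda-10.2/bin` is an assumption about where the cuda-toolkit-10-2 package installs, and `cuda_bin_on_path` is a hypothetical helper name.

```python
import os

# Assumption: the cuda-toolkit-10-2 package installs under /usr/local/cuda-10.2.
CUDA_BIN = "/usr/local/cuda-10.2/bin"


def cuda_bin_on_path(path_value, cuda_bin=CUDA_BIN):
    """Return True if cuda_bin is one of the entries of a PATH-style string."""
    entries = [entry.rstrip("/") for entry in path_value.split(os.pathsep) if entry]
    return cuda_bin.rstrip("/") in entries


if __name__ == "__main__":
    if cuda_bin_on_path(os.environ.get("PATH", "")):
        print("CUDA bin directory is on PATH")
    else:
        print("PATH does not include " + CUDA_BIN)
```

If the directory is missing, a different nvcc found earlier on PATH (or none at all) is what the shell will run.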

Hi Team,

I have installed the latest version of the NVIDIA graphics driver and gone ahead with the CUDA 10.2 toolkit installation, but the graphics card is still not detected. Below is the error.

2019-12-05 12:32:43.601925: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-12-05 12:32:43.625802: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2904000000 Hz
2019-12-05 12:32:43.626329: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x3f48d90 executing computations on platform Host. Devices:
2019-12-05 12:32:43.626357: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Host, Default Version
2019-12-05 12:32:43.629146: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2019-12-05 12:32:43.630565: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: CUDA_ERROR_SYSTEM_DRIVER_MISMATCH: system has unsupported display driver / cuda driver combination
2019-12-05 12:32:43.630604: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: vissu-Z390-M
2019-12-05 12:32:43.630617: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: vissu-Z390-M
2019-12-05 12:32:43.630676: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:200] libcuda reported version is: 440.33.1
2019-12-05 12:32:43.630707: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:204] kernel reported version is: 440.36.0
2019-12-05 12:32:43.630719: E tensorflow/stream_executor/cuda/cuda_diagnostics.cc:313] kernel version 440.36.0 does not match DSO version 440.33.1 -- cannot find working devices in this configuration
Please install GPU version of TF

Below is the install log from my CUDA toolkit installation.

DKMS: install completed.
Setting up nvidia-settings (440.33.01-0ubuntu1) ...
Setting up cuda-cusolver-dev-10-2 (10.2.89-1) ...
Setting up cuda-nsight-compute-10-2 (10.2.89-1) ...
Setting up libnvidia-ifr1-440:amd64 (440.33.01-0ubuntu1) ...
Setting up libxcb-present-dev:amd64 (1.13-2~ubuntu18.04) ...
Setting up cuda-gdb-10-2 (10.2.89-1) ...
Setting up libxi-dev:amd64 (2:1.7.9-1) ...
Setting up vdpau-driver-all:amd64 (1.1.1-3ubuntu1) ...
Setting up cuda-cudart-dev-10-2 (10.2.89-1) ...
Setting up cuda-libraries-dev-10-2 (10.2.89-1) ...
Setting up libgl1-mesa-dev:amd64 (19.0.8-0ubuntu0~18.04.3) ...
Setting up nvidia-driver-440 (440.33.01-0ubuntu1) ...
Setting up cuda-visual-tools-10-2 (10.2.89-1) ...
Setting up cuda-cupti-10-2 (10.2.89-1) ...
Setting up libglu1-mesa-dev:amd64 (9.0.0-2.1build1) ...
Setting up cuda-drivers (440.33.01-1) ...
Setting up cuda-cupti-dev-10-2 (10.2.89-1) ...
Setting up freeglut3-dev:amd64 (2.8.1-3) ...
Setting up cuda-command-line-tools-10-2 (10.2.89-1) ...
Setting up cuda-runtime-10-2 (10.2.89-1) ...
Setting up cuda-tools-10-2 (10.2.89-1) ...
Setting up cuda-demo-suite-10-2 (10.2.89-1) ...
Setting up cuda-samples-10-2 (10.2.89-1) ...
Setting up cuda-documentation-10-2 (10.2.89-1) ...
Setting up cuda-toolkit-10-2 (10.2.89-1) ...
Setting up cuda-10-2 (10.2.89-1) ...
Setting up cuda (10.2.89-1) ...
Processing triggers for gnome-menus (3.13.3-11ubuntu1.1) ...
Processing triggers for dbus (1.12.2-1ubuntu1.1) ...
Processing triggers for mime-support (3.60ubuntu1) ...
Processing triggers for desktop-file-utils (0.23-1ubuntu3.18.04.2) ...
Processing triggers for libc-bin (2.27-3ubuntu1) ...
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
Processing triggers for initramfs-tools (0.130ubuntu3.9) ...
update-initramfs: Generating /boot/initrd.img-5.0.0-37-generic

Did the NVIDIA driver fall back to the version supported by the Ubuntu distribution after the CUDA installation? Somehow I am not able to use the latest versions of both together.
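
The driver packages that this installation configured can be read off the install log above. A minimal sketch, assuming the dpkg "Setting up NAME (VERSION) ..." line format seen in that log; `driver_packages` is a hypothetical helper name.

```python
import re

# Matches dpkg lines such as "Setting up nvidia-driver-440 (440.33.01-0ubuntu1) ..."
SETTING_UP_RE = re.compile(r"^Setting up (\S+) \(([^)]+)\)", re.MULTILINE)


def driver_packages(install_log):
    """Return {package: version} for the NVIDIA driver packages configured in an apt/dpkg log."""
    found = {}
    for name, version in SETTING_UP_RE.findall(install_log):
        base = name.split(":")[0]  # drop an architecture suffix such as ":amd64"
        if base.startswith(("nvidia-", "libnvidia-")) or base == "cuda-drivers":
            found[base] = version
    return found


if __name__ == "__main__":
    sample = (
        "Setting up nvidia-driver-440 (440.33.01-0ubuntu1) ...\n"
        "Setting up cuda-drivers (440.33.01-1) ...\n"
        "Setting up cuda-toolkit-10-2 (10.2.89-1) ...\n"
    )
    for package, version in driver_packages(sample).items():
        print(package, version)
```

In the log above, the driver packages are all at 440.33.01, while TensorFlow reports a 440.36.0 kernel module. That suggests the user-space libraries were replaced during the CUDA installation while a module from the earlier driver was still loaded, though the log alone does not prove it.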

I got the driver issue sorted out by following the steps below from the TensorFlow installation page.

# Add NVIDIA package repositories
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo apt-get update
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt-get update

# Install NVIDIA driver
sudo apt-get install --no-install-recommends nvidia-driver-418
# Reboot. Check that GPUs are visible using the command: nvidia-smi

# Install development and runtime libraries (~4GB)
sudo apt-get install --no-install-recommends \
    cuda-10-0 \
    libcudnn7=7.6.2.24-1+cuda10.0  \
    libcudnn7-dev=7.6.2.24-1+cuda10.0


# Install TensorRT. Requires that libcudnn7 is installed above.
sudo apt-get install -y --no-install-recommends libnvinfer5=5.1.5-1+cuda10.0 \
    libnvinfer-dev=5.1.5-1+cuda10.0
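
The "check that GPUs are visible" step above can also be scripted. A minimal sketch, assuming nvidia-smi is on PATH and supports its `--query-gpu` and `--format=csv,noheader` options; `parse_gpu_rows` and `visible_gpus` are hypothetical helper names.

```python
import shutil
import subprocess


def parse_gpu_rows(csv_text):
    """Turn 'name, driver_version' CSV rows into a list of (name, driver_version) tuples."""
    rows = []
    for line in csv_text.splitlines():
        if line.strip():
            name, _, version = line.partition(",")
            rows.append((name.strip(), version.strip()))
    return rows


def visible_gpus():
    """Return the GPUs reported by nvidia-smi, or [] if the tool is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True,
    )
    if result.returncode != 0:
        return []  # nvidia-smi exists but could not talk to the driver
    return parse_gpu_rows(result.stdout)


if __name__ == "__main__":
    print(visible_gpus())
```

An empty list after the reboot would point at the driver rather than at CUDA or TensorFlow.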

But after this I ran into another issue:

tensorflow.python.framework.errors_impl.UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node sequential/conv2d/Conv2D (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1751) ]] [Op:__inference_distributed_function_1055]

Function call stack:
distributed_function

Here is the complete log of the run. The GPU is detected, but execution then fails with the convolution algorithm error:

2019-12-06 10:18:12.039958: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2019-12-06 10:18:12.133533: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-06 10:18:12.133851: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce GTX 1660 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.77
pciBusID: 0000:01:00.0
2019-12-06 10:18:12.155556: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-12-06 10:18:12.399086: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-12-06 10:18:12.474970: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2019-12-06 10:18:12.506992: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2019-12-06 10:18:12.684998: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2019-12-06 10:18:12.889577: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2019-12-06 10:18:13.436820: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-12-06 10:18:13.436949: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-06 10:18:13.437266: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-06 10:18:13.437496: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-12-06 10:18:13.447843: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-12-06 10:18:13.570270: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2904000000 Hz
2019-12-06 10:18:13.571262: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x46658b0 executing computations on platform Host. Devices:
2019-12-06 10:18:13.571278: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Host, Default Version
2019-12-06 10:18:13.702467: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-06 10:18:13.702816: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x48a8a50 executing computations on platform CUDA. Devices:
2019-12-06 10:18:13.702832: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): GeForce GTX 1660 Ti, Compute Capability 7.5
2019-12-06 10:18:13.703008: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-06 10:18:13.703238: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce GTX 1660 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.77
pciBusID: 0000:01:00.0
2019-12-06 10:18:13.703285: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-12-06 10:18:13.703298: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-12-06 10:18:13.703309: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2019-12-06 10:18:13.703320: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2019-12-06 10:18:13.703350: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2019-12-06 10:18:13.703362: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2019-12-06 10:18:13.703388: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-12-06 10:18:13.703454: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-06 10:18:13.703696: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-06 10:18:13.703909: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-12-06 10:18:13.716286: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-12-06 10:18:13.721510: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-12-06 10:18:13.721546: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2019-12-06 10:18:13.721553: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2019-12-06 10:18:13.746217: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-06 10:18:13.746542: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-06 10:18:13.746788: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5023 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1660 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5)
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 30, 30, 32)        896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 15, 15, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 13, 13, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 6, 6, 64)          0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 4, 4, 64)          36928     
=================================================================
Total params: 56,320
Trainable params: 56,320
Non-trainable params: 0
_________________________________________________________________
2019-12-06 10:18:23.125235: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 1228800000 exceeds 10% of system memory.
Train on 50000 samples, validate on 10000 samples
Epoch 1/10
2019-12-06 10:18:24.531505: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-12-06 10:18:25.824977: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-12-06 10:18:29.847942: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-12-06 10:18:29.855837: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-12-06 10:18:29.862207: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[{{node sequential/conv2d/Conv2D}}]]
   32/50000 [..............................] - ETA: 2:40:23Traceback (most recent call last):
  File "image_classifier_train.py", line 48, in <module>
    validation_data=(test_images, test_labels))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training.py", line 728, in fit
    use_multiprocessing=use_multiprocessing)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training_v2.py", line 324, in fit
    total_epochs=epochs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training_v2.py", line 123, in run_one_epoch
    batch_outs = execution_function(iterator)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 86, in execution_function
    distributed_function(input_fn))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/def_function.py", line 457, in __call__
    result = self._call(*args, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/def_function.py", line 520, in _call
    return self._stateless_fn(*args, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 1823, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 1141, in _filtered_call
    self.captured_inputs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 1224, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 511, in call
    ctx=ctx)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node sequential/conv2d/Conv2D (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1751) ]] [Op:__inference_distributed_function_1055]

Function call stack:
distributed_function
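
A commonly reported workaround for "Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR" is to let TensorFlow allocate GPU memory on demand instead of reserving most of it up front. Whether that is the cause here is an assumption; the log above does not confirm it. A minimal sketch, assuming the `tf.config.experimental` API of TensorFlow 2.0, with the hypothetical helper name `enable_gpu_memory_growth`:

```python
def enable_gpu_memory_growth():
    """Ask TensorFlow to grow GPU memory on demand.

    Returns the number of GPUs configured, or None if TensorFlow is not installed.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    gpus = tf.config.experimental.list_physical_devices("GPU")
    for gpu in gpus:
        # Must run before any op touches the GPU; otherwise TensorFlow raises RuntimeError.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


if __name__ == "__main__":
    print(enable_gpu_memory_growth())
```

The call would go at the very top of the training script, before the model is built.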

CUDA version:

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85

nvidia-smi gives me the following result:

Fri Dec  6 11:24:26 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 166...  On   | 00000000:01:00.0  On |                  N/A |
|  0%   45C    P8    14W / 120W |    578MiB /  5941MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1024      G   /usr/lib/xorg/Xorg                            18MiB |
|    0      1207      G   /usr/bin/gnome-shell                          48MiB |
|    0      1421      G   /usr/lib/xorg/Xorg                           225MiB |
|    0      1566      G   /usr/bin/gnome-shell                         183MiB |
|    0      2027      G   ...equest-channel-token=917536326669520771    14MiB |
|    0      2037      G   ...uest-channel-token=14972771615087028380    83MiB |
+-----------------------------------------------------------------------------+
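
The two outputs above report different things. `nvcc --version` shows the release of whichever toolkit compiler is first on PATH, while the "CUDA Version" in the nvidia-smi banner is the highest CUDA version the installed driver supports. Neither is necessarily the runtime TensorFlow loads, which in the log above is libcudart.so.10.0. A minimal sketch that extracts both numbers for a side-by-side comparison; `nvcc_release` and `smi_versions` are hypothetical helper names.

```python
import re


def nvcc_release(nvcc_output):
    """Extract the toolkit release (for example '9.1') from `nvcc --version` output."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def smi_versions(smi_output):
    """Extract (driver_version, cuda_version) from the nvidia-smi banner line."""
    match = re.search(
        r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", smi_output
    )
    return match.groups() if match else (None, None)


if __name__ == "__main__":
    nvcc_text = "Cuda compilation tools, release 9.1, V9.1.85"
    smi_text = "| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |"
    print(nvcc_release(nvcc_text))
    print(smi_versions(smi_text))
```

Here the three numbers disagree (9.1, 10.2, and 10.0 in the TensorFlow log), which indicates more than one CUDA toolkit is present on the system.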

Based on some posts, I tried falling back to CUDA 9.0, but the error still persists.

Please help me with this. I am not able to use the graphics card for TensorFlow computation, which was my whole purpose in getting it.

Thanks in Advance
Viswanath.B

I got the issue fixed by installing the nightly build of TensorFlow 2.0.

Thanks for the support.
Viswanath.B