"Could not find cuda drivers on your machine, GPU will not be used."

Description

I am trying to run a computer vision task, but the process keeps getting killed.

Environment

TensorRT Version:
GPU Type: Nvidia Geforce RTX 3060
Nvidia Driver Version:
CUDA Version: 11.2
CUDNN Version:
Operating System + Version: Ubuntu 22.04
Python Version (if applicable): 3.9
TensorFlow Version (if applicable): 2.16
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

Please find my code here: fundus_image_segmentation.py

The script initially starts running and shows this output:

2024-06-18 12:51:05.107917: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-06-18 12:51:05.166219: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-06-18 12:51:05.369024: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-06-18 12:51:06.466512: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-06-18 12:51:30.910354: E external/local_xla/xla/stream_executor/cuda/cuda_driver.cc:282] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected

Model: "functional_1"

Layer (type) | Output Shape | Param # | Connected to
input_layer (InputLayer) | (None, None, None, 3) | 0 | -
conv2d (Conv2D) | (None, None, None, 16) | 448 | input_layer[0][0]
batch_normalization (BatchNormalization) | (None, None, None, 16) | 64 | conv2d[0][0]
activation (Activation) | (None, None, None, 16) | 0 | batch_normalization[0][0]
conv2d_1 (Conv2D) | (None, None, None, 16) | 2,320 | activation[0][0]
batch_normalization_1 (BatchNormalization) | (None, None, None, 16) | 64 | conv2d_1[0][0]
activation_1 (Activation) | (None, None, None, 16) | 0 | batch_normalization_1[0][…
max_pooling2d (MaxPooling2D) | (None, None, None, 16) | 0 | activation_1[0][0]
conv2d_2 (Conv2D) | (None, None, None, 32) | 4,640 | max_pooling2d[0][0]
batch_normalization_2 (BatchNormalization) | (None, None, None, 32) | 128 | conv2d_2[0][0]
activation_2 (Activation) | (None, None, None, 32) | 0 | batch_normalization_2[0][…
conv2d_3 (Conv2D) | (None, None, None, 32) | 9,248 | activation_2[0][0]
batch_normalization_3 (BatchNormalization) | (None, None, None, 32) | 128 | conv2d_3[0][0]
activation_3 (Activation) | (None, None, None, 32) | 0 | batch_normalization_3[0][…
max_pooling2d_1 (MaxPooling2D) | (None, None, None, 32) | 0 | activation_3[0][0]
conv2d_4 (Conv2D) | (None, None, None, 64) | 18,496 | max_pooling2d_1[0][0]
batch_normalization_4 (BatchNormalization) | (None, None, None, 64) | 256 | conv2d_4[0][0]
activation_4 (Activation) | (None, None, None, 64) | 0 | batch_normalization_4[0][…
conv2d_5 (Conv2D) | (None, None, None, 64) | 36,928 | activation_4[0][0]
batch_normalization_5 (BatchNormalization) | (None, None, None, 64) | 256 | conv2d_5[0][0]
activation_5 (Activation) | (None, None, None, 64) | 0 | batch_normalization_5[0][…
max_pooling2d_2 (MaxPooling2D) | (None, None, None, 64) | 0 | activation_5[0][0]
conv2d_6 (Conv2D) | (None, None, None, 128) | 73,856 | max_pooling2d_2[0][0]
batch_normalization_6 (BatchNormalization) | (None, None, None, 128) | 512 | conv2d_6[0][0]
activation_6 (Activation) | (None, None, None, 128) | 0 | batch_normalization_6[0][…
conv2d_7 (Conv2D) | (None, None, None, 128) | 147,584 | activation_6[0][0]
batch_normalization_7 (BatchNormalization) | (None, None, None, 128) | 512 | conv2d_7[0][0]
activation_7 (Activation) | (None, None, None, 128) | 0 | batch_normalization_7[0][…
max_pooling2d_3 (MaxPooling2D) | (None, None, None, 128) | 0 | activation_7[0][0]
conv2d_8 (Conv2D) | (None, None, None, 256) | 295,168 | max_pooling2d_3[0][0]
batch_normalization_8 (BatchNormalization) | (None, None, None, 256) | 1,024 | conv2d_8[0][0]
activation_8 (Activation) | (None, None, None, 256) | 0 | batch_normalization_8[0][…
conv2d_9 (Conv2D) | (None, None, None, 256) | 590,080 | activation_8[0][0]
batch_normalization_9 (BatchNormalization) | (None, None, None, 256) | 1,024 | conv2d_9[0][0]
activation_9 (Activation) | (None, None, None, 256) | 0 | batch_normalization_9[0][…
conv2d_transpose (Conv2DTranspose) | (None, None, None, 128) | 131,200 | activation_9[0][0]
concatenate (Concatenate) | (None, None, None, 256) | 0 | conv2d_transpose[0][0], activation_7[0][0]
conv2d_10 (Conv2D) | (None, None, None, 128) | 295,040 | concatenate[0][0]
batch_normalization_10 (BatchNormalization) | (None, None, None, 128) | 512 | conv2d_10[0][0]
activation_10 (Activation) | (None, None, None, 128) | 0 | batch_normalization_10[0]…
conv2d_11 (Conv2D) | (None, None, None, 128) | 147,584 | activation_10[0][0]
batch_normalization_11 (BatchNormalization) | (None, None, None, 128) | 512 | conv2d_11[0][0]
activation_11 (Activation) | (None, None, None, 128) | 0 | batch_normalization_11[0]…
conv2d_transpose_1 (Conv2DTranspose) | (None, None, None, 64) | 32,832 | activation_11[0][0]
concatenate_1 (Concatenate) | (None, None, None, 128) | 0 | conv2d_transpose_1[0][0], activation_5[0][0]
conv2d_12 (Conv2D) | (None, None, None, 64) | 73,792 | concatenate_1[0][0]
batch_normalization_12 (BatchNormalization) | (None, None, None, 64) | 256 | conv2d_12[0][0]
activation_12 (Activation) | (None, None, None, 64) | 0 | batch_normalization_12[0]…
conv2d_13 (Conv2D) | (None, None, None, 64) | 36,928 | activation_12[0][0]
batch_normalization_13 (BatchNormalization) | (None, None, None, 64) | 256 | conv2d_13[0][0]
activation_13 (Activation) | (None, None, None, 64) | 0 | batch_normalization_13[0]…
conv2d_transpose_2 (Conv2DTranspose) | (None, None, None, 32) | 8,224 | activation_13[0][0]
concatenate_2 (Concatenate) | (None, None, None, 64) | 0 | conv2d_transpose_2[0][0], activation_3[0][0]
conv2d_14 (Conv2D) | (None, None, None, 32) | 18,464 | concatenate_2[0][0]
batch_normalization_14 (BatchNormalization) | (None, None, None, 32) | 128 | conv2d_14[0][0]
activation_14 (Activation) | (None, None, None, 32) | 0 | batch_normalization_14[0]…
conv2d_15 (Conv2D) | (None, None, None, 32) | 9,248 | activation_14[0][0]
batch_normalization_15 (BatchNormalization) | (None, None, None, 32) | 128 | conv2d_15[0][0]
activation_15 (Activation) | (None, None, None, 32) | 0 | batch_normalization_15[0]…
conv2d_transpose_3 (Conv2DTranspose) | (None, None, None, 16) | 2,064 | activation_15[0][0]
concatenate_3 (Concatenate) | (None, None, None, 32) | 0 | conv2d_transpose_3[0][0], activation_1[0][0]
conv2d_16 (Conv2D) | (None, None, None, 16) | 4,624 | concatenate_3[0][0]
batch_normalization_16 (BatchNormalization) | (None, None, None, 16) | 64 | conv2d_16[0][0]
activation_16 (Activation) | (None, None, None, 16) | 0 | batch_normalization_16[0]…
conv2d_17 (Conv2D) | (None, None, None, 16) | 2,320 | activation_16[0][0]
batch_normalization_17 (BatchNormalization) | (None, None, None, 16) | 64 | conv2d_17[0][0]
activation_17 (Activation) | (None, None, None, 16) | 0 | batch_normalization_17[0]…
conv2d_18 (Conv2D) | (None, None, None, 2) | 290 | activation_17[0][0]

Total params: 1,947,266 (7.43 MB)
Trainable params: 1,944,322 (7.42 MB)
Non-trainable params: 2,944 (11.50 KB)
Killed
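A bare Killed with no Python traceback means the process was terminated by a signal rather than by an exception. On Linux a common cause is the kernel's out-of-memory killer, although nothing in the output above confirms that this is what happened here. As a rough sanity check, the sketch below estimates how much memory a single float32 activation tensor takes. The batch size and image dimensions are made-up values, since the post does not state them, and activation_bytes is a hypothetical helper, not part of TensorFlow.

```python
def activation_bytes(batch, height, width, channels, bytes_per_value=4):
    """Bytes needed for one activation tensor of shape (batch, height, width, channels).

    bytes_per_value=4 corresponds to float32.
    """
    return batch * height * width * channels * bytes_per_value


# Hypothetical input: a batch of 8 images at 2048 x 2048 (not taken from the post).
size = activation_bytes(batch=8, height=2048, width=2048, channels=16)
print(f"{size / 2**30:.2f} GiB for one 16-channel full-resolution activation")
```

The model keeps several such tensors alive at once for the skip connections, so if the real inputs are large, lowering the batch size or the image resolution is a cheap experiment to try while the GPU is unavailable.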

I suspect the problem is that the GPU is not properly configured. I am using an Nvidia Geforce RTX 3060 graphics card.

Steps to reproduce

I tried this:

import tensorflow as tf

which gives

2024-06-22 01:36:45.623984: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-06-22 01:36:45.626502: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-06-22 01:36:45.657853: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-06-22 01:36:46.199776: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT

print(tf.test.is_gpu_available()) gives

2024-06-22 01:38:04.613997: E external/local_xla/xla/stream_executor/cuda/cuda_driver.cc:282] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
False
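tf.test.is_gpu_available() is deprecated in TensorFlow 2.x in favour of tf.config.list_physical_devices('GPU'). A minimal sketch of the same check with the newer call follows; list_gpus is a hypothetical helper name, and the import is guarded only so the snippet also runs in an environment without TensorFlow.

```python
def list_gpus():
    """Return the names of the GPUs TensorFlow can see, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # An empty list means TensorFlow loaded but found no usable GPU.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = list_gpus()
if gpus is None:
    print("TensorFlow is not installed in this environment")
elif not gpus:
    print("TensorFlow is installed but sees no GPU")
else:
    print("GPUs visible to TensorFlow:", gpus)
```

With the driver in the state described here, this would be expected to report no GPU, matching the False above.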

nvidia-smi gives

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

nvcc --version shows

Command 'nvcc' not found, but can be installed with:
sudo apt install nvidia-cuda-toolkit

However, /usr/local/cuda/bin/nvcc --version shows

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Nov_30_19:08:53_PST_2020
Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0

dpkg -l | grep TensorRT and dpkg -l | grep nvinfer do not show anything.
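The checks above probe two separate things: nvidia-smi talks to the NVIDIA kernel driver, while nvcc belongs to the CUDA toolkit. An nvcc that works by full path but not by name suggests the toolkit is installed with its bin directory missing from PATH, whereas the nvidia-smi failure points at the driver itself. The sketch below gathers the same checks into one report that can be pasted into a thread. run_check is a hypothetical helper, and the /usr/local/cuda path is the one from this post.

```python
import subprocess


def run_check(cmd):
    """Run cmd and return (status, first line of its output) without raising."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except FileNotFoundError:
        return ("not found", "")
    except subprocess.TimeoutExpired:
        return ("timeout", "")
    output = (proc.stdout or proc.stderr).strip()
    first_line = output.splitlines()[0] if output else ""
    return ("ok" if proc.returncode == 0 else "failed", first_line)


checks = {
    "driver (nvidia-smi)": ["nvidia-smi"],
    "toolkit on PATH (nvcc)": ["nvcc", "--version"],
    "toolkit at /usr/local/cuda": ["/usr/local/cuda/bin/nvcc", "--version"],
}
for label, cmd in checks.items():
    status, line = run_check(cmd)
    print(f"{label}: {status} {line}")
```

On the machine described here, the first check would presumably report "failed" and the second "not found", which separates the driver problem from the PATH problem.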

I want to use the GPU for this segmentation task. I tried searching different forums like this, this and this, but have not been able to fix the issue. I downloaded CUDA, cuDNN, and the Nvidia drivers from the official sites more than a year ago but never used them. I can't confirm the installed versions because nvidia-smi is not working: I don't remember the cuDNN version, and the Nvidia driver version was most likely 525.

Please help me solve this issue. I am new to Linux (still trying to figure things out), so it would be very helpful if you could break the suggested fix into steps.

Any help would be appreciated. Thank you!

Hi @gmohor21 ,
This forum covers issues related to TensorRT.
I am afraid I might not be able to help you here.

Thanks