Hi,
I have installed CUDA 11.8 and cuDNN 8.7.0.
When I try to run a simple program on TensorFlow 2.13, it says no CUDA device is recognized.
When I run nvcc --version I get this:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
I run this:
python -c "import tensorflow as tf;print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
I get this output:
2023-08-19 22:32:08.294113: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-08-19 22:32:08.418078: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-08-19 22:32:08.925967: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2023-08-19 22:32:20.686131: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:268] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
tf.Tensor(-1983.4836, shape=(), dtype=float32)
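The result is still computed, so TensorFlow seems to have fallen back to the CPU after the cuInit error. A minimal sketch for checking this directly, assuming TensorFlow is importable in the same environment (`visible_gpus` is a hypothetical helper name, not part of TensorFlow; `tf.config.list_physical_devices` is the real call):

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this interpreter; nothing to report.
        return None
    # An empty list means TensorFlow found no usable CUDA device.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    print(visible_gpus())
```

An empty list here would match the CUDA_ERROR_NO_DEVICE message in the log.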
This is running on a laptop with a dedicated RTX 3070 Ti (Laptop) GPU.
I decided to check which GPU is being used by running this command:
glxinfo |grep -e OpenGL.vendor -e OpenGL.renderer
I got this:
OpenGL vendor string: Intel
OpenGL renderer string: Mesa Intel(R) Graphics (ADL GT2)
I ran this:
nvidia-smi
Got this:
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
Then ran this:
dkms status
Got this:
nvidia/520.61.05, 5.19.0-32-generic, x86_64: installed
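To rule out a kernel mismatch, the dkms line can be compared against the running kernel. This is a minimal sketch, assuming the `module/version, kernel, arch: state` format shown in the output above (other dkms versions may print it differently). `parse_dkms_line` and `built_for_kernel` are hypothetical helper names.

```python
import platform
import re

# Matches lines such as: nvidia/520.61.05, 5.19.0-32-generic, x86_64: installed
DKMS_LINE = re.compile(
    r"^(?P<module>[^/,\s]+)/(?P<version>[^,\s]+),\s*"
    r"(?P<kernel>[^,\s]+),\s*(?P<arch>[^:\s]+):\s*(?P<state>\w+)"
)


def parse_dkms_line(line):
    """Split one `dkms status` line into its fields, or return None if it does not match."""
    match = DKMS_LINE.match(line.strip())
    return match.groupdict() if match else None


def built_for_kernel(line, kernel=None):
    """True if the module is reported as installed for the given (default: running) kernel."""
    kernel = kernel or platform.release()
    entry = parse_dkms_line(line)
    return bool(entry) and entry["state"] == "installed" and entry["kernel"] == kernel


if __name__ == "__main__":
    status = "nvidia/520.61.05, 5.19.0-32-generic, x86_64: installed"
    print(parse_dkms_line(status))
    print("built for running kernel:", built_for_kernel(status))
```

If this reports a match and nvidia-smi still fails, the module is built but probably not being loaded, which points at something other than dkms.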
And ran this:
sudo update-initramfs -u
Got this:
update-initramfs: Generating /boot/initrd.img-5.19.0-32-generic
I: The initramfs will attempt to resume from /dev/nvme0n1p9
I: (UUID=32d51fe8-c309-4296-97a0-65e0aedf6dba)
I: Set the RESUME variable to override this.
Then restarted the laptop and did:
nvidia-smi
And got this again:
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
This is what I got in my NVIDIA Settings GUI:
How do I force the system to use the RTX 3070 Ti?
PS: I have Secure Boot enabled. During the CUDA installation I was asked to set a password to enroll the MOK key in UEFI. After the CUDA installation finished, I waited for a reboot, but it never happened! So I rebooted it myself, but UEFI never asked me to enroll the MOK key. Maybe this is related…
EDIT: I disabled Secure Boot as a temporary solution, and the NVIDIA driver and CUDA started working, but now the Wi-Fi doesn’t work…
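Since toggling Secure Boot changes whether the driver works, it may help to check the Secure Boot state and whether the nvidia kernel module is loaded at the same time. This is a minimal sketch, assuming `mokutil` is installed and that `mokutil --sb-state` prints a line containing "SecureBoot enabled" or "SecureBoot disabled". The function names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def parse_sb_state(text):
    """Interpret `mokutil --sb-state` output: True, False, or None if unclear."""
    lowered = text.lower()
    if "secureboot enabled" in lowered:
        return True
    if "secureboot disabled" in lowered:
        return False
    return None


def secure_boot_enabled():
    """Run mokutil if it is available; return None when the state cannot be read."""
    if shutil.which("mokutil") is None:
        return None
    try:
        result = subprocess.run(
            ["mokutil", "--sb-state"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return parse_sb_state(result.stdout + result.stderr)


def nvidia_module_loaded(proc_modules="/proc/modules"):
    """Check whether a kernel module named exactly `nvidia` is currently loaded."""
    path = Path(proc_modules)
    if not path.exists():
        return False
    for line in path.read_text().splitlines():
        parts = line.split()
        if parts and parts[0] == "nvidia":
            return True
    return False


if __name__ == "__main__":
    print("Secure Boot enabled:", secure_boot_enabled())
    print("nvidia module loaded:", nvidia_module_loaded())
```

Secure Boot enabled together with no loaded nvidia module would be consistent with the MOK key never having been enrolled, though this sketch cannot confirm the cause by itself.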