Could not find cuda drivers on your machine, GPU will not be used

After a lot of work I managed to get TensorFlow to find TensorRT.
I have:
RTX 4090 PC on Ubuntu 22.04
CUDA 11.8
cuDNN 8.9.7
TensorFlow 2.14

nvidia-smi does not work. It did at first, but I seem to have broken it after uninstalling and reinstalling CUDA and cuDNN a couple of times while trying to get things working. How can I fix this?

NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

TensorFlow also complains, of course, that there are no drivers.
Running this:
find /usr/lib/modules -name nvidia.ko -exec modinfo {} \;

Results in:

filename: /usr/lib/modules/6.5.0-21-generic/updates/dkms/nvidia.ko
alias: char-major-195-*
version: 550.54.14
supported: external
license: NVIDIA
firmware: nvidia/550.54.14/gsp_tu10x.bin
firmware: nvidia/550.54.14/gsp_ga10x.bin
srcversion: 1BB26F5DEAC3FABF74093C6
alias: pci:v000010DEdsvsdbc06sc80i00
alias: pci:v000010DEdsvsdbc03sc02i00
alias: pci:v000010DEdsvsdbc03sc00i00
depends: drm
retpoline: Y
name: nvidia
vermagic: 6.5.0-21-generic SMP preempt mod_unload modversions
sig_id: PKCS#7
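
One thing worth checking in output like this is the vermagic line: a kernel module is only loaded by the kernel release it was built for, so if `uname -r` reports something other than 6.5.0-21-generic, the driver could fail to load and nvidia-smi would report exactly this kind of error. Below is a minimal shell sketch of that comparison. The function names vermagic_kernel and module_matches_kernel are hypothetical helpers, not part of any NVIDIA tool, and whether a kernel mismatch was the actual cause here is an assumption.

```shell
# Print the kernel release a module was built for, read from `modinfo`
# output on stdin (first vermagic line only).
vermagic_kernel() {
    awk '$1 == "vermagic:" { print $2; exit }'
}

# Succeed if the module described on stdin was built for the kernel
# release given as $1 (for example the output of `uname -r`).
module_matches_kernel() {
    built_for=$(vermagic_kernel)
    [ -n "$built_for" ] && [ "$built_for" = "$1" ]
}
```

It could be used as `modinfo nvidia | module_matches_kernel "$(uname -r)" && echo "module matches running kernel"`. If several nvidia.ko files are found, only the first vermagic line is compared.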

All of a sudden I got everything working.
What I did:

I followed Mart's "installer mess" suggestion:

Mart

Jan 17 '22

Oh goodness! You made an installer mess!
First: do not use the .run file installer unless there is a good reason to. It even advises you not to use it before installing!
Second: do not mix distro and .run file installations. Always remove one of them before using the other!
Third: if you do use the .run file installer, make sure not to run it while X is still running!

OK, let's try to clean up…

  • run sudo apt purge 'libnvidia*' 'nvidia*'
  • start the .run file installer again with the --uninstall parameter.
  • look for files blacklisting the nvidia driver: grep -r -e "blacklist.*nvidia$" /etc/modprobe.d /lib/modprobe.d - and delete them if found (sidenote: nvidiafb needs to be blacklisted).
  • run sudo apt install nvidia-driver-470 (or -495 as you wish) - watch the output closely, to look for errors.
  • check dkms for success: dkms status | grep nvidia - should say “installed”.
  • run sudo prime-select nvidia or sudo prime-select on-demand (if you only want to render certain applications on the NVIDIA GPU; see Chapter 34. PRIME Render Offload)
  • reboot.
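
The two verification steps in the list above (the blacklist search and the dkms check) can be sketched as small shell helpers. This is only an illustration under assumptions: the function names are hypothetical, the grep pattern is copied from the quoted post, and the helpers only report what they find and do not delete or install anything.

```shell
# Succeed if `dkms status` output on stdin contains an nvidia line
# that is marked "installed".
nvidia_dkms_installed() {
    grep nvidia | grep -q installed
}

# Print the files under the given directories that blacklist the nvidia
# module itself. Lines blacklisting nvidiafb do not match this pattern.
list_nvidia_blacklists() {
    grep -r -l -e 'blacklist.*nvidia$' "$@" 2>/dev/null
}
```

For example: `dkms status | nvidia_dkms_installed || echo "nvidia module not built"` and `list_nvidia_blacklists /etc/modprobe.d /lib/modprobe.d`.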
