Drivers not loading after CUDA Toolkit installation on RHEL 7.9

I installed the CUDA Toolkit using the package manager installation method. The install itself seemed to go fine, but during the post-installation steps the NVIDIA persistence daemon fails to start. I also checked functionality with nvidia-smi, and it fails because it cannot communicate with the NVIDIA driver.

All system requirements appear to be met for a successful install. The system gcc is version 4.8.5, but the Red Hat Dev Tools are also installed, which include gcc 7.3.1.

Can anyone help me get this running on my system? Thank you.

Please run nvidia-bug-report.sh as root and attach the resulting nvidia-bug-report.log.gz file to your post.

Here is the bug report file.
nvidia-bug-report.log.gz (75.2 KB)

There doesn’t seem to be any driver installed, and nouveau is not blacklisted either. How exactly did you install CUDA and the NVIDIA driver? Please post the output of
yum list installed |grep nvidia

Interesting… The package manager installation instructions didn’t say to blacklist nouveau. I can blacklist it, but lsmod | grep nouveau doesn’t return anything.

I installed with the following commands:

sudo yum install nvidia-driver-latest-dkms
sudo yum install cuda
sudo yum install cuda-drivers

I downloaded and installed different drivers to the ones that came with the toolkit rpm to see if that helped. It didn’t make a difference.

See attached image with output of yum list installed | grep nvidia.

Please post the output of
dkms status

$ dkms status
nvidia/510.47.03: added

Please run
sudo dkms install nvidia/510.47.03
Post any errors it gives and attach the make log it refers to.

When running the above, I noticed a mismatch between the installed kernel and the kernel headers, so I upgraded the kernel to match the headers and rebooted. The dkms status output then changed to the following:

$ dkms status
nvidia/510.47.03, 3.10.0-1160.59.1.el7.x86_64, x86_64: installed

$ sudo dkms install nvidia/510.47.03
Module nvidia/510.47.03 already installed on kernel 3.10.0-1160.59.1.el7.x86_64 (x86_64).

nvidia-smi is still unable to communicate with the driver. A fresh nvidia-bug-report.log.gz is attached.

nvidia-bug-report.log.gz (79.1 KB)
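
For anyone who hits the same kernel/headers mismatch: dkms can only build the module for a kernel whose headers are installed, so it helps to compare the running kernel release against the installed kernel-devel packages. Below is a minimal sketch of that check. The helper name headers_match is made up for illustration, and it assumes the headers come from the kernel-devel RPM, as is usual on RHEL 7.

```shell
#!/bin/sh
# Sketch only: headers_match is a hypothetical helper, not part of dkms or yum.
# $1 = kernel release to check (e.g. the output of `uname -r`)
# $2 = newline-separated list of installed kernel-devel versions
headers_match() {
    running=$1
    installed=$2
    # -F: fixed string, -x: whole line must match, -q: quiet
    if printf '%s\n' "$installed" | grep -Fxq -- "$running"; then
        echo "match"
    else
        echo "mismatch"
    fi
}

# Typical use on the affected machine (assumes an RPM-based system):
#   headers_match "$(uname -r)" \
#       "$(rpm -q --qf '%{VERSION}-%{RELEASE}.%{ARCH}\n' kernel-devel)"
```

If this reports a mismatch, either install the kernel-devel package for the running kernel or, as was done here, update the kernel to match the headers and reboot.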

The driver is now available but doesn’t load. Please run
sudo modprobe nvidia
and post any errors.

$ sudo modprobe nvidia
modprobe: ERROR: could not insert 'nvidia': Required key not available

You have Secure Boot enabled, so the kernel refuses to load the unsigned module. Please disable it in the BIOS.
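
A rough way to confirm this diagnosis from the command line is to combine the modprobe error with the Secure Boot state reported by mokutil --sb-state. The sketch below assumes mokutil is installed and prints "SecureBoot enabled" when Secure Boot is on. The function diagnose_module_load and its result strings are invented for illustration.

```shell
#!/bin/sh
# Sketch only: diagnose_module_load is a hypothetical helper.
# $1 = error text from `modprobe nvidia` (empty if it succeeded)
# $2 = output of `mokutil --sb-state`
diagnose_module_load() {
    modprobe_err=$1
    sb_state=$2
    case $modprobe_err in
        "")
            echo "no modprobe error" ;;
        *"Required key not available"*)
            # The kernel rejected the module because it could not verify a signature.
            case $sb_state in
                *"SecureBoot enabled"*)
                    echo "module rejected: Secure Boot is enabled" ;;
                *)
                    echo "module rejected: Secure Boot state unclear" ;;
            esac ;;
        *)
            echo "other modprobe error" ;;
    esac
}

# Typical use on the affected machine:
#   diagnose_module_load "$(sudo modprobe nvidia 2>&1)" "$(mokutil --sb-state 2>&1)"
```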

Disabling secure boot finished the job. nvidia-smi now works! Thanks for the help!

Question: Does the driver need to be re-installed/rebuilt when the kernel is upgraded? If so, how do you go about doing that?

If using dkms, the driver modules will be built automatically, as long as you have the corresponding kernel headers installed.
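
After a kernel upgrade you can verify that the automatic build actually happened by checking dkms status for the new kernel. Here is a minimal sketch that parses the status format shown earlier in this thread (other dkms versions may print a different format). The function name dkms_installed_for is an assumption for illustration.

```shell
#!/bin/sh
# Sketch only: dkms_installed_for is a hypothetical helper.
# Assumes `dkms status` lines look like the ones in this thread:
#   nvidia/510.47.03, 3.10.0-1160.59.1.el7.x86_64, x86_64: installed
# $1 = output of `dkms status`
# $2 = module/version, e.g. nvidia/510.47.03
# $3 = kernel release, e.g. the output of `uname -r`
# Returns 0 if the module is reported as installed for that kernel.
dkms_installed_for() {
    status=$1
    module=$2
    kernel=$3
    printf '%s\n' "$status" \
        | grep -F -- "$module, $kernel," \
        | grep -q ': installed'
}

# Typical use after rebooting into the new kernel:
#   if dkms_installed_for "$(dkms status)" nvidia/510.47.03 "$(uname -r)"; then
#       echo "module is built for this kernel"
#   else
#       echo "not built; try: sudo dkms autoinstall"
#   fi
```

If the module is missing for the new kernel, the first thing to check is that the matching kernel headers are installed, then run sudo dkms autoinstall to build it by hand.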