Updating to CentOS 7.7 fails

A “yum update” to take the system from CentOS 7.6 to CentOS 7.7 fails, ending in this:

--> Finished Dependency Resolution
Error: Could not find suitable Nvidia kernel module version for kernel kernel-3.10.0-862.el7.x86_64 and driver 3:nvidia-driver-latest-418.87.00-2.el7.x86_64

Currently installed versions of key packages:
cuda-10.1.243-1.x86_64
dkms-2.7.1-1.el7.noarch
kmod-nvidia-latest-dkms-418.87.00-2.el7.x86_64
nvidia-driver-latest-418.87.00-2.el7.x86_64
kernel-3.10.0-862.el7.x86_64
kernel-3.10.0-957.27.2.el7.x86_64

Currently running kernel: 3.10.0-957.27.2.el7.x86_64

The CentOS 7.7 kernel will be kernel.x86_64 0:3.10.0-1062.1.1.el7
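
Note that the kernel named in the error (3.10.0-862) is installed but is not the running kernel. A minimal sketch for listing such leftover kernels follows; the helper name `other_kernels` is made up for illustration, and it assumes package names in the form printed by `rpm -q kernel`:

```shell
#!/bin/sh
# Sketch: print installed kernel releases other than the running one.
# Reads package names (as printed by `rpm -q kernel`) on stdin.
# `other_kernels` is a hypothetical helper, not part of any package.
other_kernels() {
    running="$1"
    # Strip the "kernel-" prefix, then drop the line equal to the running release.
    sed 's/^kernel-//' | grep -v -x -F "$running"
}

# Only query rpm when it is available on this system.
if command -v rpm >/dev/null 2>&1; then
    rpm -q kernel | other_kernels "$(uname -r)" || true
fi
```

With the package list above, this would print only the 862 kernel, which is the one yum is trying to find a matching Nvidia module for.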

nvidia-smi
Mon Sep 23 16:02:58 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  Off  | 00000000:00:08.0 Off |                    0 |
| N/A   33C    P0    36W / 250W |      0MiB / 32480MiB |      3%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

That is generally expected. The GPU driver is compiled against the specific kernel that you are running. If you update the kernel, by definition you break the GPU driver install.
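
One way to see the kernel dependency is the module's vermagic string, which starts with the kernel release it was built for. This is a rough sketch, assuming the module is named `nvidia`; the helper `vermagic_matches` is invented for the example:

```shell
#!/bin/sh
# Sketch: does a module's vermagic string name the given kernel release?
# A vermagic string looks like "<kernel-release> SMP mod_unload modversions".
# `vermagic_matches` is a hypothetical helper name.
vermagic_matches() {
    vermagic="$1"
    kernel="$2"
    # Keep only the text before the first space and compare it to the release.
    [ "${vermagic%% *}" = "$kernel" ]
}

if command -v modinfo >/dev/null 2>&1; then
    # modinfo fails if no nvidia module exists for the running kernel.
    vm="$(modinfo -F vermagic nvidia 2>/dev/null)" || vm=""
    if vermagic_matches "$vm" "$(uname -r)"; then
        echo "nvidia module was built for the running kernel"
    else
        echo "no nvidia module built for $(uname -r) was found"
    fi
fi
```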

The simplest approach at this point is to reinstall the driver, or reinstall CUDA.

If the GPU driver is registered with dkms, dkms may fix this, but there are possible issues that can trip that up, such as an incompatibility between the driver and the kernel which prevents successful compilation. It might also just require a restart to trigger dkms to rebuild the GPU driver interface.
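
To check whether dkms has actually built the module for a given kernel, something like the following may help. It is only a sketch: it assumes the dkms module is registered under the name `nvidia`, that `dkms status` prints one line per module/kernel pair containing the word "installed", and the helper `nvidia_built_for` is made up here. It only prints a suggestion and does not rebuild anything itself.

```shell
#!/bin/sh
# Sketch: check `dkms status` text for an nvidia build for one kernel.
# Assumes the dkms module name is "nvidia" (an assumption, check `dkms status`).
# `nvidia_built_for` is a hypothetical helper name.
nvidia_built_for() {
    kernel="$1"
    status_text="$2"
    # Keep lines mentioning this kernel, then look for an installed nvidia entry.
    printf '%s\n' "$status_text" | grep -F "$kernel" | grep -q '^nvidia.*installed'
}

if command -v dkms >/dev/null 2>&1; then
    running="$(uname -r)"
    if nvidia_built_for "$running" "$(dkms status 2>/dev/null)"; then
        echo "nvidia module already built for $running"
    else
        # A rebuild needs root, so only suggest the command here.
        echo "nvidia module missing for $running; try: dkms autoinstall -k $running"
    fi
fi
```

If the build fails, the dkms build log for the module should show whether the driver source is incompatible with the new kernel.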

Thanks crovella4

So I got past the error with:

yum remove cuda
yum update # to install Centos 7.7
reboot
yum install nvidia-driver-latest-dkms cuda

I’m having this problem but don’t use CUDA. What repo are you using? Can I install that repo and then do yum install nvidia-driver-latest-dkms?

I installed the cuda repo like this:
yum-config-manager --add-repo http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-rhel7.repo

I guess you can install the repo and then try:

yum install nvidia-driver-latest-dkms

But I’ve never tried this as I need cuda also. It may end up pulling in cuda anyway.
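
To see what would be pulled in without installing anything, yum can be run with --assumeno, which answers "no" at the confirmation prompt. The sketch below is untested against this repo; the helper `cuda_packages` is made up, and it assumes the package rows follow a "Dependencies Resolved" line with the package name in the first column:

```shell
#!/bin/sh
# Sketch: list cuda* packages named in a yum transaction listing on stdin.
# Assumes package rows appear after "Dependencies Resolved" and that the
# package name is the first column. `cuda_packages` is a hypothetical helper.
cuda_packages() {
    awk '/^Dependencies Resolved/ { seen = 1; next }
         seen && $1 ~ /^cuda/     { print $1 }'
}

# yum install needs root even with --assumeno, so skip otherwise.
if command -v yum >/dev/null 2>&1 && [ "$(id -u)" -eq 0 ]; then
    yum install --assumeno nvidia-driver-latest-dkms 2>/dev/null | cuda_packages || true
fi
```

If this prints nothing, the driver package probably does not drag in the cuda packages.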

Also, when I originally did:
yum remove cuda
It also removed nvidia-driver-latest-dkms

I did the install and it seems to have built the driver ok but it won’t load it:

Sep 23 20:38:03 smooge dracut: Omitting driver nvidia_modeset
Sep 23 20:38:03 smooge dracut: Omitting driver nvidia_drm
Sep 23 20:38:03 smooge dracut: Omitting driver nvidia_uvm
Sep 23 20:38:03 smooge dracut: Omitting driver nvidia
Sep 23 20:38:03 smooge dracut: Omitting driver nvidia_modeset
Sep 23 20:38:03 smooge dracut: Omitting driver nvidia_drm
Sep 23 20:38:03 smooge dracut: Omitting driver nvidia_uvm
Sep 23 20:38:03 smooge dracut: Omitting driver nvidia
Sep 23 20:38:03 smooge dracut: Omitting driver nvidia_modeset
Sep 23 20:38:03 smooge dracut: Omitting driver nvidia_drm
Sep 23 20:38:03 smooge dracut: Omitting driver nvidia_uvm
Sep 23 20:38:03 smooge dracut: Omitting driver nvidia

Any ideas?