Unable to install nvidia-driver:open-dkms on Rocky 9: Error: Problems in request: broken groups or modules

Hi, I’m trying to install nvidia-driver on an AWS EC2 instance running Rocky 9. I do this via automation, and it was working fine yesterday, but now it fails when I try to provision a new server.

I tried going through the installation process manually, following the NVIDIA Driver Installation Guide. Everything goes well until section 6.4, Driver Installation:
The command dnf module install nvidia-driver:open-dkms fails with the following error:

cuda-rhel9-x86_64                                                                                                                             15 MB/s | 2.3 MB     00:00    
Extra Packages for Enterprise Linux 9 - x86_64                                                                                                20 MB/s |  23 MB     00:01    
Extra Packages for Enterprise Linux 9 openh264 (From Cisco) - x86_64                                                                         2.4 kB/s | 2.5 kB     00:01    
Rocky Linux 9 - BaseOS                                                                                                                       2.6 MB/s | 3.4 MB     00:01    
Rocky Linux 9 - AppStream                                                                                                                    2.8 MB/s | 9.1 MB     00:03    
Rocky Linux 9 - Extras                                                                                                                        17 kB/s |  16 kB     00:00    
Unable to resolve argument nvidia-driver:open-dkms
No match for package kmod-nvidia-open-dkms
Unable to resolve argument nvidia-driver:open-dkms
No match for package libnvidia-cfg
Unable to resolve argument nvidia-driver:open-dkms
No match for package libnvidia-fbc
Unable to resolve argument nvidia-driver:open-dkms
No match for package libnvidia-ml
Unable to resolve argument nvidia-driver:open-dkms
No match for package nvidia-driver
Unable to resolve argument nvidia-driver:open-dkms
No match for package nvidia-driver-cuda
Unable to resolve argument nvidia-driver:open-dkms
No match for package nvidia-driver-cuda-libs
Unable to resolve argument nvidia-driver:open-dkms
No match for package nvidia-driver-libs
Unable to resolve argument nvidia-driver:open-dkms
No match for package nvidia-kmod-common
Unable to resolve argument nvidia-driver:open-dkms
No match for package nvidia-libXNVCtrl
Unable to resolve argument nvidia-driver:open-dkms
No match for package nvidia-libXNVCtrl-devel
Unable to resolve argument nvidia-driver:open-dkms
No match for package nvidia-modprobe
Unable to resolve argument nvidia-driver:open-dkms
No match for package nvidia-persistenced
Unable to resolve argument nvidia-driver:open-dkms
No match for package nvidia-settings
Unable to resolve argument nvidia-driver:open-dkms
No match for package nvidia-xconfig
Error: Problems in request:
broken groups or modules: nvidia-driver:open-dkms, nvidia-driver:open-dkms, nvidia-driver:open-dkms, nvidia-driver:open-dkms, nvidia-driver:open-dkms, nvidia-driver:open-dkms, nvidia-driver:open-dkms, nvidia-driver:open-dkms, nvidia-driver:open-dkms, nvidia-driver:open-dkms, nvidia-driver:open-dkms, nvidia-driver:open-dkms, nvidia-driver:open-dkms, nvidia-driver:open-dkms, nvidia-driver:open-dkms

I have a feeling that something has changed in the repo, but maybe there is a way to work around this issue?

Cheers, Nick

You are right, a LOT has changed in the last couple of days. My recent experience perhaps won’t help you a lot, other than confirming that.

I use Fedora. To keep it up to date I was running 12.6 update 3 on Fedora 40 (to keep Fedora supported), which was OK for me, although CUDA obviously does not officially support that combination. Getting from F39 to F40 and upgrading CUDA to the next version had been (relatively) straightforward, without having to remove and reinstall anything.

I have been keeping an eye on the current CUDA version because soon I will need to get to Fedora 41. A couple of days ago I noticed that, with no fanfare that I could see, 12.6 update 3 had changed to 12.8.

The sequence has been: fix the repo, upgrade the driver:
sudo dnf module install nvidia-driver:latest-dkms
Then upgrade Fedora, and after that you can upgrade CUDA with another sudo dnf upgrade --refresh if it didn't already happen during the Fedora major version upgrade.

So I pointed the CUDA repo at 41, but when I tried to install the driver, I got the same messages as you. When I did:
sudo dnf upgrade --refresh
I was a bit surprised that it upgraded cuda-toolkit to 12.8 correctly without complaining (I install cuda-toolkit because it lets me handle the driver separately).
Next:
sudo dnf module reset nvidia-driver
sudo dnf module install nvidia-driver:latest-dkms
and the driver install worked. Then I rebooted.
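
For anyone doing this from automation, the refresh, reset and install steps above could be sketched roughly as follows. This is only an illustration, not an official procedure: the STREAM and DRY_RUN variables and the run and reinstall_driver helpers are names made up here, and the -y flags are added on the assumption that the script runs unattended. By default it only prints the commands.

```shell
# Sketch: refresh metadata, reset the nvidia-driver module, then install a stream.
# STREAM and DRY_RUN are hypothetical knobs introduced for this example.
STREAM="${STREAM:-latest-dkms}"   # e.g. open-dkms for the open kernel modules
DRY_RUN="${DRY_RUN:-1}"           # 1 = only print the commands, 0 = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reinstall_driver() {
    run sudo dnf upgrade --refresh -y
    run sudo dnf module reset -y nvidia-driver
    run sudo dnf module install -y "nvidia-driver:${STREAM}"
}

reinstall_driver
```

Set DRY_RUN=0 once the printed commands look right for your system, and reboot afterwards as described above.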

Then to get to F41 I did the usual command-line Fedora major version upgrade, but this time it did not work without adding:
--allowerasing

sudo dnf system-upgrade download --releasever=41 --allowerasing

and everything seems to work without removing anything other than an old kernel. I am afraid I am bewildered (not for the first time) as to what exactly has now changed, which is little help to you.
I think sometimes you have to really want to use Cuda…

The driver instructions have changed substantially. You may find this helpful:

This time I used (card supports open drivers):

sudo dnf install nvidia-open

And it worked.
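
If provisioning is automated, one way to cope with the change might be to check whether the module stream is still offered and fall back to the nvidia-open package otherwise. The sketch below is an assumption-laden illustration: choose_install_cmd is a hypothetical helper, it assumes the dnf module list output starts each row with the module name followed by the stream name, and the nvidia-open fallback only makes sense for cards that support the open kernel modules.

```shell
# Sketch: pick an install command from `dnf module list nvidia-driver` output.
# choose_install_cmd (hypothetical helper) reads the listing on stdin;
# $1 is the wanted stream, e.g. open-dkms.
choose_install_cmd() {
    if grep -Eq "^nvidia-driver[[:space:]]+$1([[:space:]]|\$)"; then
        echo "dnf module install nvidia-driver:$1"
    else
        # Assumption: the GPU supports the open kernel modules.
        echo "dnf install nvidia-open"
    fi
}

# Usage (an empty or failed listing selects the fallback):
# dnf -q module list nvidia-driver 2>/dev/null | choose_install_cmd open-dkms
```

The helper only prints the command it would pick, so you can log the decision before running anything.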

If the ncu or nsys GUI versions complain that OpenGL needs to be > 2.0, then chances are your login has defaulted to Wayland (you can check with the “NVIDIA X Server Settings” GUI or by other means). It never used to use Wayland…
I logged out and chose “Gnome on X”, and OpenGL was back. The warning popups were gone. I have no idea what you would lose without OpenGL, but there we are.
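
One of the "other means" of checking is the XDG_SESSION_TYPE environment variable, which is normally set to wayland or x11 in a graphical login. A small sketch (session_kind is just a helper name invented for this example):

```shell
# Sketch: report whether the current login session is Wayland or X11.
# session_kind takes the session type string, normally $XDG_SESSION_TYPE.
session_kind() {
    case "$1" in
        wayland) echo "Wayland session: log out and pick an X11 session" ;;
        x11)     echo "X11 session" ;;
        *)       echo "unknown session type: ${1:-unset}" ;;
    esac
}

session_kind "${XDG_SESSION_TYPE:-}"
```

Run it from a terminal inside the desktop session; over SSH the variable usually reports tty or is unset.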