Issues with cuda-12.6.0-1.x86_64 from RHEL8 repo

Using the RHEL 8 repo (Index of /compute/cuda/repos/rhel8/x86_64) on a system with no NVIDIA/CUDA software installed, a plain “yum install cuda” fails to resolve dependencies. Installing the slightly older cuda-12.5.1-1.x86_64 works fine. The error we get when installing cuda-12.6.0-1.x86_64 is:

Error:
 Problem: package nvidia-open-3:560.28.03-1.noarch from cuda-rhel8-x86_64 requires nvidia-kmod >= 3:560.28.03, but none of the providers can be installed
  - package nvidia-open-3:560.28.03-1.noarch from cuda-rhel8-x86_64 conflicts with kmod-nvidia-latest-dkms provided by kmod-nvidia-latest-dkms-3:560.28.03-1.el8.x86_64 from cuda-rhel8-x86_64
  - package cuda-12.6.0-1.x86_64 from cuda-rhel8-x86_64 requires nvidia-open >= 560.28.03, but none of the providers can be installed
  - cannot install the best candidate for the job
  - package kmod-nvidia-open-dkms-3:560.28.03-1.el8.x86_64 from cuda-rhel8-x86_64 is filtered out by modular filtering
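
The last line of the error above is probably the most informative one: "filtered out by modular filtering" usually means the package belongs to a module stream that is not the enabled one, so dnf hides it from the solver. A minimal sketch for pulling those package names out of dnf's error text follows; the function name filtered_by_modularity is made up for illustration.

```shell
# Hypothetical helper: read dnf error text on stdin and print every package
# that dnf reports as hidden by modular filtering.
filtered_by_modularity() {
    sed -n 's/^.*package \([^ ]*\) from [^ ]* is filtered out by modular filtering.*$/\1/p'
}

# Example use (needs the NVIDIA repo configured, so it is left commented out):
#   dnf install cuda 2>&1 | filtered_by_modularity
#   dnf module list nvidia-driver   # shows which stream, if any, is enabled
```

If the package that cuda needs shows up in that list, the enabled nvidia-driver stream is the likely place to look.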

Is anyone else experiencing this? We are seeing it on multiple systems at multiple sites, on servers with V100s, A100s, and H100s.

Thanks

For what it’s worth, I see exactly the same thing on Rocky Linux 9.4 (with kernel 5.14.0-427.26.1.el9_4.x86_64, not sure if that’s relevant), using:

dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
dnf module install nvidia-driver:latest-dkms
dnf install cuda

Again using cuda-12.5.1 appears to install OK.

I have also noticed, though, that the nvidia-driver module doesn’t behave as documented with respect to kernel- packages; see “Nvidia-driver installs kernel-{core,devel} even if correct versions installed” [edited to add link]

Same here. I have a similar Rocky 8 setup with cuda already installed. Trying to update, we get:

Error: 
 Problem 1: package nvidia-open-3:560.28.03-1.noarch from cuda-rhel8-x86_64 conflicts with kmod-nvidia-latest-dkms provided by kmod-nvidia-latest-dkms-3:560.28.03-1.el8.x86_64 from @System
  - cannot install the best update candidate for package kmod-nvidia-latest-dkms-3:560.28.03-1.el8.x86_64
  - cannot install the best update candidate for package cuda-drivers-555.42.06-1.x86_64
 Problem 2: package nvidia-open-3:560.28.03-1.noarch from cuda-rhel8-x86_64 requires nvidia-kmod >= 3:560.28.03, but none of the providers can be installed
  - package nvidia-open-3:560.28.03-1.noarch from cuda-rhel8-x86_64 conflicts with kmod-nvidia-latest-dkms provided by kmod-nvidia-latest-dkms-3:560.28.03-1.el8.x86_64 from @System
  - package nvidia-open-3:560.28.03-1.noarch from cuda-rhel8-x86_64 conflicts with kmod-nvidia-latest-dkms provided by kmod-nvidia-latest-dkms-3:560.28.03-1.el8.x86_64 from cuda-rhel8-x86_64
  - package cuda-12.6.0-1.x86_64 from cuda-rhel8-x86_64 requires nvidia-open >= 560.28.03, but none of the providers can be installed
  - cannot install the best update candidate for package cuda-12.5.1-1.x86_64
  - package kmod-nvidia-open-dkms-3:560.28.03-1.el8.x86_64 from cuda-rhel8-x86_64 is filtered out by modular filtering

Yes, I hit this too, on Fedora 39, doing a dnf --refresh upgrade (the normal way to pick up system and software updates, for anyone not familiar) with the latest 12.5 update and its accompanying driver installed. The last good dnf upgrade had upgraded cuda 12.5 correctly. Because I didn’t read the notes properly, I manually installed the 560.28.03 open driver, which wouldn’t have helped as I have a Pascal GPU. Having got out of that, I did a dnf install of the closed-source driver, and the driver, at least, was OK. On running dnf upgrade again, it complained that it had a problem with cuda and the driver, as observed by others. However, I told it to skip these two, and as far as I could tell all the cuda components got updated. I can compile, run nsys (not the current version of compute, as it’s Pascal), nvvp, etc.
So for my use case it seems OK, but dnf needs telling to skip them every time, so it needs care. I don’t have my notes handy for the driver install, but I thought it was worth a reply anyway. Hope it’s of use.
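
Telling dnf to skip the two packages on every upgrade can be scripted. This is only a sketch, assuming the names to skip are cuda and nvidia-open (the two that dnf lists as skipped in the logs in this thread); upgrade_args is a made-up helper that just builds the argument list.

```shell
# Hypothetical helper: build a "dnf upgrade" argument list that excludes the
# given packages. --exclude is a standard dnf option.
upgrade_args() {
    args="upgrade --refresh"
    for pkg in "$@"; do
        args="$args --exclude=$pkg"
    done
    printf '%s\n' "$args"
}

# Print the command instead of running it, so it can be reviewed first.
echo "sudo dnf $(upgrade_args cuda nvidia-open)"
```

The same names could instead go on an excludepkgs= line in /etc/dnf/dnf.conf, at the cost of hiding those packages from every dnf run until the line is removed.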

Sorry, I probably wasn’t very clear. I was using 12.5 update 1 successfully, which dnf had upgraded correctly (with no intervention) from the 12.5 release. The errors appeared when trying to accept the upgrade to 12.6. Apologies.

Attempting an update on a Fedora 39 box:

$ sudo dnf update --refresh
cuda-fedora39-x86_64                                                                                                                                                  116 kB/s | 3.5 kB     00:00
Fedora 39 - x86_64                                                                                                                                                     71 kB/s |  29 kB     00:00
Fedora 39 openh264 (From Cisco) - x86_64                                                                                                                              3.8 kB/s | 989  B     00:00
Fedora 39 - x86_64 - Updates                                                                                                                                           88 kB/s |  28 kB     00:00
Dependencies resolved.

 Problem 1: package nvidia-open-3:560.28.03-1.noarch from cuda-fedora39-x86_64 conflicts with kmod-nvidia-latest-dkms provided by kmod-nvidia-latest-dkms-3:560.28.03-1.fc39.x86_64 from @System
  - cannot install the best update candidate for package kmod-nvidia-latest-dkms-3:560.28.03-1.fc39.x86_64
  - cannot install the best update candidate for package cuda-drivers-3:560.28.03-1.x86_64
 Problem 2: package nvidia-open-3:560.28.03-1.noarch from cuda-fedora39-x86_64 requires nvidia-kmod >= 3:560.28.03, but none of the providers can be installed
  - package nvidia-open-3:560.28.03-1.noarch from cuda-fedora39-x86_64 conflicts with kmod-nvidia-latest-dkms provided by kmod-nvidia-latest-dkms-3:560.28.03-1.fc39.x86_64 from @System
  - package nvidia-open-3:560.28.03-1.noarch from cuda-fedora39-x86_64 conflicts with kmod-nvidia-latest-dkms provided by kmod-nvidia-latest-dkms-3:560.28.03-1.fc39.x86_64 from cuda-fedora39-x86_64
  - package cuda-12.6.0-1.x86_64 from cuda-fedora39-x86_64 requires nvidia-open >= 560.28.03, but none of the providers can be installed
  - cannot install the best update candidate for package cuda-12.5.1-1.x86_64
  - package kmod-nvidia-open-dkms-3:560.28.03-1.fc39.x86_64 from cuda-fedora39-x86_64 is filtered out by modular filtering
======================================================================================================================================================================================================
 Package                                       Architecture                             Version                                          Repository                                              Size
======================================================================================================================================================================================================
Skipping packages with conflicts:
(add '--best --allowerasing' to command line to force their upgrade):
 nvidia-open                                   noarch                                   3:560.28.03-1                                    cuda-fedora39-x86_64                                   7.9 k
Skipping packages with broken dependencies:
 cuda                                          x86_64                                   12.6.0-1                                         cuda-fedora39-x86_64                                   7.3 k

Transaction Summary
======================================================================================================================================================================================================
Skip  2 Packages

Attempting an update on Rocky Linux 8:

$ sudo dnf update --refresh
Rocky Linux 8 - AppStream                                                                                                                                      7.3 kB/s | 4.8 kB     00:00
Rocky Linux 8 - BaseOS                                                                                                                                         7.1 kB/s | 4.3 kB     00:00
Rocky Linux 8 - Extras                                                                                                                                         4.1 kB/s | 3.1 kB     00:00
Rocky Linux 8 - PowerTools                                                                                                                                     7.7 kB/s | 4.8 kB     00:00
cuda-rhel8-x86_64                                                                                                                                               13 kB/s | 3.5 kB     00:00
Docker CE Stable - x86_64                                                                                                                                       11 kB/s | 3.5 kB     00:00
Extra Packages for Enterprise Linux 8 - x86_64                                                                                                                  45 kB/s |  33 kB     00:00
NVIDIA HPC SDK                                                                                                                                                  11 kB/s | 3.0 kB     00:00
libnvidia-container                                                                                                                                            1.4 kB/s | 833  B     00:00
RPM Fusion for EL 8 - Free - Updates                                                                                                                           6.8 kB/s | 3.7 kB     00:00
RPM Fusion for EL 8 - Nonfree - Updates                                                                                                                        5.4 kB/s | 3.7 kB     00:00
runner_gitlab-runner                                                                                                                                           1.2 kB/s | 1.0 kB     00:00
runner_gitlab-runner-source                                                                                                                                    1.0 kB/s | 951  B     00:00
packages.microsoft.com                                                                                                                                         672  B/s | 481  B     00:00
Error:
 Problem 1: package nvidia-open-3:560.28.03-1.noarch from cuda-rhel8-x86_64 conflicts with kmod-nvidia-latest-dkms provided by kmod-nvidia-latest-dkms-3:560.28.03-1.el8.x86_64 from @System
  - cannot install the best update candidate for package kmod-nvidia-latest-dkms-3:560.28.03-1.el8.x86_64
  - cannot install the best update candidate for package cuda-drivers-3:560.28.03-1.x86_64
 Problem 2: package nvidia-open-3:560.28.03-1.noarch from cuda-rhel8-x86_64 requires nvidia-kmod >= 3:560.28.03, but none of the providers can be installed
  - package nvidia-open-3:560.28.03-1.noarch from cuda-rhel8-x86_64 conflicts with kmod-nvidia-latest-dkms provided by kmod-nvidia-latest-dkms-3:560.28.03-1.el8.x86_64 from @System
  - package nvidia-open-3:560.28.03-1.noarch from cuda-rhel8-x86_64 conflicts with kmod-nvidia-latest-dkms provided by kmod-nvidia-latest-dkms-3:560.28.03-1.el8.x86_64 from cuda-rhel8-x86_64
  - package cuda-12.6.0-1.x86_64 from cuda-rhel8-x86_64 requires nvidia-open >= 560.28.03, but none of the providers can be installed
  - cannot install the best update candidate for package cuda-12.5.1-1.x86_64
  - package kmod-nvidia-open-dkms-3:560.28.03-1.el8.x86_64 from cuda-rhel8-x86_64 is filtered out by modular filtering
(try to add '--allowerasing' to command line to replace conflicting packages or '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)

It appears that some dependencies and packages were changed as of this morning. If you run yum/dnf install cuda on Alma 8, it now installs cuda 12.6 successfully. You have to make sure you have removed the kmod-nvidia-latest-dkms package first, as it appears that NVIDIA has replaced it with the kmod-nvidia-open-dkms package.

dnf remove kmod-nvidia-latest-dkms
dnf install cuda

Also worth noting: the open drivers don’t support Volta or older cards. We currently support several nodes that have V100s in them. We are looking for a workaround, or an alternative to my suggestion above that uses the proprietary drivers, or else requesting support for Volta in the open-source driver.
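
Whether a given card can use the open kernel modules can be checked from its compute capability. The following is a sketch, assuming the cut-off is Turing (compute capability 7.5) and newer, which is consistent with Volta (7.0) and older being unsupported; supports_open_modules is a made-up name.

```shell
# Hypothetical helper: succeed if a compute capability such as "8.0" is at
# least 7.5 (Turing), the assumed minimum for the open kernel modules.
supports_open_modules() {
    major=${1%%.*}
    minor=${1#*.}
    [ "$major" -gt 7 ] || { [ "$major" -eq 7 ] && [ "$minor" -ge 5 ]; }
}

for cap in 6.1 7.0 8.0 9.0; do   # a Pascal card, V100, A100, H100
    if supports_open_modules "$cap"; then
        echo "$cap: open modules should work"
    else
        echo "$cap: needs the proprietary driver"
    fi
done

# On a machine that already has a working driver, the value could come from:
#   nvidia-smi --query-gpu=compute_cap --format=csv,noheader
```

The nvidia-smi query only works once some driver is loaded, so on a fresh node the capability has to be looked up from the card model instead.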

Update: it seems that installing cuda-toolkit and nvidia-driver-cuda (where it installs the proprietary drivers instead of the open ones) works for our application.

Hi bispcy,

Could you give details on how you installed the V100 driver? Doing a:

dnf install cuda-toolkit nvidia-driver-cuda

still installs the nvidia-open package.
Cheers.

I figured it out.
First, go to the download page and select the V100 driver.
Install the driver; in my case:

dnf install ./nvidia-driver-local-repo-rhel8-535.183.06-1.0-1.x86_64.rpm

Then, remove the old cuda if you have it installed and reset the repo module streams.

dnf remove cuda-toolkit nvidia-driver-cuda
dnf module reset nvidia-driver

Then install dkms and cuda:

dnf module install nvidia-driver:latest-dkms
dnf install cuda-toolkit nvidia-driver-cuda
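
Since the same install command pulled in nvidia-open for one poster, it may be worth confirming which driver flavour actually ended up loaded after these steps. This is a sketch, assuming the open kernel modules report a "Dual MIT/GPL" license in modinfo and the proprietary module reports "NVIDIA"; driver_flavour is a made-up helper.

```shell
# Hypothetical helper: map the license string of the nvidia kernel module
# to the driver flavour it is assumed to indicate.
driver_flavour() {
    case "$1" in
        "Dual MIT/GPL") echo open ;;
        NVIDIA)         echo proprietary ;;
        *)              echo unknown ;;
    esac
}

# On the target machine (left commented out, needs the module installed):
#   driver_flavour "$(modinfo -F license nvidia)"
#   rpm -qa 'kmod-nvidia*' 'nvidia-open*'   # which kmod packages are present
```

On a V100 node the expected answer would be "proprietary"; anything else suggests the open packages were pulled in again.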

When I ran into this, I did the following.

Because I have a Pascal card and had wrongly added the open version, I had to use --allowerasing:

sudo dnf module install nvidia-driver:latest-dkms --allowerasing
result:

Installed:
kmod-nvidia-latest-dkms-3:560.28.03-1.fc39.x86_64
Removed:
kmod-nvidia-open-dkms-3:560.28.03-1.fc39.x86_64
Complete!

I need to add this modeset argument on every driver change, as otherwise I get the black screen problem on reboot:

sudo grubby --update-kernel=ALL --args="nvidia-drm.modeset=1"

reboot
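
After the reboot, whether the argument actually reached the kernel can be checked against /proc/cmdline. A small sketch follows; has_modeset is a made-up helper that takes the kernel command line as a string.

```shell
# Hypothetical helper: succeed if the given kernel command line contains
# nvidia-drm.modeset=1 as a whole, space-separated argument.
has_modeset() {
    case " $1 " in
        *" nvidia-drm.modeset=1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# On the rebooted machine (left commented out):
#   has_modeset "$(cat /proc/cmdline)" && echo "modeset is on"
#   sudo grubby --info=ALL | grep args   # what each boot entry will pass
```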

And that left me with a working cuda 12.6 as per my post of Aug 15, but with these complaints every time I run dnf upgrade:
Dependencies resolved.

 Problem 1: package nvidia-open-3:560.28.03-1.noarch from cuda-fedora39-x86_64 conflicts with kmod-nvidia-latest-dkms provided by kmod-nvidia-latest-dkms-3:560.28.03-1.fc39.x86_64 from @System
  - cannot install the best update candidate for package kmod-nvidia-latest-dkms-3:560.28.03-1.fc39.x86_64
  - cannot install the best update candidate for package cuda-drivers-3:560.28.03-1.x86_64
 Problem 2: package nvidia-open-3:560.28.03-1.noarch from cuda-fedora39-x86_64 requires nvidia-kmod >= 3:560.28.03, but none of the providers can be installed
  - package nvidia-open-3:560.28.03-1.noarch from cuda-fedora39-x86_64 conflicts with kmod-nvidia-latest-dkms provided by kmod-nvidia-latest-dkms-3:560.28.03-1.fc39.x86_64 from @System
  - package nvidia-open-3:560.28.03-1.noarch from cuda-fedora39-x86_64 conflicts with kmod-nvidia-latest-dkms provided by kmod-nvidia-latest-dkms-3:560.28.03-1.fc39.x86_64 from cuda-fedora39-x86_64
  - package cuda-12.6.0-1.x86_64 from cuda-fedora39-x86_64 requires nvidia-open >= 560.28.03, but none of the providers can be installed
  - cannot install the best update candidate for package cuda-12.5.1-1.x86_64
  - package kmod-nvidia-open-dkms-3:560.28.03-1.fc39.x86_64 from cuda-fedora39-x86_64 is filtered out by modular filtering
================================================================================
 Package      Arch    Version        Repository            Size
================================================================================
Upgrading:
 Box2D        x86_64  2.4.2-1.fc39   updates               108 k
Skipping packages with conflicts:
(add '--best --allowerasing' to command line to force their upgrade):
 nvidia-open  noarch  3:560.28.03-1  cuda-fedora39-x86_64  7.9 k
Skipping packages with broken dependencies:
 cuda         x86_64  12.6.0-1       cuda-fedora39-x86_64  7.3 k

When the driver was upgraded to 560.35.03, the complaints went down to:
 Problem: package cuda-12.6.0-1.x86_64 from cuda-fedora39-x86_64 requires nvidia-open >= 560.28.03, but none of the providers can be installed
  - cannot install the best update candidate for package cuda-12.5.1-1.x86_64
  - package nvidia-open-3:560.28.03-1.noarch from cuda-fedora39-x86_64 is filtered out by modular filtering
  - package nvidia-open-3:560.35.03-1.noarch from cuda-fedora39-x86_64 is filtered out by modular filtering
================================================================================
 Package  Architecture  Version   Repository            Size
================================================================================
Skipping packages with broken dependencies:
 cuda     x86_64        12.6.0-1  cuda-fedora39-x86_64  7.3 k

However, I was not entirely comfortable with this, so today I did:
dnf remove cuda
# which removed everything but the drivers
then:
dnf install cuda-toolkit

All complaints are gone on dnf upgrade, and cuda 12.6 works as before.
I will sort out drivers for the GPU myself in future.
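
To confirm that the toolkit on its own is at 12.6 after a switch like this, the release can be read from nvcc --version. This is a sketch, assuming the usual "release X.Y," wording in that output; cuda_release is a made-up helper, and the sample line is illustrative rather than captured from this machine.

```shell
# Hypothetical helper: read "nvcc --version" output on stdin and print the
# toolkit release number (for example 12.6).
cuda_release() {
    sed -n 's/^.*release \([0-9][0-9]*\.[0-9][0-9]*\),.*$/\1/p'
}

# Illustrative sample of the relevant line; the build number is made up.
sample='Cuda compilation tools, release 12.6, V12.6.20'
printf '%s\n' "$sample" | cuda_release

# On the target machine (nvcc is usually under /usr/local/cuda/bin):
#   nvcc --version | cuda_release
```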

Hope this helps

Given the comments above, I tried unpinning from cuda 12.5 and installing the latest cuda package, but still hit:

    openstack.openhpc-cuda: Depsolve Error occurred: 
    openstack.openhpc-cuda:  Problem: package cuda-12.6.1-1.x86_64 from cuda-rhel9-x86_64 requires nvidia-open >= 560.35.03, but none of the providers can be installed
    openstack.openhpc-cuda:   - cannot install the best candidate for the job
    openstack.openhpc-cuda:   - package nvidia-open-3:560.28.03-1.noarch from cuda-rhel9-x86_64 is filtered out by modular filtering
    openstack.openhpc-cuda:   - package nvidia-open-3:560.35.03-1.noarch from cuda-rhel9-x86_64 is filtered out by modular filtering

This is on Rocky Linux 9.