CENTOS 8 Stream: Failed to start NVIDIA persistence Daemon

Hello Forum,

I have had the NVIDIA driver running on my CentOS 8 Stream machine for about 2.5 years without any problems, and so far every automatic kernel update has worked fine with the NVIDIA driver.
I installed the driver according to this document 2.5 years ago:

But since yesterday, after the latest automatic NVIDIA driver update (started via “dnf update”), I get the error “Failed to start NVIDIA persistence Daemon” at startup (and only one monitor works now, at 600x800 resolution…).

When I type “nvidia-smi” I get: “NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.”

I have attached the bug report here as well:
nvidia-bug-report.sh (38.5 KB)

Kernel: 4.18.0-500.el8.x86_64
NVIDIA-cuda: 3:535.86.10-1.el8

I am a bit out of ideas now, and would appreciate any help I can get.

Thanks in advance.
Greetings
Jochen

It seems you uploaded the nvidia-bug-report.sh script itself, not the log file it produces.

Sorry, here is the actual bug report.
nvidia-bug-report.log.gz (136.1 KB)

Any support welcome :-)

Thx in advance.
Greetings
Jochen

In file included from /var/lib/dkms/nvidia/535.86.10/build/nvidia-uvm/uvm_common.h:43,
from /var/lib/dkms/nvidia/535.86.10/build/nvidia-uvm/uvm_migrate.c:24:
/var/lib/dkms/nvidia/535.86.10/build/nvidia-uvm/uvm_linux.h:150:32: error: expected identifier before numeric constant
#define MPOL_PREFERRED_MANY 5
^
./include/uapi/linux/mempolicy.h:25:2: note: in expansion of macro ‘MPOL_PREFERRED_MANY’
MPOL_PREFERRED_MANY,
^~~~~~~~~~~~~~~~~~~
make[2]: *** [scripts/Makefile.build:317: /var/lib/dkms/nvidia/535.86.10/build/nvidia-uvm/uvm_migrate.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [Makefile:1619: module/var/lib/dkms/nvidia/535.86.10/build] Error 2
make[1]: Leaving directory '/usr/src/kernels/4.18.0-500.el8.x86_64'
make: *** [Makefile:82: modules] Error 2

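The driver modules failed to build. That is why the persistence daemon cannot start: no driver is installed.
The log shows the driver's own #define of MPOL_PREFERRED_MANY in uvm_linux.h colliding with the identifier of the same name that the kernel's mempolicy.h now declares.
In that case I think you should talk to CentOS and include the make log from: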
/var/lib/dkms/nvidia/535.86.10/build/make.log

Proposed workaround:
Downgrade to an older kernel version.
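
To check on your own machine whether the build fails in the same place, a small helper along these lines may be useful. This is only a sketch: `first_build_errors` is a made-up name, and it assumes a grep that supports `-m` (GNU grep does). It matches "error:" and, on German-localized systems, "Fehler:".

```shell
# Hypothetical helper: print the first few compiler error lines
# (with their line numbers) found in a DKMS build log.
first_build_errors() {
    grep -n -E -m 5 '(error|Fehler):' "$1"
}

# Usage on the affected machine (path taken from the post above):
#   first_build_errors /var/lib/dkms/nvidia/535.86.10/build/make.log
#   dkms status    # lists which modules are built for which kernels
```

If the first hit points at uvm_linux.h and MPOL_PREFERRED_MANY, you are most likely looking at the same conflict as in this thread.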

Hmm, okay. Thank you for the support :-)

Nevertheless, I wanted to upgrade to CentOS 9 Stream anyway → I think this is now a good reason to do it^^.

And I am not sure if I will install the NVIDIA driver again or just stick with what ships with CentOS…

Greetings
Jochen

I don’t know CentOS, so I don’t know what they ship.
But of course you should install a driver if you have an NVIDIA GPU.

A quick web search brought up this easy step by step guide.
It looks like the way to go…

Hi all,
I was able to resolve the issue using the following method.

  1. Download the Linux driver from the NVIDIA official website.
    NVIDIA-Linux-x86_64-*.run
    https://www.nvidia.com/Download/index.aspx
    The one I downloaded is as follows:
    NVIDIA-Linux-x86_64-535.104.05.run

  2. Extract the NVIDIA driver installer.
    sh ./NVIDIA-Linux-x86_64-*.run -x

  3. Move to the extracted directory.
    cd NVIDIA-Linux-x86_64-*

  4. Open the kernel/nvidia-uvm/uvm_linux.h file (relative to the extracted directory).
    vi kernel/nvidia-uvm/uvm_linux.h

  5. Change the MPOL_PREFERRED_MANY on line 150 to NVIDIA_MPOL_PREFERRED_MANY or another name that does not conflict with other definitions.
    #define NVIDIA_MPOL_PREFERRED_MANY 5

  6. Save the changes and close the file.

  7. Stop the graphical desktop environment.
    sudo systemctl isolate multi-user.target

  8. Install the modified driver.
    sudo ./nvidia-installer
    Follow the on-screen instructions to proceed with the installation.

  9. Once the installation is complete, reboot the system.
    sudo reboot

  10. Verify that the NVIDIA driver is installed correctly.
    nvidia-smi
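
If you prefer not to edit the file by hand, steps 4 to 6 could probably be done with sed instead. This is a rough sketch, not part of what I actually ran: `rename_mpol_define` is a made-up helper name, it assumes GNU sed, and it only touches the `#define` line itself, leaving any other use of the name alone. A backup is kept next to the file with a .bak suffix.

```shell
# Hypothetical helper: rename the driver's own MPOL_PREFERRED_MANY define
# so it no longer collides with the kernel's identifier of the same name.
# Only a "#define MPOL_PREFERRED_MANY <value>" line is rewritten; the
# original file is saved as <file>.bak.
rename_mpol_define() {
    sed -E -i.bak \
        's/^([[:space:]]*#[[:space:]]*define[[:space:]]+)MPOL_PREFERRED_MANY([[:space:]])/\1NVIDIA_MPOL_PREFERRED_MANY\2/' \
        "$1"
}

# Usage, from inside the extracted installer directory:
#   rename_mpol_define kernel/nvidia-uvm/uvm_linux.h
#   grep -n 'MPOL_PREFERRED_MANY' kernel/nvidia-uvm/uvm_linux.h   # check the result
```

Running it a second time changes nothing, because the renamed line no longer matches the pattern.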

This worked for me.

CentOS Stream 8, nvidia-535.104.12, kernel 4.18.0-514.el8.x86_64

In my case the source tree was:

/usr/src/nvidia-535.104.12/nvidia-uvm/uvm_linux.h

Thank you kanemitsu!

Likewise working on CentOS Stream 8.

Edited /usr/src/nvidia-535.104.12/nvidia-uvm/uvm_linux.h

$ sudo dkms autoinstall

Thanks all!
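
For anyone going the DKMS route, one way to confirm that the rebuild worked before rebooting might look like this. It is only a sketch: `nvidia_built_for` is a made-up name, and it assumes that `dkms status` prints one line per module starting with "nvidia" and containing ": installed" once the module is in place, which can differ between DKMS versions.

```shell
# Hypothetical helper: reads `dkms status` output on stdin and succeeds
# if an nvidia module is reported as installed for the kernel given as $1.
nvidia_built_for() {
    grep '^nvidia' | grep -F -- "$1" | grep -q ': installed'
}

# On the affected machine, after editing uvm_linux.h under /usr/src:
#   sudo dkms autoinstall
#   dkms status | nvidia_built_for "$(uname -r)" && echo "nvidia module installed"
```

If the check fails, the make.log under /var/lib/dkms/nvidia/ for your driver version should say why.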