Daemon nvidia.persistenced doesn't launch with 5.19.0-1-amd64 kernel

Hello,
After updating my kernel to 5.19.0-1-amd64 (Debian), I get this message:
Failed to query NVIDIA devices. Please ensure that the NVIDIA device files (/dev/nvidia*) exist, and that user 114 has read and write permissions for those files.

User 114 is the persistence daemon :
nvpd:x:114:121:NVIDIA Persistence Daemon,:/var/run/nvpd/:/usr/sbin/nologin

There is no file or directory named exactly /dev/nvidia, but these device nodes exist:
find /dev/ -name 'nvidia*'
/dev/nvidia-caps
/dev/nvidia-caps/nvidia-cap2
/dev/nvidia-caps/nvidia-cap1
/dev/nvidia-modeset
/dev/nvidia0
/dev/nvidiactl
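
Since the error message is about read and write permissions for user 114, it may help to look at the owner and mode of each device node rather than only its name. This is a minimal sketch: the helper name list_nvidia_nodes is made up for illustration, and the numeric IDs printed by ls -n are meant to be compared by eye with uid 114 and gid 121 from the passwd line above.

```shell
# list_nvidia_nodes [DIR]: hypothetical helper, not part of any NVIDIA tool.
# Prints mode, numeric owner and numeric group for every nvidia* entry under DIR.
list_nvidia_nodes() {
    dir="${1:-/dev}"
    find "$dir" -name 'nvidia*' -exec ls -ldn {} +
}

list_nvidia_nodes /dev
```

If the nodes are listed with sensible modes, the permission hint in the error message is probably a symptom and not the cause.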

When I boot the 5.18.0-4-amd64 kernel, everything is fine.

NVIDIA driver: 470.141.03.

Can anyone give me a clue, or tell me whether this error will be fixed by a newer driver?

I searched this forum and found this:

(but no solutions)

With regards,
Clement

Please run nvidia-bug-report.sh as root and attach the resulting nvidia-bug-report.log.gz file to your post.

Hello,
Thank you for your answer and your help.
Please find the requested file attached.
With regards,
Clement
nvidia-bug-report.log.gz (70.0 KB)

The nvidia modules are installed but seem to be broken, which I suspect is due to a gcc mismatch. Your 5.19 kernel was compiled with gcc 11, but you have gcc 10 active. Please try installing/activating gcc 11.

I have purged “gcc-10”, and "gcc --version" now prints version 12.2.0.
I ran the script again; the result is attached.
nvidia-bug-report.log.gz (70.7 KB)

I looked at dmesg, and one error occurs before the line with “nvidia.persistenced failed”.
A module seems to be missing: “nvidia-current not found in directory /lib/modules/5.19.0-1-amd64”
dmesg.log (1.2 KB)
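
The dmesg message suggests that no nvidia-current module was built for 5.19.0-1-amd64. One way to confirm this is to search the module tree of each kernel. This is only a sketch: has_nvidia_module is a hypothetical helper, and the nvidia*.ko* pattern is an assumption chosen so that compressed module files would match as well.

```shell
# has_nvidia_module MODROOT KVER: hypothetical helper.
# Succeeds if at least one nvidia*.ko* file exists under MODROOT/KVER.
has_nvidia_module() {
    [ -n "$(find "$1/$2" -name 'nvidia*.ko*' 2>/dev/null | head -n 1)" ]
}

# Compare the working and the failing kernel from this thread.
for kver in 5.18.0-4-amd64 5.19.0-1-amd64; do
    if has_nvidia_module /lib/modules "$kver"; then
        echo "$kver: nvidia module found"
    else
        echo "$kver: no nvidia module"
    fi
done
```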

I guess you need to recompile the modules using dkms now.
Run
dkms status
to get the currently installed version, then
sudo dkms remove --force nvidia/470.141.03
sudo dkms install nvidia/470.141.03
to recompile and
sudo modprobe nvidia
to check if the modules work.

I tried this:

dkms remove --force nvidia/470.141.03 -k '5.19.0-1-amd64'
It returns:
Error! The module/version combo: nvidia-470.141.03
is not located in the DKMS tree

So I searched and tried this:
dkms remove --force nvidia-current/470.141.03 -k '5.19.0-1-amd64'
Error! There is no instance of nvidia-current 470.141.03
for kernel 5.19.0-1-amd64 (x86_64) located in the DKMS tree.

Then I tried:
dkms install nvidia-current/470.141.03 -k '5.19.0-1-amd64'
The build starts but fails with errors (see attachment, line 538):
"gcc-11: error: unrecognized command-line option ‘-mharden-sls=all’"

I searched the internet and found this: linux - gcc: error: unrecognized command line option - Stack Overflow

But installing “gcc-arm-linux-gnueabihf” and “gcc-aarch64-linux-gnu” doesn't seem relevant here.

What do you think?

And by the way, thank you very much for your help :)
make.log (45.9 KB)

You should first run
dkms status
to find the correct driver name and version.

Hey Generix,
the command 'dkms status' prints:
nvidia-current, 470.141.03, 5.18.0-4-amd64, x86_64: installed
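
The status line above already contains the module name and version that dkms remove and dkms install expect, which is why nvidia/470.141.03 was rejected earlier. The following sketch turns such lines into name/version arguments. The helper dkms_ids is hypothetical; it assumes the comma-separated layout shown above, and the branch for lines that already contain name/version (as some newer dkms releases appear to print) is an assumption.

```shell
# dkms_ids: hypothetical helper. Reads `dkms status` lines on stdin and
# prints each distinct "name/version" pair once.
dkms_ids() {
    awk -F', ' 'NF >= 2 {
        if (index($1, "/")) print $1      # assumed newer layout: name/version, kernel, ...
        else print $1 "/" $2              # layout shown in this thread: name, version, kernel, ...
    }' | sort -u
}

if command -v dkms >/dev/null 2>&1; then
    dkms status | dkms_ids
fi
```

For the line above this would give nvidia-current/470.141.03, which is the argument form used in the commands that follow.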

This works fine on the 5.18.0-4-amd64 kernel:
sudo dkms remove --force nvidia-current/470.141.03
sudo dkms install nvidia-current/470.141.03

I also tried 'dkms install nvidia-current/470.141.03' after booting the 5.19.0-1-amd64 kernel, without success.

I guess the 5.19 dependencies are not OK.

The unrecognized ‘-mharden-sls=all’ option means the kernel you’re using was compiled with the latest CPU bug mitigations, so you still need a newer gcc. Currently you’re using gcc 11.2, but you need gcc 11.3 or gcc 12.
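
Before rebuilding with dkms, it can be quicker to ask each installed compiler directly whether it accepts the option from the build error. A minimal sketch, assuming the compilers are installed under the names gcc-11 and gcc-12; cc_accepts_flag is a hypothetical helper, not an existing tool.

```shell
# cc_accepts_flag CC FLAG: hypothetical helper. Succeeds if CC accepts FLAG
# when asked to syntax-check an empty C source.
cc_accepts_flag() {
    "$1" "$2" -fsyntax-only -x c /dev/null >/dev/null 2>&1
}

# Compiler names below are assumptions; adjust them to what is installed.
for cc in gcc-11 gcc-12; do
    if ! command -v "$cc" >/dev/null 2>&1; then
        echo "$cc: not installed"
    elif cc_accepts_flag "$cc" -mharden-sls=all; then
        echo "$cc: accepts -mharden-sls=all"
    else
        echo "$cc: rejects -mharden-sls=all"
    fi
done
```

A compiler that rejects the option here would be expected to fail the module build for this kernel in the same way.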

Hello,
I removed the 5.19.0-1-amd64 kernel and headers and removed gcc-11.
Then I installed gcc-12.
I tried to install the 5.19.0-1-amd64 kernel and headers again.
Unfortunately, linux-headers-5.19.0-1-amd64 (which is needed for the NVIDIA driver) also needs gcc-11 …
I found and installed gcc-11.3 from the bookworm repository, and everything is fine!

Thank you very much Generix :)
