Hi all, first post here, though I’ve been using CUDA for many years. We’re switching all our machines over to Rocky Linux and it has been going fine, but we’re trying to use a Quadro P5000 card with the 12.2 driver. The driver installs, but nvidia-smi -l reports no devices found.
lspci | grep VID
09:00.0 VGA compatible controller: NVIDIA Corporation GP104GL [Quadro P5000] (rev a1)
09:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)
Any ideas? I tried the “fix_gpu_pass” script:
echo 1 > /sys/bus/pci/devices/0000:09:00.0/remove
echo 1 > /sys/bus/pci/rescan
That didn’t help.
nvidia-bug-report.log (1.2 MB)
Hello @joseph.obernberger1 and welcome to the NVIDIA developer forums!
Two things in the logs stand out:
[ 303.420848] nvidia: module verification failed: signature and/or required key missing - tainting kernel
This message is usually harmless and can most often be ignored. If you have Secure Boot enabled, though, it might be worth disabling it, or, if you have a Machine Owner Key (MOK) enrolled, using it to sign the NVIDIA kernel modules.
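As a rough sketch of how to check this, the snippet below reports the Secure Boot state and the signer of the nvidia module. It assumes that mokutil and modinfo are installed and that `mokutil --sb-state` prints a line containing "SecureBoot enabled" or "SecureBoot disabled"; the helper name `sb_state` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: classify the text printed by `mokutil --sb-state` (passed as $1).
# Assumption: the output contains "SecureBoot enabled" or "SecureBoot disabled".
sb_state() {
    case "$1" in
        *"SecureBoot enabled"*)  echo enabled ;;
        *"SecureBoot disabled"*) echo disabled ;;
        *)                       echo unknown ;;
    esac
}

# Report the Secure Boot state only if mokutil is available on this machine.
if command -v mokutil >/dev/null 2>&1; then
    echo "Secure Boot: $(sb_state "$(mokutil --sb-state 2>&1)")"
fi

# Print the signer of the nvidia module; an empty value suggests it is unsigned.
if command -v modinfo >/dev/null 2>&1; then
    echo "nvidia module signer: $(modinfo -F signer nvidia 2>/dev/null)"
fi
```

If Secure Boot is reported as enabled and the signer is empty, the kernel may be refusing to load the module, which would be consistent with the signature warning above.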
More concerning is this one:
[ 535.478017] NVRM: GPU 0000:09:00.0: RmInitAdapter failed! (0x31:0xffff:2502)
[ 535.478197] NVRM: GPU 0000:09:00.0: rm_init_adapter failed, device minor number 0
This can be a result of the above, but more often it points toward faulty or incompatible hardware. The only way to make sure is to check for the latest system BIOS and GPU VBIOS, and then to try the GPU in a different system, ideally one running Windows, to check whether it still works.
Of course, checking that the GPU is seated properly in the PCIe slot and has adequate power and cooling should be the first step.
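To see whether the RmInitAdapter failure still appears after each of these changes, you could filter the kernel log for NVRM failure lines. This is only a minimal sketch: the function name `nvrm_failures` is hypothetical, and the pattern simply matches lines shaped like the two quoted above.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and keep only the lines
# where the NVIDIA kernel driver (NVRM) reports a failure.
nvrm_failures() {
    grep -E 'NVRM:.*failed'
}

# Usage (reading the kernel log may require root):
#   dmesg | nvrm_failures
```

No output after reloading the driver would suggest that the adapter initialized; the module verification warning is not matched because that line does not carry the NVRM prefix.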
I hope this helps!