Xorg crashes at startup on Rocky 9.2 with Nvidia driver > 510.xx

I’m running Rocky 9.2 (kernel 5.14.0-284.11.1.el9_2.x86_64) with a Quadro P4000 under Xorg/lightdm (no Wayland) and no monitor connected. More than 50% of the time, Xorg crashes when the workstation is rebooted: it repeatedly crashes and core dumps in a loop. After another reboot Xorg may start up fine, but it will invariably crash on startup after a subsequent reboot.

When it gets into this state, nvidia-smi reports ‘No devices were found’ and dmesg reports (over and over again):

[  583.706631] NVRM: GPU 0000:02:00.0: RmInitAdapter failed! (0x25:0x65:1457)
[  583.706658] NVRM: GPU 0000:02:00.0: rm_init_adapter failed, device minor number 0
[  592.049415] NVRM: GPU 0000:02:00.0: RmInitAdapter failed! (0x25:0x65:1457)
[  592.049440] NVRM: GPU 0000:02:00.0: rm_init_adapter failed, device minor number 0
[  600.706487] NVRM: GPU 0000:02:00.0: RmInitAdapter failed! (0x25:0x65:1457)
[  600.706515] NVRM: GPU 0000:02:00.0: rm_init_adapter failed, device minor number 0
[  609.049496] NVRM: GPU 0000:02:00.0: RmInitAdapter failed! (0x25:0x65:1457)
[  609.049523] NVRM: GPU 0000:02:00.0: rm_init_adapter failed, device minor number 0
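
For anyone trying to confirm they are in the same state, here is a rough Python sketch that counts these failures in saved dmesg output. The function name `count_rm_init_failures` and the regular expression are illustrative assumptions based only on the log lines quoted above; this is not an NVIDIA tool, and other driver versions may format the message differently.

```python
import re
from collections import Counter

# Assumed line format, taken from the dmesg excerpt in this post:
# [  583.706631] NVRM: GPU 0000:02:00.0: RmInitAdapter failed! (0x25:0x65:1457)
RM_INIT_FAILED = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+NVRM: GPU (?P<bdf>[0-9a-fA-F:.]+): "
    r"RmInitAdapter failed! \((?P<code>[^)]*)\)"
)


def count_rm_init_failures(dmesg_text):
    """Count RmInitAdapter failures per (PCI address, error code) pair."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = RM_INIT_FAILED.search(line)
        if match:
            counts[(match.group("bdf"), match.group("code"))] += 1
    return counts
```

Feeding it the text of `dmesg` (saved to a file or captured some other way) gives one counter entry per GPU and error code, which makes it easy to see whether the same failure keeps repeating after each reboot.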

I’ve tried the latest 525, 530 and 535 series driver versions, all with the same problem. However, everything appears to be stable with the last 510 series driver.

Installing CentOS 7.9 and Rocky 8.7 on the same hardware with the latest 525 series driver appears to work OK

Any idea what may be causing this?

nvidia-bug-report.log.gz (249.9 KB)

I’ve updated the BIOS on the workstation, and that appears to have fixed my issue …

I hit the same issue on a Lenovo ThinkPad P51 with a Quadro M2200 Mobile on the latest BIOS. The only workaround I found was downgrading to 470.182.03.