I’m running Rocky 9.2 (kernel 5.14.0-284.11.1.el9_2.x86_64) with a Quadro P4000, using Xorg/lightdm (no Wayland) and no monitor connected. Xorg crashes frequently (more than 50% of the time) when the workstation is rebooted: it repeatedly crashes and core dumps in a loop. After another reboot Xorg may start up fine, but it will invariably crash on startup again after a subsequent reboot.
When it gets into this state, nvidia-smi reports ‘No devices were found’ and dmesg reports (over and over again):
[ 583.706631] NVRM: GPU 0000:02:00.0: RmInitAdapter failed! (0x25:0x65:1457)
[ 583.706658] NVRM: GPU 0000:02:00.0: rm_init_adapter failed, device minor number 0
[ 592.049415] NVRM: GPU 0000:02:00.0: RmInitAdapter failed! (0x25:0x65:1457)
[ 592.049440] NVRM: GPU 0000:02:00.0: rm_init_adapter failed, device minor number 0
[ 600.706487] NVRM: GPU 0000:02:00.0: RmInitAdapter failed! (0x25:0x65:1457)
[ 600.706515] NVRM: GPU 0000:02:00.0: rm_init_adapter failed, device minor number 0
[ 609.049496] NVRM: GPU 0000:02:00.0: RmInitAdapter failed! (0x25:0x65:1457)
[ 609.049523] NVRM: GPU 0000:02:00.0: rm_init_adapter failed, device minor number 0
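In case it helps anyone compare against their own logs, here is a rough Python sketch that pulls the RmInitAdapter failures out of dmesg text and works out how far apart the retries are. The function names (`parse_rm_init_failures`, `retry_intervals`) are made up for illustration, and the regex only assumes the exact line format pasted above, so it may need adjusting for other driver versions.

```python
import re

# Assumed line format, taken from the dmesg output above:
# [ 583.706631] NVRM: GPU 0000:02:00.0: RmInitAdapter failed! (0x25:0x65:1457)
PATTERN = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+NVRM: GPU (?P<pci>[0-9a-fA-F:.]+): "
    r"RmInitAdapter failed! \((?P<code>[^)]+)\)"
)

# The four failure lines from the post, used as sample input.
SAMPLE = """\
[ 583.706631] NVRM: GPU 0000:02:00.0: RmInitAdapter failed! (0x25:0x65:1457)
[ 583.706658] NVRM: GPU 0000:02:00.0: rm_init_adapter failed, device minor number 0
[ 592.049415] NVRM: GPU 0000:02:00.0: RmInitAdapter failed! (0x25:0x65:1457)
[ 592.049440] NVRM: GPU 0000:02:00.0: rm_init_adapter failed, device minor number 0
[ 600.706487] NVRM: GPU 0000:02:00.0: RmInitAdapter failed! (0x25:0x65:1457)
[ 600.706515] NVRM: GPU 0000:02:00.0: rm_init_adapter failed, device minor number 0
[ 609.049496] NVRM: GPU 0000:02:00.0: RmInitAdapter failed! (0x25:0x65:1457)
[ 609.049523] NVRM: GPU 0000:02:00.0: rm_init_adapter failed, device minor number 0
"""


def parse_rm_init_failures(dmesg_text):
    """Return a list of (timestamp, pci_address, code) per RmInitAdapter failure."""
    failures = []
    for line in dmesg_text.splitlines():
        match = PATTERN.search(line)
        if match:
            failures.append(
                (float(match.group("ts")), match.group("pci"), match.group("code"))
            )
    return failures


def retry_intervals(failures):
    """Seconds between consecutive failures, rounded to milliseconds."""
    return [round(b[0] - a[0], 3) for a, b in zip(failures, failures[1:])]


if __name__ == "__main__":
    found = parse_rm_init_failures(SAMPLE)
    for ts, pci, code in found:
        print(f"{ts:12.6f}  {pci}  {code}")
    print("seconds between retries:", retry_intervals(found))
```

On the lines above this finds four failures on the same device with the same code, spaced roughly 8 to 9 seconds apart, which fits Xorg being restarted in a loop.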
I’ve tried the latest 525, 530 and 535 series driver versions, all with the same problem. However, everything appears to be stable with the last 510 series driver.
CentOS 7.9 and Rocky 8.7 installed on the same hardware with the latest 525 series driver both appear to work OK.
Any idea what may be causing this?
nvidia-bug-report.log.gz (249.9 KB)