We are testing dual P600 cards in a Dell 5820 and a Dell 7810 with five displays. The card driving the fifth display will randomly fail (no pattern found as of right now), and the BaseMosaic configuration will revert to a single display setup.
See attached nvidia-bug-report.log.gz
When this happens we’ll see the following in /var/log/messages:
...
... kernel: NVRM: RmInitAdapter failed! (0x26:0xffff:1114)
... kernel: NVRM: rm_init_adapter failed for device bearing minor number 1
...
Does the RmInitAdapter failure suggest faulty hardware, or a power or motherboard/PCI related problem?
And the following in /var/log/Xorg.0.log:
...
(EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:3:0:0. Please
(EE) NVIDIA(GPU-0): check your system's kernel log for additional error
(EE) NVIDIA(GPU-0): messages and refer to Chapter 8: Common Problems in the
(EE) NVIDIA(GPU-0): README for additional information.
(EE) NVIDIA(GPU-0): Failed to initialize one NVIDIA graphics device!
(WW) NVIDIA(GPU-0): Failed to initialize Base Mosaic configuration. Reason: One
(WW) NVIDIA(GPU-0): GPU failed to initialize; Only one GPU will be used for
(WW) NVIDIA(GPU-0): this X screen.
...
Interestingly, the PCI ID listed in the Xorg.0.log file belongs to the card with the active displays, not the card with the display that fails to light up.
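To compare the device named in the kernel log with the one named in Xorg.0.log across several failed sessions, the two excerpts can be parsed with a short script. This is only a rough sketch: the function name `summarize` is made up for illustration, and the regular expressions are based solely on the message wording quoted above, so they may not match other driver versions.

```python
import re

# Patterns derived from the log excerpts in this post (assumption: the
# wording is the same on other driver versions, which may not hold).
NVRM_FAIL = re.compile(
    r"NVRM: RmInitAdapter failed! \((0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+):(\d+)\)"
)
NVRM_MINOR = re.compile(
    r"NVRM: rm_init_adapter failed for device bearing minor number (\d+)"
)
XORG_PCI = re.compile(
    r"Failed to initialize the NVIDIA GPU at PCI:(\d+):(\d+):(\d+)"
)


def summarize(kernel_lines, xorg_lines):
    """Collect error tuples, failing minor numbers and PCI IDs from log lines."""
    result = {"errors": [], "minors": [], "pci_ids": []}
    for line in kernel_lines:
        m = NVRM_FAIL.search(line)
        if m:
            result["errors"].append(m.groups())
        m = NVRM_MINOR.search(line)
        if m:
            result["minors"].append(int(m.group(1)))
    for line in xorg_lines:
        m = XORG_PCI.search(line)
        if m:
            result["pci_ids"].append("PCI:" + ":".join(m.groups()))
    return result
```

Feeding it the lines from /var/log/messages and /var/log/Xorg.0.log gives the failing minor number and the PCI ID side by side. The minor number can then be compared against the "Minor Number" field that `nvidia-smi -q` reports per GPU, assuming the driver version in use shows that field.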
Logging out (restarting X) will sometimes bring the BaseMosaic config back. However, after a couple of login/logout attempts the same issue occurs and the same messages shown above appear again.
A couple things we’ve tried so far:
- Setting nvidia-drm.modeset=1 when there are multiple GPUs causes the machine to freeze once the GUI loads
- nomodeset was required during installation (via a USB anaconda kickstart); otherwise all displays would go black and enter power saving mode at some point while the system was loading
- Not setting nomodeset appears to have the same effect as setting nomodeset
- The same two P600 cards (only ones we have) were tested in a Dell 5820 and a Dell 7810
  - The 5820 has the tendency to freeze and a hard power cycle is required
  - The 7810 is a bit more forgiving when it comes to this issue and doesn't freeze like the 5820
- So far it doesn't appear to be WM related. We've seen the same issue happen when logging out of gnome-session and fvwm.
- Using a different xorg.conf (different modeline) and two NVS 510s, we do not have any issues with the 5820 or the 7810, which might suggest this is a P600 or P-series related issue
nvidia-bug-report.log.gz (139 KB)