X window system fails to initialize in multi-GPU setup.

I have a setup with two GPUs (2x 980 Ti). When the primary monitor is attached to the secondary GPU, the X Window System fails to load. The ‘secondary’ GPU here is whichever one the Linux kernel / NVIDIA module labels ‘GPU-1’. It currently happens to be the ‘top’ card, and the numbering depends on the motherboard as well, much like the /dev/sda, /dev/sdb… naming for hard disks.

This problem was introduced with either the 4.9 or 4.10 series kernels.

OS:

> uname -a
Linux [pcname] 4.10.9-1-ARCH #1 SMP PREEMPT Sat Apr 8 12:39:59 CEST 2017 x86_64 GNU/Linux

All relevant configuration is left at its defaults. For what it’s worth, I use the X Window System as my windowing system, and kddm with a KDE Plasma desktop on top of it. Only X itself should be relevant to this problem (not the stuff on top), as the boot process stops there.

Excerpts from the Xorg log with the monitor attached to GPU-1:

[     5.518] (II) systemd-logind: logind integration requires -keeptty and -keeptty was not provided, disabling logind integration
[     5.519] (II) xfree86: Adding drm device (/dev/dri/card0)
[     5.519] (II) xfree86: Adding drm device (/dev/dri/card1)
[     5.521] (**) OutputClass "nvidia" ModulePath extended to "/usr/lib/nvidia/xorg,/usr/lib/xorg/modules"
[     5.521] (**) OutputClass "nvidia" ModulePath extended to "/usr/lib/nvidia/xorg,/usr/lib/nvidia/xorg,/usr/lib/xorg/modules"
[     5.521] (**) OutputClass "nvidia" setting /dev/dri/card0 as PrimaryGPU
[     5.523] (--) PCI:*(0:2:0:0) 10de:17c8:1043:8546 rev 161, Mem @ 0xfa000000/16777216, 0xe0000000/268435456, 0xf0000000/33554432, I/O @ 0x0000e000/128, BIOS @ 0x????????/524288
[     5.523] (--) PCI: (0:3:0:0) 10de:17c8:1043:8546 rev 161, Mem @ 0xf8000000/16777216, 0xc0000000/268435456, 0xd0000000/33554432, I/O @ 0x0000d000/128, BIOS @ 0x????????/131072
[     5.523] (WW) Open ACPI failed (/var/run/acpid.socket) (No such file or directory)
[     5.523] (II) LoadModule: "glx"
[     5.524] (II) Loading /usr/lib/nvidia/xorg/libglx.so
[     5.537] (II) Module glx: vendor="NVIDIA Corporation"
[     5.537] 	compiled for 4.0.2, module version = 1.0.0
[     5.537] 	Module class: X.Org Server Extension
[     5.537] (II) NVIDIA GLX Module  378.13  Tue Feb  7 18:25:34 PST 2017
[     5.538] (II) LoadModule: "nvidia"
[     5.539] (II) Loading /usr/lib/xorg/modules/drivers/nvidia_drv.so
[     5.541] (II) Module nvidia: vendor="NVIDIA Corporation"
[     5.541] 	compiled for 4.0.2, module version = 1.0.0
[     5.541] 	Module class: X.Org Video Driver
[     5.541] (II) NVIDIA dlloader X Driver  378.13  Tue Feb  7 18:01:51 PST 2017
[     5.541] (II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
[     5.541] (II) Loading sub module "fb"
[     5.541] (II) LoadModule: "fb"
[     5.542] (II) Loading /usr/lib/xorg/modules/libfb.so
[     5.542] (II) Module fb: vendor="X.Org Foundation"
[     5.542] 	compiled for 1.19.3, module version = 1.0.0
[     5.542] 	ABI class: X.Org ANSI C Emulation, version 0.4
[     5.542] (II) Loading sub module "wfb"
[     5.542] (II) LoadModule: "wfb"
[     5.542] (II) Loading /usr/lib/xorg/modules/libwfb.so
[     5.543] (II) Module wfb: vendor="X.Org Foundation"
[     5.543] 	compiled for 1.19.3, module version = 1.0.0
[     5.543] 	ABI class: X.Org ANSI C Emulation, version 0.4
[     5.543] (II) Loading sub module "ramdac"
[     5.543] (II) LoadModule: "ramdac"
[     5.543] (II) Module "ramdac" already built-in
[     5.544] (**) NVIDIA(0): Depth 24, (--) framebuffer bpp 32
[     5.544] (==) NVIDIA(0): RGB weight 888
[     5.544] (==) NVIDIA(0): Default visual is TrueColor
[     5.544] (==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)
[     5.545] (II) Applying OutputClass "nvidia" options to /dev/dri/card0
[     5.545] (**) NVIDIA(0): Option "SLI" "OFF"
[     5.545] (**) NVIDIA(0): Option "ConnectToAcpid" "True"
[     5.545] (**) NVIDIA(0): Option "AllowEmptyInitialConfiguration"
[     5.545] (**) NVIDIA(0): NVIDIA SLI disabled.
[     5.545] (**) NVIDIA(0): Enabling 2D acceleration
[     6.257] (--) NVIDIA(0): Valid display device(s) on GPU-0 at PCI:2:0:0
[     6.257] (--) NVIDIA(0):     CRT-0
[     6.257] (--) NVIDIA(0):     DFP-0
[     6.257] (--) NVIDIA(0):     DFP-1
[     6.257] (--) NVIDIA(0):     DFP-2
[     6.257] (--) NVIDIA(0):     DFP-3
[     6.257] (--) NVIDIA(0):     DFP-4
[     6.257] (--) NVIDIA(0):     DFP-5
[     6.257] (--) NVIDIA(0):     DFP-6
[     6.257] (--) NVIDIA(0):     DFP-7
[     6.258] (II) NVIDIA(0): NVIDIA GPU GeForce GTX 980 Ti (GM200-A) at PCI:2:0:0 (GPU-0)
[     6.258] (--) NVIDIA(0): Memory: 6291456 kBytes
[     6.258] (--) NVIDIA(0): VideoBIOS: 84.00.32.00.0a
[     6.258] (II) NVIDIA(0): Detected PCI Express Link width: 16X
[     6.260] (--) NVIDIA(GPU-0): CRT-0: disconnected
[     6.260] (--) NVIDIA(GPU-0): CRT-0: 400.0 MHz maximum pixel clock
[     6.260] (--) NVIDIA(GPU-0): 
[     6.263] (--) NVIDIA(GPU-0): DFP-0: disconnected
[     6.263] (--) NVIDIA(GPU-0): DFP-0: Internal TMDS
[     6.263] (--) NVIDIA(GPU-0): DFP-0: 330.0 MHz maximum pixel clock
[     6.263] (--) NVIDIA(GPU-0): 
[     6.263] (--) NVIDIA(GPU-0): DFP-1: disconnected
[     6.263] (--) NVIDIA(GPU-0): DFP-1: Internal TMDS
[     6.263] (--) NVIDIA(GPU-0): DFP-1: 165.0 MHz maximum pixel clock
[     6.263] (--) NVIDIA(GPU-0): 
[     6.263] (--) NVIDIA(GPU-0): DFP-2: disconnected
[     6.263] (--) NVIDIA(GPU-0): DFP-2: Internal DisplayPort
[     6.263] (--) NVIDIA(GPU-0): DFP-2: 960.0 MHz maximum pixel clock
[     6.263] (--) NVIDIA(GPU-0): 
[     6.263] (--) NVIDIA(GPU-0): DFP-3: disconnected
[     6.263] (--) NVIDIA(GPU-0): DFP-3: Internal TMDS
[     6.263] (--) NVIDIA(GPU-0): DFP-3: 165.0 MHz maximum pixel clock
[     6.263] (--) NVIDIA(GPU-0): 
[     6.263] (--) NVIDIA(GPU-0): DFP-4: disconnected
[     6.263] (--) NVIDIA(GPU-0): DFP-4: Internal DisplayPort
[     6.263] (--) NVIDIA(GPU-0): DFP-4: 960.0 MHz maximum pixel clock
[     6.263] (--) NVIDIA(GPU-0): 
[     6.263] (--) NVIDIA(GPU-0): DFP-5: disconnected
[     6.263] (--) NVIDIA(GPU-0): DFP-5: Internal TMDS
[     6.263] (--) NVIDIA(GPU-0): DFP-5: 165.0 MHz maximum pixel clock
[     6.263] (--) NVIDIA(GPU-0): 
[     6.263] (--) NVIDIA(GPU-0): DFP-6: disconnected
[     6.263] (--) NVIDIA(GPU-0): DFP-6: Internal DisplayPort
[     6.263] (--) NVIDIA(GPU-0): DFP-6: 960.0 MHz maximum pixel clock
[     6.263] (--) NVIDIA(GPU-0): 
[     6.263] (--) NVIDIA(GPU-0): DFP-7: disconnected
[     6.263] (--) NVIDIA(GPU-0): DFP-7: Internal TMDS
[     6.263] (--) NVIDIA(GPU-0): DFP-7: 165.0 MHz maximum pixel clock
[     6.263] (--) NVIDIA(GPU-0): 
[     6.263] (==) NVIDIA(0): 
[     6.263] (==) NVIDIA(0): No modes were requested; the default mode "nvidia-auto-select"
[     6.263] (==) NVIDIA(0):     will be used as the requested mode.
[     6.263] (==) NVIDIA(0): 
[     6.263] (--) NVIDIA(0): No enabled display devices found; starting anyway because
[     6.263] (--) NVIDIA(0):     AllowEmptyInitialConfiguration is enabled
[     6.264] (II) NVIDIA(0): Validated MetaModes:
[     6.264] (II) NVIDIA(0):     "NULL"
[     6.264] (II) NVIDIA(0): Virtual screen size determined to be 640 x 480
[     6.264] (WW) NVIDIA(0): Unable to get display device for DPI computation.
[     6.264] (==) NVIDIA(0): DPI set to (75, 75); computed from built-in default
[     6.264] (--) Depth 24 pixmap format is 32 bpp
[     6.978] (--) NVIDIA(0): Valid display device(s) on GPU-1 at PCI:3:0:0
[     6.978] (--) NVIDIA(0):     CRT-0
[     6.978] (--) NVIDIA(0):     DFP-0
[     6.978] (--) NVIDIA(0):     DFP-1
[     6.978] (--) NVIDIA(0):     DFP-2
[     6.978] (--) NVIDIA(0):     DFP-3
[     6.978] (--) NVIDIA(0):     DFP-4
[     6.978] (--) NVIDIA(0):     DFP-5
[     6.978] (--) NVIDIA(0):     DFP-6 (boot)
[     6.978] (--) NVIDIA(0):     DFP-7
[     6.982] (--) NVIDIA(GPU-1): CRT-0: disconnected
[     6.982] (--) NVIDIA(GPU-1): CRT-0: 400.0 MHz maximum pixel clock
[     6.982] (--) NVIDIA(GPU-1): 
[     6.984] (--) NVIDIA(GPU-1): DFP-0: disconnected
[     6.984] (--) NVIDIA(GPU-1): DFP-0: Internal TMDS
[     6.984] (--) NVIDIA(GPU-1): DFP-0: 330.0 MHz maximum pixel clock
[     6.984] (--) NVIDIA(GPU-1): 
[     6.984] (--) NVIDIA(GPU-1): DFP-1: disconnected
[     6.984] (--) NVIDIA(GPU-1): DFP-1: Internal TMDS
[     6.984] (--) NVIDIA(GPU-1): DFP-1: 165.0 MHz maximum pixel clock
[     6.984] (--) NVIDIA(GPU-1): 
[     6.984] (--) NVIDIA(GPU-1): DFP-2: disconnected
[     6.984] (--) NVIDIA(GPU-1): DFP-2: Internal DisplayPort
[     6.984] (--) NVIDIA(GPU-1): DFP-2: 960.0 MHz maximum pixel clock
[     6.984] (--) NVIDIA(GPU-1): 
[     6.984] (--) NVIDIA(GPU-1): DFP-3: disconnected
[     6.984] (--) NVIDIA(GPU-1): DFP-3: Internal TMDS
[     6.984] (--) NVIDIA(GPU-1): DFP-3: 165.0 MHz maximum pixel clock
[     6.984] (--) NVIDIA(GPU-1): 
[     6.984] (--) NVIDIA(GPU-1): DFP-4: disconnected
[     6.984] (--) NVIDIA(GPU-1): DFP-4: Internal DisplayPort
[     6.984] (--) NVIDIA(GPU-1): DFP-4: 960.0 MHz maximum pixel clock
[     6.984] (--) NVIDIA(GPU-1): 
[     6.984] (--) NVIDIA(GPU-1): DFP-5: disconnected
[     6.984] (--) NVIDIA(GPU-1): DFP-5: Internal TMDS
[     6.984] (--) NVIDIA(GPU-1): DFP-5: 165.0 MHz maximum pixel clock
[     6.984] (--) NVIDIA(GPU-1): 
[     6.987] (--) NVIDIA(GPU-1): Ancor Communications Inc ASUS PB287Q (DFP-6): connected
[     6.987] (--) NVIDIA(GPU-1): Ancor Communications Inc ASUS PB287Q (DFP-6): Internal DisplayPort
[     6.987] (--) NVIDIA(GPU-1): Ancor Communications Inc ASUS PB287Q (DFP-6): 960.0 MHz maximum pixel clock
[     6.987] (--) NVIDIA(GPU-1): 
[     6.988] (--) NVIDIA(GPU-1): DFP-7: disconnected
[     6.988] (--) NVIDIA(GPU-1): DFP-7: Internal TMDS
[     6.988] (--) NVIDIA(GPU-1): DFP-7: 165.0 MHz maximum pixel clock
[     6.988] (--) NVIDIA(GPU-1): 
[     7.061] (II) NVIDIA(GPU-1): NVIDIA GPU GeForce GTX 980 Ti (GM200-A) at PCI:3:0:0 (GPU-1)
[     7.062] (--) NVIDIA(GPU-1): Memory: 6291456 kBytes
[     7.062] (--) NVIDIA(GPU-1): VideoBIOS: 84.00.32.00.0a
[     7.062] (II) NVIDIA(GPU-1): Detected PCI Express Link width: 16X
[     7.063] (II) NVIDIA: Using 12288.00 MB of virtual memory for indirect memory
[     7.063] (II) NVIDIA:     access.
[     7.068] (II) NVIDIA(0): ACPI: failed to connect to the ACPI event daemon; the daemon
[     7.068] (II) NVIDIA(0):     may not be running or the "AcpidSocketPath" X
[     7.068] (II) NVIDIA(0):     configuration option may not be set correctly.  When the
[     7.068] (II) NVIDIA(0):     ACPI event daemon is available, the NVIDIA X driver will
[     7.068] (II) NVIDIA(0):     try to use it to receive ACPI event notifications.  For
[     7.068] (II) NVIDIA(0):     details, please see the "ConnectToAcpid" and
[     7.068] (II) NVIDIA(0):     "AcpidSocketPath" X configuration options in Appendix B: X
[     7.068] (II) NVIDIA(0):     Config Options in the README.
[     7.120] (II) NVIDIA(0): Setting mode "NULL"
[     7.128] (==) NVIDIA(0): Disabling shared memory pixmaps
[     7.128] (==) NVIDIA(0): Backing store enabled
[     7.128] (==) NVIDIA(0): Silken mouse enabled
[     7.129] (**) NVIDIA(0): DPMS enabled
[     7.129] (WW) NVIDIA(0): Option "PrimaryGPU" is not used
[     7.129] (II) Loading sub module "dri2"
[     7.129] (II) LoadModule: "dri2"
[     7.129] (II) Module "dri2" already built-in
[     7.129] (II) NVIDIA(0): [DRI2] Setup complete
[     7.129] (II) NVIDIA(0): [DRI2]   VDPAU driver: nvidia
[     7.129] (--) RandR disabled
[     7.133] (II) Initializing extension GLX
[     7.133] (II) Indirect GLX disabled.
[     7.211] (II) config/udev: Adding input device Power Button (/dev/input/event7)
[     7.211] (**) Power Button: Applying InputClass "evdev keyboard catchall"
[     7.211] (**) Power Button: Applying InputClass "libinput keyboard catchall"
[     7.211] (II) LoadModule: "libinput"
[     7.211] (II) Loading /usr/lib/xorg/modules/input/libinput_drv.so
[     7.214] (II) Module libinput: vendor="X.Org Foundation"
[     7.214] 	compiled for 1.19.2, module version = 0.25.0
[     7.214] 	Module class: X.Org XInput Driver
[     7.214] 	ABI class: X.Org XInput driver, version 24.1
[     7.214] (II) Using input driver 'libinput' for 'Power Button'
[     7.214] (**) Power Button: always reports core events
...
Lots of lines about keyboard and mouse
...

[     7.489] (--) NVIDIA(GPU-1): Ancor Communications Inc ASUS PB287Q (DFP-6): connected
[     7.489] (--) NVIDIA(GPU-1): Ancor Communications Inc ASUS PB287Q (DFP-6): Internal DisplayPort
[     7.489] (--) NVIDIA(GPU-1): Ancor Communications Inc ASUS PB287Q (DFP-6): 960.0 MHz maximum pixel clock
[     7.489] (--) NVIDIA(GPU-1):

At this point the X log simply stops, and X stops executing. Interestingly, it does not use any additional system resources while it is stuck.

On my system there are 5 or 6 Xorg log files at any one time. I’m not really sure which is which, as the naming convention is arcane (Xorg.0.log, Xorg.0.log.old, Xorg.1.log, etc.). One of the older log files contains this tidbit:

[   107.100] (II) NVIDIA(0): NVIDIA GPU GeForce GTX 980 Ti (GM200-A) at PCI:3:0:0 (GPU-0)
[   107.100] (--) NVIDIA(0): Memory: 6291456 kBytes
[   107.100] (--) NVIDIA(0): VideoBIOS: 84.00.32.00.0a
[   107.100] (II) NVIDIA(0): Detected PCI Express Link width: 16X
[   107.100] (EE) NVIDIA(GPU-0): Failed to acquire modesetting permission.
[   107.100] (EE) NVIDIA(0): Failing initialization of X screen 0
[   107.113] (II) UnloadModule: "nvidia"
[   107.113] (II) UnloadSubModule: "wfb"
[   107.113] (II) UnloadSubModule: "fb"
[   107.113] (EE) Screen(s) found, but none have a usable configuration.
[   107.113] (EE) 
Fatal server error:
[   107.113] (EE) no screens found(EE)

Apparently the NVIDIA module ‘gave up’ after 107 seconds and failed to initialize the X system. Unfortunately, it leaves that TTY session in an unusable state, not responding to any input in any noticeable way.
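
Since it is hard to tell the log files apart by name, a small script can order them by modification time and pull out just the error lines. This is a minimal sketch, not a definitive tool. It assumes the logs live in /var/log (a rootless X server typically writes to ~/.local/share/xorg instead), and the function names `extract_problems` and `summarize_logs` are made up for this example. As far as I know, the number in the file name is the X display number and `.old` is the previous run for that display.

```python
import re
from datetime import datetime
from pathlib import Path

# Matches lines such as:
# "[   107.100] (EE) NVIDIA(0): Failing initialization of X screen 0"
PROBLEM_LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+\((EE|WW)\)\s*(.*)$")


def extract_problems(text):
    """Return (seconds, level, message) for every (EE)/(WW) line in a log."""
    found = []
    for line in text.splitlines():
        m = PROBLEM_LINE.match(line)
        if m:
            found.append((float(m.group(1)), m.group(2), m.group(3)))
    return found


def summarize_logs(log_dir="/var/log"):
    """Print each Xorg.*.log* file, newest first, followed by its (EE) lines."""
    logs = sorted(Path(log_dir).glob("Xorg.*.log*"),
                  key=lambda p: p.stat().st_mtime, reverse=True)
    for path in logs:
        stamp = datetime.fromtimestamp(path.stat().st_mtime)
        print(f"{path.name}  (last written {stamp:%Y-%m-%d %H:%M:%S})")
        text = path.read_text(errors="replace")
        for secs, level, msg in extract_problems(text):
            if level == "EE":  # warnings are collected but not printed here
                print(f"    {secs:9.3f}s  {msg}")
    return logs
```

Calling `summarize_logs()` with no arguments scans /var/log; pass another directory if your logs live elsewhere. The newest file listed first should be the log of the most recent X start.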

Here’s what happens when the monitor is attached to the ‘primary’ GPU:

[     5.524] (II) systemd-logind: logind integration requires -keeptty and -keeptty was not provided, disabling logind integration
[     5.524] (II) xfree86: Adding drm device (/dev/dri/card0)
[     5.524] (II) xfree86: Adding drm device (/dev/dri/card1)
[     5.529] (**) OutputClass "nvidia" ModulePath extended to "/usr/lib/nvidia/xorg,/usr/lib/xorg/modules"
[     5.529] (**) OutputClass "nvidia" ModulePath extended to "/usr/lib/nvidia/xorg,/usr/lib/nvidia/xorg,/usr/lib/xorg/modules"
[     5.529] (**) OutputClass "nvidia" setting /dev/dri/card0 as PrimaryGPU
[     5.533] (--) PCI:*(0:2:0:0) 10de:17c8:1043:8546 rev 161, Mem @ 0xfa000000/16777216, 0xe0000000/268435456, 0xf0000000/33554432, I/O @ 0x0000e000/128, BIOS @ 0x????????/524288
[     5.533] (--) PCI: (0:3:0:0) 10de:17c8:1043:8546 rev 161, Mem @ 0xf8000000/16777216, 0xc0000000/268435456, 0xd0000000/33554432, I/O @ 0x0000d000/128, BIOS @ 0x????????/131072
[     5.533] (WW) Open ACPI failed (/var/run/acpid.socket) (No such file or directory)
[     5.533] (II) LoadModule: "glx"
[     5.534] (II) Loading /usr/lib/nvidia/xorg/libglx.so
[     5.549] (II) Module glx: vendor="NVIDIA Corporation"
[     5.549] 	compiled for 4.0.2, module version = 1.0.0
[     5.549] 	Module class: X.Org Server Extension
[     5.549] (II) NVIDIA GLX Module  378.13  Tue Feb  7 18:25:34 PST 2017
[     5.549] (II) LoadModule: "nvidia"
[     5.550] (II) Loading /usr/lib/xorg/modules/drivers/nvidia_drv.so
[     5.553] (II) Module nvidia: vendor="NVIDIA Corporation"
[     5.553] 	compiled for 4.0.2, module version = 1.0.0
[     5.553] 	Module class: X.Org Video Driver
[     5.553] (II) NVIDIA dlloader X Driver  378.13  Tue Feb  7 18:01:51 PST 2017
[     5.553] (II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
[     5.554] (II) Loading sub module "fb"
[     5.554] (II) LoadModule: "fb"
[     5.554] (II) Loading /usr/lib/xorg/modules/libfb.so
[     5.555] (II) Module fb: vendor="X.Org Foundation"
[     5.555] 	compiled for 1.19.3, module version = 1.0.0
[     5.555] 	ABI class: X.Org ANSI C Emulation, version 0.4
[     5.555] (II) Loading sub module "wfb"
[     5.555] (II) LoadModule: "wfb"
[     5.555] (II) Loading /usr/lib/xorg/modules/libwfb.so
[     5.555] (II) Module wfb: vendor="X.Org Foundation"
[     5.555] 	compiled for 1.19.3, module version = 1.0.0
[     5.555] 	ABI class: X.Org ANSI C Emulation, version 0.4
[     5.556] (II) Loading sub module "ramdac"
[     5.556] (II) LoadModule: "ramdac"
[     5.556] (II) Module "ramdac" already built-in
[     5.557] (**) NVIDIA(0): Depth 24, (--) framebuffer bpp 32
[     5.557] (==) NVIDIA(0): RGB weight 888
[     5.557] (==) NVIDIA(0): Default visual is TrueColor
[     5.557] (==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)
[     5.557] (II) Applying OutputClass "nvidia" options to /dev/dri/card0
[     5.557] (**) NVIDIA(0): Option "SLI" "OFF"
[     5.557] (**) NVIDIA(0): Option "ConnectToAcpid" "True"
[     5.557] (**) NVIDIA(0): Option "AllowEmptyInitialConfiguration"
[     5.557] (**) NVIDIA(0): NVIDIA SLI disabled.
[     5.557] (**) NVIDIA(0): Enabling 2D acceleration
[     6.286] (--) NVIDIA(0): Valid display device(s) on GPU-0 at PCI:2:0:0
[     6.286] (--) NVIDIA(0):     CRT-0
[     6.286] (--) NVIDIA(0):     DFP-0
[     6.286] (--) NVIDIA(0):     DFP-1
[     6.286] (--) NVIDIA(0):     DFP-2
[     6.286] (--) NVIDIA(0):     DFP-3
[     6.286] (--) NVIDIA(0):     DFP-4
[     6.286] (--) NVIDIA(0):     DFP-5
[     6.286] (--) NVIDIA(0):     DFP-6 (boot)
[     6.286] (--) NVIDIA(0):     DFP-7
[     6.287] (II) NVIDIA(0): NVIDIA GPU GeForce GTX 980 Ti (GM200-A) at PCI:2:0:0 (GPU-0)
[     6.287] (--) NVIDIA(0): Memory: 6291456 kBytes
[     6.287] (--) NVIDIA(0): VideoBIOS: 84.00.32.00.0a
[     6.287] (II) NVIDIA(0): Detected PCI Express Link width: 16X
[     6.290] (--) NVIDIA(GPU-0): CRT-0: disconnected
[     6.290] (--) NVIDIA(GPU-0): CRT-0: 400.0 MHz maximum pixel clock
[     6.290] (--) NVIDIA(GPU-0): 
[     6.292] (--) NVIDIA(GPU-0): DFP-0: disconnected
[     6.292] (--) NVIDIA(GPU-0): DFP-0: Internal TMDS
[     6.292] (--) NVIDIA(GPU-0): DFP-0: 330.0 MHz maximum pixel clock
[     6.292] (--) NVIDIA(GPU-0): 
[     6.292] (--) NVIDIA(GPU-0): DFP-1: disconnected
[     6.292] (--) NVIDIA(GPU-0): DFP-1: Internal TMDS
[     6.292] (--) NVIDIA(GPU-0): DFP-1: 165.0 MHz maximum pixel clock
[     6.292] (--) NVIDIA(GPU-0): 
[     6.292] (--) NVIDIA(GPU-0): DFP-2: disconnected
[     6.292] (--) NVIDIA(GPU-0): DFP-2: Internal DisplayPort
[     6.292] (--) NVIDIA(GPU-0): DFP-2: 960.0 MHz maximum pixel clock
[     6.292] (--) NVIDIA(GPU-0): 
[     6.293] (--) NVIDIA(GPU-0): DFP-3: disconnected
[     6.293] (--) NVIDIA(GPU-0): DFP-3: Internal TMDS
[     6.293] (--) NVIDIA(GPU-0): DFP-3: 165.0 MHz maximum pixel clock
[     6.293] (--) NVIDIA(GPU-0): 
[     6.293] (--) NVIDIA(GPU-0): DFP-4: disconnected
[     6.293] (--) NVIDIA(GPU-0): DFP-4: Internal DisplayPort
[     6.293] (--) NVIDIA(GPU-0): DFP-4: 960.0 MHz maximum pixel clock
[     6.293] (--) NVIDIA(GPU-0): 
[     6.293] (--) NVIDIA(GPU-0): DFP-5: disconnected
[     6.293] (--) NVIDIA(GPU-0): DFP-5: Internal TMDS
[     6.293] (--) NVIDIA(GPU-0): DFP-5: 165.0 MHz maximum pixel clock
[     6.293] (--) NVIDIA(GPU-0): 
[     6.293] (--) NVIDIA(GPU-0): Ancor Communications Inc ASUS PB287Q (DFP-6): connected
[     6.293] (--) NVIDIA(GPU-0): Ancor Communications Inc ASUS PB287Q (DFP-6): Internal DisplayPort
[     6.293] (--) NVIDIA(GPU-0): Ancor Communications Inc ASUS PB287Q (DFP-6): 960.0 MHz maximum pixel clock
[     6.293] (--) NVIDIA(GPU-0): 
[     6.294] (--) NVIDIA(GPU-0): DFP-7: disconnected
[     6.294] (--) NVIDIA(GPU-0): DFP-7: Internal TMDS
[     6.294] (--) NVIDIA(GPU-0): DFP-7: 165.0 MHz maximum pixel clock
[     6.294] (--) NVIDIA(GPU-0): 
[     6.296] (==) NVIDIA(0): 
[     6.296] (==) NVIDIA(0): No modes were requested; the default mode "nvidia-auto-select"
[     6.296] (==) NVIDIA(0):     will be used as the requested mode.
[     6.296] (==) NVIDIA(0): 
[     6.297] (II) NVIDIA(0): Validated MetaModes:
[     6.297] (II) NVIDIA(0):     "DFP-6:nvidia-auto-select"
[     6.297] (II) NVIDIA(0): Virtual screen size determined to be 3840 x 2160
[     6.312] (--) NVIDIA(0): DPI set to (157, 161); computed from "UseEdidDpi" X config
[     6.312] (--) NVIDIA(0):     option
[     6.312] (--) Depth 24 pixmap format is 32 bpp
[     7.031] (--) NVIDIA(0): Valid display device(s) on GPU-1 at PCI:3:0:0

It appears there are a number of bugs/regressions:

  1. Why does the NVIDIA display driver not revert to ‘console’ mode (which can still be accessed by pressing CTRL-ALT-Fn) in this failure state? (This may also be a Linux kernel or X issue, in which case the question is why they don’t do this.)
  2. NVIDIA does not try all display devices before deciding that there is no monitor to display to when initializing an X session; it only tries the primary graphics adapter (whichever one your motherboard decides that is). For me, resolving the problem simply means connecting the monitor to the other GPU. However, this can be a pain for users with multiple cards and multiple monitors: I imagine it could involve having to disconnect monitors in order to boot. I don’t currently have a second DisplayPort monitor to test this theory, but maybe other users can.
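
The second point can be checked mechanically from a log: find which GPU backs X screen 0 and which GPU reports a connected display, then flag any mismatch. The sketch below assumes the log line formats seen in the excerpts above (other driver versions may print them differently), and `displays_by_gpu` and `stranded_displays` are hypothetical names chosen for this example.

```python
import re

# "(II) NVIDIA(0): NVIDIA GPU GeForce GTX 980 Ti (GM200-A) at PCI:2:0:0 (GPU-0)"
SCREEN_GPU = re.compile(r"NVIDIA\((\d+)\): NVIDIA GPU .* \(GPU-(\d+)\)\s*$")
# "(--) NVIDIA(GPU-1): Ancor Communications Inc ASUS PB287Q (DFP-6): connected"
CONNECTED = re.compile(r"NVIDIA\(GPU-(\d+)\):\s+(.+?):\s+connected\s*$")


def displays_by_gpu(log_text):
    """Return ({x_screen: gpu_index}, {gpu_index: set of connected displays})."""
    screens = {}
    connected = {}
    for line in log_text.splitlines():
        m = SCREEN_GPU.search(line)
        if m:
            screens[int(m.group(1))] = int(m.group(2))
            continue
        m = CONNECTED.search(line)
        if m:
            connected.setdefault(int(m.group(1)), set()).add(m.group(2))
    return screens, connected


def stranded_displays(log_text):
    """GPUs that report a connected display but do not back any X screen."""
    screens, connected = displays_by_gpu(log_text)
    used = set(screens.values())
    return {gpu: sorted(names)
            for gpu, names in connected.items() if gpu not in used}
```

Run on the first excerpt, this should report the PB287Q on GPU-1 as stranded, because screen 0 is bound to GPU-0. On the second excerpt it should report nothing.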

Things have recently gotten worse: now the NVIDIA driver, the Linux kernel, and the BIOS each have their own ideas about what the primary GPU actually is.

NVIDIA thinks it’s GPU-1. The BIOS and kernel think it’s GPU-0. The kernel and BIOS won’t display anything with the monitor connected to GPU-1, while the NVIDIA driver won’t display anything with the monitor on GPU-0. Suffice it to say that this can result in some real hassles.
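
To see which card the kernel and firmware treat as the boot display, independently of what the NVIDIA driver reports, the `boot_vga` attribute in sysfs can be read for each DRM card. This is a sketch under a few assumptions: the DRM cards appear under /sys/class/drm with a `device` link to their PCI device, and `boot_vga_by_card` is a made-up helper name. The `sysfs_root` parameter exists only so the function can be pointed at a different tree.

```python
from pathlib import Path


def boot_vga_by_card(sysfs_root="/sys"):
    """Map each DRM card name to (PCI address, is_boot_vga) using sysfs."""
    result = {}
    drm_dir = Path(sysfs_root) / "class" / "drm"
    for card in sorted(drm_dir.glob("card[0-9]*")):
        if "-" in card.name:
            # Skip connector entries such as card0-DP-1.
            continue
        # The "device" link points at the PCI device directory.
        device = (card / "device").resolve()
        flag_file = device / "boot_vga"
        if not flag_file.is_file():
            continue
        is_boot = flag_file.read_text().strip() == "1"
        result[card.name] = (device.name, is_boot)
    return result
```

On a real system the result would look something like `{"card0": ("0000:02:00.0", True), "card1": ("0000:03:00.0", False)}`, which can then be compared with the PCI addresses in the Xorg log. Note that sysfs addresses are written as domain:bus:device.function, so the format differs from the `PCI:2:0:0` notation in the log.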

It means that if my system can’t boot with the NVIDIA driver’s ‘primary’ for whatever other reason, I have a black screen and can’t easily diagnose it: I have to re-plug my monitor into the other GPU, fix the problem, turn the machine off again, switch the cables back, and then boot properly.

If for any reason a problem occurs between the initialization of the GPU X session and the actual loading of the display manager, I would have no way to see what happens.

Any ideas on how to resolve this issue?