Unable to disable the watchdog on the second, unused GPU under Xorg

I have an Ubuntu 20.04 system with two monitors and two NVIDIA GTX GPUs.
Driver version: 470.82.00, CUDA version: 11.4

The two monitors are connected to the first GPU at PCI:1:0:0. The second GPU at PCI:2:0:0 has no monitors connected and is meant to run CUDA-intensive simulations.

Despite following NVIDIA's "Using CUDA and X" article and trying every configuration of /etc/X11/xorg.conf I could think of, I am not able to disable the watchdog on the second GPU.
In effect, the first point in the article does not work as described.

The ironic thing is that I can disable the watchdog only on the first GPU, using the option "Interactive" "0". But this is useless, as the first GPU has to render the X screens and cannot be used for long CUDA computations.

After trying everything, I have no idea which option or configuration to try next in order to solve this and properly use my second GPU.

This is my current Xorg configuration:

# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 470.82.00

Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0"
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
EndSection

Section "Files"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/psaux"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    Option         "Interactive" "0"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

This is the output I obtain from numba.cuda.detect():

Found 2 CUDA devices
id 0    b'NVIDIA GeForce GTX 970'                              [SUPPORTED]
                      Compute Capability: 5.2
                           PCI Device ID: 0
                              PCI Bus ID: 1
                                Watchdog: Disabled
             FP32/FP64 Performance Ratio: 32
id 1    b'NVIDIA GeForce GTX 970'                              [SUPPORTED]
                      Compute Capability: 5.2
                           PCI Device ID: 0
                              PCI Bus ID: 2
                                Watchdog: Enabled
             FP32/FP64 Performance Ratio: 32
Summary:
	2/2 devices are supported
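
For re-checking after each xorg.conf change, a shorter report than the full detect() output can help. The following is a minimal sketch, not part of the original setup: the helper names watchdog_report and query_devices are made up for illustration, and it assumes numba exposes the same device attributes (id, PCI_BUS_ID, KERNEL_EXEC_TIMEOUT) that detect() itself prints.

```python
def watchdog_report(devices):
    """Format one line per device from (id, pci_bus_id, kernel_exec_timeout) tuples."""
    lines = []
    for dev_id, bus_id, timeout in devices:
        # A non-zero KERNEL_EXEC_TIMEOUT means the run-time limit (watchdog) is active.
        state = "Enabled" if timeout else "Disabled"
        lines.append(f"id {dev_id}  PCI bus {bus_id}  watchdog {state}")
    return lines


def query_devices():
    """Read the attributes from the driver; needs numba and a working CUDA driver."""
    from numba import cuda  # imported lazily so the formatter works without a GPU
    return [(d.id, d.PCI_BUS_ID, d.KERNEL_EXEC_TIMEOUT) for d in cuda.list_devices()]


# Usage on the machine with the GPUs:
#     print("\n".join(watchdog_report(query_devices())))
```

Run after every restart of the X server; the goal is for the device on PCI bus 2 to report the watchdog as disabled.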

And this is the log from Xorg where the GPUs are detected:

[    28.775] (--) NVIDIA(0): Valid display device(s) on GPU-0 at PCI:1:0:0
[    28.775] (--) NVIDIA(0):     CRT-0
[    28.775] (--) NVIDIA(0):     DFP-0
[    28.775] (--) NVIDIA(0):     DFP-1 (boot)
[    28.775] (--) NVIDIA(0):     DFP-2
[    28.775] (--) NVIDIA(0):     DFP-3
[    28.775] (--) NVIDIA(0):     DFP-4
[    28.775] (II) NVIDIA(0): NVIDIA GPU NVIDIA GeForce GTX 970 (GM204-A) at PCI:1:0:0
[    28.776] (II) NVIDIA(0):     (GPU-0)
[    28.776] (--) NVIDIA(0): Memory: 4194304 kBytes
[    28.776] (--) NVIDIA(0): VideoBIOS: 84.04.36.00.5e
[    28.776] (II) NVIDIA(0): Detected PCI Express Link width: 16X
[    28.780] (--) NVIDIA(GPU-0): CRT-0: disconnected
[    28.780] (--) NVIDIA(GPU-0): CRT-0: 400.0 MHz maximum pixel clock
[    28.780] (--) NVIDIA(GPU-0): 
[    28.785] (--) NVIDIA(GPU-0): DFP-0: disconnected
[    28.785] (--) NVIDIA(GPU-0): DFP-0: Internal TMDS
[    28.785] (--) NVIDIA(GPU-0): DFP-0: 330.0 MHz maximum pixel clock
[    28.785] (--) NVIDIA(GPU-0): 
[    28.832] (--) NVIDIA(GPU-0): HAT GS1901 (DFP-1): connected
[    28.832] (--) NVIDIA(GPU-0): HAT GS1901 (DFP-1): Internal TMDS
[    28.833] (--) NVIDIA(GPU-0): HAT GS1901 (DFP-1): 165.0 MHz maximum pixel clock
[    28.833] (--) NVIDIA(GPU-0): 
[    28.833] (--) NVIDIA(GPU-0): AOC Q2770 (DFP-2): connected
[    28.833] (--) NVIDIA(GPU-0): AOC Q2770 (DFP-2): Internal DisplayPort
[    28.833] (--) NVIDIA(GPU-0): AOC Q2770 (DFP-2): 960.0 MHz maximum pixel clock
[    28.833] (--) NVIDIA(GPU-0): 
[    28.835] (--) NVIDIA(GPU-0): DFP-3: disconnected
[    28.835] (--) NVIDIA(GPU-0): DFP-3: Internal TMDS
[    28.835] (--) NVIDIA(GPU-0): DFP-3: 165.0 MHz maximum pixel clock
[    28.835] (--) NVIDIA(GPU-0): 
[    28.836] (--) NVIDIA(GPU-0): DFP-4: disconnected
[    28.836] (--) NVIDIA(GPU-0): DFP-4: Internal TMDS
[    28.836] (--) NVIDIA(GPU-0): DFP-4: 330.0 MHz maximum pixel clock
[    28.836] (--) NVIDIA(GPU-0): 
[    28.847] (==) NVIDIA(0): 
[    28.847] (==) NVIDIA(0): No modes were requested; the default mode "nvidia-auto-select"
[    28.847] (==) NVIDIA(0):     will be used as the requested mode.
[    28.847] (==) NVIDIA(0): 
[    28.848] (II) NVIDIA(0): Validated MetaModes:
[    28.848] (II) NVIDIA(0):     "DFP-1:nvidia-auto-select,DFP-2:nvidia-auto-select"
[    28.848] (II) NVIDIA(0): Virtual screen size determined to be 4480 x 1440
[    28.853] (--) NVIDIA(0): DPI set to (113, 114); computed from "UseEdidDpi" X config
[    28.853] (--) NVIDIA(0):     option
[    28.853] (**) NVIDIA(G0): Depth 24, (--) framebuffer bpp 32
[    28.853] (==) NVIDIA(G0): RGB weight 888
[    28.853] (==) NVIDIA(G0): Default visual is TrueColor
[    28.853] (==) NVIDIA(G0): Using gamma correction (1.0, 1.0, 1.0)
[    28.853] (**) NVIDIA(G0): Enabling 2D acceleration
[    28.853] (II) NVIDIA: The X server supports PRIME Render Offload.
[    28.854] (--) NVIDIA(0): Valid display device(s) on GPU-1 at PCI:2:0:0
[    28.854] (--) NVIDIA(0):     CRT-0
[    28.854] (--) NVIDIA(0):     DFP-0
[    28.854] (--) NVIDIA(0):     DFP-1
[    28.854] (--) NVIDIA(0):     DFP-2
[    28.855] (--) NVIDIA(0):     DFP-3
[    28.855] (--) NVIDIA(0):     DFP-4
[    28.855] (II) NVIDIA(G0): NVIDIA GPU NVIDIA GeForce GTX 970 (GM204-A) at PCI:2:0:0
[    28.855] (II) NVIDIA(G0):     (GPU-1)
[    28.855] (--) NVIDIA(G0): Memory: 4194304 kBytes
[    28.855] (--) NVIDIA(G0): VideoBIOS: 84.04.1f.00.2b
[    28.855] (II) NVIDIA(G0): Detected PCI Express Link width: 16X
[    28.859] (--) NVIDIA(GPU-1): CRT-0: disconnected
[    28.859] (--) NVIDIA(GPU-1): CRT-0: 400.0 MHz maximum pixel clock
[    28.859] (--) NVIDIA(GPU-1): 
[    28.863] (--) NVIDIA(GPU-1): DFP-0: disconnected
[    28.863] (--) NVIDIA(GPU-1): DFP-0: Internal TMDS
[    28.863] (--) NVIDIA(GPU-1): DFP-0: 330.0 MHz maximum pixel clock
[    28.863] (--) NVIDIA(GPU-1): 
[    28.863] (--) NVIDIA(GPU-1): DFP-1: disconnected
[    28.863] (--) NVIDIA(GPU-1): DFP-1: Internal TMDS
[    28.863] (--) NVIDIA(GPU-1): DFP-1: 165.0 MHz maximum pixel clock
[    28.863] (--) NVIDIA(GPU-1): 
[    28.864] (--) NVIDIA(GPU-1): DFP-2: disconnected
[    28.864] (--) NVIDIA(GPU-1): DFP-2: Internal DisplayPort
[    28.864] (--) NVIDIA(GPU-1): DFP-2: 960.0 MHz maximum pixel clock
[    28.864] (--) NVIDIA(GPU-1): 
[    28.864] (--) NVIDIA(GPU-1): DFP-3: disconnected
[    28.864] (--) NVIDIA(GPU-1): DFP-3: Internal TMDS
[    28.864] (--) NVIDIA(GPU-1): DFP-3: 165.0 MHz maximum pixel clock
[    28.864] (--) NVIDIA(GPU-1): 
[    28.864] (--) NVIDIA(GPU-1): DFP-4: disconnected
[    28.864] (--) NVIDIA(GPU-1): DFP-4: Internal TMDS
[    28.864] (--) NVIDIA(GPU-1): DFP-4: 330.0 MHz maximum pixel clock
[    28.864] (--) NVIDIA(GPU-1): 
[    28.864] (II) NVIDIA(G0): Validated MetaModes:
[    28.864] (II) NVIDIA(G0):     "NULL"
[    28.864] (II) NVIDIA(G0): Virtual screen size determined to be 640 x 480
[    28.864] (WW) NVIDIA(G0): Unable to get display device for DPI computation.
[    28.864] (==) NVIDIA(G0): DPI set to (75, 75); computed from built-in default
[    28.865] (II) NVIDIA: Reserving 6144.00 MB of virtual memory for indirect memory
[    28.865] (II) NVIDIA:     access.
[    28.886] (II) NVIDIA(0): Setting mode "DFP-1:nvidia-auto-select,DFP-2:nvidia-auto-select"
[    28.992] (==) NVIDIA(0): Disabling shared memory pixmaps
[    28.992] (==) NVIDIA(0): Backing store enabled
[    28.992] (==) NVIDIA(0): Silken mouse enabled
[    28.992] (**) NVIDIA(0): DPMS enabled
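
The log shows the NVIDIA X driver initializing both cards: PCI:1:0:0 under NVIDIA(0) and PCI:2:0:0 under NVIDIA(G0), even though xorg.conf only declares Device0. A small script can pull out just those lines so the effect of a config change is easy to compare. This is a rough sketch: the function name gpus_claimed_by_x is made up, the regular expression is fitted to the log lines above only, and reading G0 as a GPU screen that the server added on its own is an assumption.

```python
import re

# Matches lines such as:
# (II) NVIDIA(G0): NVIDIA GPU NVIDIA GeForce GTX 970 (GM204-A) at PCI:2:0:0
GPU_LINE = re.compile(
    r"\(II\) NVIDIA\((?P<screen>[^)]+)\): NVIDIA GPU (?P<name>.+?) "
    r"\((?P<chip>[^)]+)\) at (?P<bus>PCI:\d+:\d+:\d+)"
)


def gpus_claimed_by_x(log_text):
    """Return (screen, bus) pairs for every GPU the NVIDIA X driver initialized."""
    return [(m.group("screen"), m.group("bus")) for m in GPU_LINE.finditer(log_text)]


# Usage (the log path is an assumption; it varies with the display manager):
#     with open("/var/log/Xorg.0.log") as f:
#         print(gpus_claimed_by_x(f.read()))
```

For the log above this yields one entry for each card. If X stops touching the second GPU, the PCI:2:0:0 entry should disappear from the list.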

Please help me sort this out. Without it, my current setup is almost unusable.

Hi,

Try adding this to xorg.conf:

Section "ServerFlags"
    Option         "AutoAddGPU" "off"
EndSection
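
Since a snippet copied from a forum can arrive with typographic quotes, which are not the plain double quotes that xorg.conf syntax uses, it may be worth confirming that the option really landed in the file. Below is a minimal sketch of such a check. The function name auto_add_gpu_disabled is made up, and the parser handles only the simple one-option-per-line layout shown in this thread.

```python
import re

# Values xorg.conf treats as boolean false.
FALSE_VALUES = {"0", "off", "false", "no"}


def auto_add_gpu_disabled(conf_text):
    """Return True if a ServerFlags section sets AutoAddGPU to a false value."""
    in_server_flags = False
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and indentation
        if re.match(r'(?i)^Section\s+"ServerFlags"', line):
            in_server_flags = True
        elif re.match(r"(?i)^EndSection\b", line):
            in_server_flags = False
        elif in_server_flags:
            m = re.match(r'(?i)^Option\s+"AutoAddGPU"\s+"([^"]*)"', line)
            if m and m.group(1).lower() in FALSE_VALUES:
                return True
    return False


# Usage:
#     with open("/etc/X11/xorg.conf") as f:
#         print(auto_add_gpu_disabled(f.read()))
```

The check returns False for a section pasted with typographic quotes, which is a quick way to catch that mistake before restarting X.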