cuda-gdb forces execution on device 0, the GPU running X windows

Hi All,

I’m running two GeForce 9600s on Fedora Core 9. The display is in TwinView, with the monitors hooked up to the GeForce (device 0) in bus slot 6; see xorg.conf below.

Problem: if I call cudaSetDevice(1), the program runs normally on device 1. But if I do cuda-gdb ./program, set breakpoints, and then run the program, I get:

(cuda-gdb) run
Starting program: /home/rdemb/cudaprograms/device1_gdb
[Thread debugging using libthread_db enabled]
[New process 3466]
[New Thread 3585360 (LWP 3466)]
Warning: 1 GPUs were made unavailable to the application because they are used by X. This may change the application behaviour!
we are on device 0
CUDA-GDB: Cannot debug on this GPU as it is running a window system.

As you can see, cuda-gdb seems to force execution onto device 0, even though I’m setting the device to 1.

When I step past the line containing cudaSetDevice, I get the same warning:

(cuda-gdb) step
23 cudaSetDevice(set_dev);
(cuda-gdb) step
Warning: 1 GPUs were made unavailable to the application because they are used by X. This may change the application behaviour!
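
For reference, the device-selection part of the program boils down to the following (a trimmed sketch; only the cudaSetDevice(set_dev) line and the printed message appear in the listing above, and the error check is added here for illustration):

#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int set_dev = 1;    /* the GPU that is not driving the display */
    int dev = -1;

    /* If cuda-gdb hides the GPU used by X, the remaining GPU is
       renumbered as device 0, so this call can fail and the context
       silently falls back to device 0. The return value shows that. */
    cudaError_t err = cudaSetDevice(set_dev);
    if (err != cudaSuccess)
        printf("cudaSetDevice(%d) failed: %s\n", set_dev, cudaGetErrorString(err));

    cudaGetDevice(&dev);    /* ask the runtime which device we actually got */
    printf("we are on device %d\n", dev);

    /* ... kernel launches on the selected device ... */
    return 0;
}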

I’ve re-installed the drivers twice with the X server off (at runlevel 3) and run nvidia-xconfig --twinview, but I always get forced onto device 0 when I try to use cuda-gdb.

If I take out the BusID lines in xorg.conf, I get an xorg.conf error at boot time and the X server does not come up.

If I run Fedora Core 9 at runlevel 3 and execute the programs from the command line, no CUDA-enabled cards are recognized; i.e., if I run deviceQuery, I get a message that no CUDA cards were detected, and it runs in emulation mode.
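
A side note on that runlevel 3 symptom: as I understand it, the /dev/nvidia* device nodes are normally created by X, so without X they may simply not exist. The CUDA release notes for Linux include a boot-time script along these lines for running headless (run as root; double-check against your own copy of the release notes):

#!/bin/bash
# Load the kernel module and create the NVIDIA device nodes by hand,
# since no X server is around to do it.
/sbin/modprobe nvidia
if [ "$?" -eq 0 ]; then
    # One /dev/nvidiaN node per NVIDIA controller found on the PCI bus.
    N3D=`/sbin/lspci | grep -i NVIDIA | grep "3D controller" | wc -l`
    NVGA=`/sbin/lspci | grep -i NVIDIA | grep "VGA compatible controller" | wc -l`
    N=`expr $N3D + $NVGA - 1`
    for i in `seq 0 $N`; do
        mknod -m 666 /dev/nvidia$i c 195 $i
    done
    # The control node is always minor 255.
    mknod -m 666 /dev/nvidiactl c 195 255
else
    exit 1
fi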

I’m really wrapped around the axle on this one; any help is much appreciated. Thanks,

robullelk

xorg.conf below:

# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig: version 1.0 (buildmeister@builder62)  Thu Apr 30 16:21:56 PDT 2009

# Xorg configuration created by pyxf86config

Section "ServerLayout"
    Identifier     "Default Layout"
    Screen      0  "Screen0" 0 0
    Screen      1  "Screen1" RightOf "Screen0"
    InputDevice    "Mouse0" "CorePointer"
    InputDevice    "Keyboard0" "CoreKeyboard"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/input/mice"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
    # keyboard added by rhpxl
    Identifier     "Keyboard0"
    Driver         "kbd"
    Option         "XkbModel" "pc105"
    Option         "XkbLayout" "us"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Monitor"
    Identifier     "Monitor1"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Videocard0"
    Driver         "nvidia"
    BusID          "PCI:6:0:0"
    Screen          0
EndSection

Section "Device"
    Identifier     "Videocard1"
    Driver         "nvidia"
    BusID          "PCI:6:0:0"
    Screen          1
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Videocard0"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "TwinView" "True"
    Option         "MetaModes" "nvidia-auto-select, nvidia-auto-select"
    SubSection     "Display"
        Viewport    0 0
        Depth       24
    EndSubSection
EndSection

Section "Screen"
    Identifier     "Screen1"
    Device         "Videocard1"
    Monitor        "Monitor1"
    DefaultDepth    24
    Option         "TwinView" "True"
    Option         "MetaModes" "nvidia-auto-select, nvidia-auto-select"
    SubSection     "Display"
        Viewport    0 0
        Depth       24
    EndSubSection
EndSection

I don’t think your X configuration is doing what you think it’s doing. For TwinView (on a single GPU) you should have only one Screen defined, yet you have two. Have you looked at your X log to confirm that TwinView is working and that both GPUs aren’t being used by X?
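
Roughly, a single-GPU TwinView setup needs just one Screen in the ServerLayout and one Device section, along these lines (a sketch only, reusing the BusID from your config; the Monitor and InputDevice sections stay as they are):

Section "ServerLayout"
    Identifier     "Default Layout"
    Screen      0  "Screen0" 0 0
    InputDevice    "Mouse0" "CorePointer"
    InputDevice    "Keyboard0" "CoreKeyboard"
EndSection

Section "Device"
    Identifier     "Videocard0"
    Driver         "nvidia"
    BusID          "PCI:6:0:0"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Videocard0"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "TwinView" "True"
    Option         "MetaModes" "nvidia-auto-select, nvidia-auto-select"
EndSection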

Hi,

I deleted the second screen and all references to it in xorg.conf (attached at bottom).

I get the same behaviour. TwinView is working, but for some reason X is still running on that second GeForce 9600.

I’ve excerpted some of the relevant parts of the Xorg.0.log file below. I’m particularly troubled by these lines for GPU-1:

(--) NVIDIA(GPU-1): Connected display device(s) on GeForce 9600 GT at
(--) NVIDIA(GPU-1): PCI:131:0:0:

I’m not sure how to configure xorg.conf to prevent X from using the second card.

Thank you for your help,

robullelk

(!!) More than one possible primary device found
(--) PCI: (0@6:0:0) nVidia Corporation Geforce 9600 GT 512mb rev 161, Mem @ 0x91000000/16777216, 0xa0000000/268435456, 0x92000000/33554432, I/O @ 0x00003000/128
(--) PCI: (0@131:0:0) nVidia Corporation Geforce 9600 GT 512mb rev 161, Mem @ 0xb1000000/16777216, 0xc0000000/268435456, 0xb2000000/33554432, I/O @ 0x00005000/128

(**) NVIDIA(0): Depth 24, (--) framebuffer bpp 32
(==) NVIDIA(0): RGB weight 888
(==) NVIDIA(0): Default visual is TrueColor
(==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)
(**) NVIDIA(0): Option "TwinView" "True"
(**) NVIDIA(0): Option "MetaModes" "nvidia-auto-select, nvidia-auto-select"
(**) NVIDIA(0): Enabling RENDER acceleration
(II) NVIDIA(0): Support for GLX with the Damage and Composite X extensions is
(II) NVIDIA(0): enabled.
(II) NVIDIA(0): NVIDIA GPU GeForce 9600 GT (G94) at PCI:6:0:0 (GPU-0)
(--) NVIDIA(0): Memory: 1048576 kBytes
(--) NVIDIA(0): VideoBIOS: 62.94.62.00.51
(II) NVIDIA(0): Detected PCI Express Link width: 16X
(--) NVIDIA(0): Interlaced video modes are supported on this GPU
(--) NVIDIA(0): Connected display device(s) on GeForce 9600 GT at PCI:6:0:0:
(--) NVIDIA(0): IBM L170p (CRT-0)
(--) NVIDIA(0): IBM L170p (CRT-1)
(--) NVIDIA(0): IBM L170p (CRT-0): 400.0 MHz maximum pixel clock
(--) NVIDIA(0): IBM L170p (CRT-1): 400.0 MHz maximum pixel clock
(**) NVIDIA(0): TwinView enabled
(II) NVIDIA(0): Assigned Display Devices: CRT-0, CRT-1
(II) NVIDIA(0): Validated modes:
(II) NVIDIA(0): "nvidia-auto-select,nvidia-auto-select"
(II) NVIDIA(0): Virtual screen size determined to be 2560 x 1024
(--) NVIDIA(0): DPI set to (95, 96); computed from "UseEdidDpi" X config
(--) NVIDIA(0): option
(==) NVIDIA(0): Enabling 32-bit ARGB GLX visuals.
(--) Depth 24 pixmap format is 32 bpp
(II) NVIDIA(GPU-1): NVIDIA GPU GeForce 9600 GT (G94) at PCI:131:0:0 (GPU-1)
(--) NVIDIA(GPU-1): Memory: 1048576 kBytes
(--) NVIDIA(GPU-1): VideoBIOS: 62.94.62.00.51
(II) NVIDIA(GPU-1): Detected PCI Express Link width: 16X
(--) NVIDIA(GPU-1): Interlaced video modes are supported on this GPU
(--) NVIDIA(GPU-1): Connected display device(s) on GeForce 9600 GT at
(--) NVIDIA(GPU-1): PCI:131:0:0:
(II) NVIDIA(0): Initialized GPU GART.
(II) NVIDIA(0): Setting mode "nvidia-auto-select,nvidia-auto-select"
(II) Loading extension NV-GLX
(II) NVIDIA(0): NVIDIA 3D Acceleration Architecture Initialized
(==) NVIDIA(0): Disabling shared memory pixmaps
(II) NVIDIA(0): Using the NVIDIA 2D acceleration architecture
(==) NVIDIA(0): Backing store disabled
(==) NVIDIA(0): Silken mouse enabled

# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig: version 1.0 (buildmeister@builder62)  Thu Apr 30 16:21:56 PDT 2009

# Xorg configuration created by pyxf86config

Section "ServerLayout"
    Identifier     "Default Layout"
    Screen      0  "Screen0" 0 0
    InputDevice    "Mouse0" "CorePointer"
    InputDevice    "Keyboard0" "CoreKeyboard"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/input/mice"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
    # keyboard added by rhpxl
    Identifier     "Keyboard0"
    Driver         "kbd"
    Option         "XkbModel" "pc105"
    Option         "XkbLayout" "us"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Monitor"
    Identifier     "Monitor1"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Videocard0"
    Driver         "nvidia"
    BusID          "PCI:6:0:0"
    Screen          0
EndSection

Section "Device"
    Identifier     "Videocard1"
    Driver         "nvidia"
    BusID          "PCI:6:0:0"
    Screen          1
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Videocard0"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "TwinView" "True"
    Option         "MetaModes" "nvidia-auto-select, nvidia-auto-select"
    SubSection     "Display"
        Viewport    0 0
        Depth       24
    EndSubSection
EndSection

Please generate and attach an nvidia-bug-report.log.gz
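(Running nvidia-bug-report.sh as root will generate it.)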

Hi,

I did startx -- -logverbose 6 and generated the attached file.

Can you recommend a second board that might work while I wait for a resolution?

I tried a GTX 8800 and got as far as setting a breakpoint in the kernel, but when I stepped into the kernel I got a message to the effect that the GTX 8800 is not supported by cuda-gdb.

Perhaps this problem is invariant with respect to cards on Fedora Core 9, but if anyone reading this has been able to debug CUDA programs on a card that is not attached to the display on Fedora Core 9, please let me know.
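
In case it helps with comparing setups, here is a quick sketch (standard CUDA runtime API calls) that prints what the runtime actually sees on a given box:

#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("%d CUDA device(s) visible\n", count);
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        /* G9x parts report compute capability 1.1; G80 reports 1.0 */
        printf("device %d: %s (compute capability %d.%d)\n",
               i, prop.name, prop.major, prop.minor);
    }
    return 0;
}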

Thank you very much for your assistance,

Robert
nvidia_bug_report.log.gz (53.9 KB)

In your bug report, I see:

(!!) More than one possible primary device found
(--) PCI: (0@6:0:0) nVidia Corporation Geforce 9600 GT 512mb rev 161, Mem @ 0x91000000/16777216, 0xa0000000/268435456, 0x92000000/33554432, I/O @ 0x00003000/128
(--) PCI: (0@131:0:0) nVidia Corporation Geforce 9600 GT 512mb rev 161, Mem @ 0xb1000000/16777216, 0xc0000000/268435456, 0xb2000000/33554432, I/O @ 0x00005000/128

That’s a known X server bug, and I suspect that’s why both GPUs are getting ‘owned’ by X. I think the only workaround is to go to runlevel 5, then switch back to runlevel 3.
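(For example, as root: init 5, then once X is up, init 3 again.)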

G80 GPUs (GTX 8800) are not supported with cuda-gdb.

That workaround works. Do you know which X server revisions have the bug fix? I tried the cards on SLES10 SP1 and the log states "one primary device found," but I was unable to test there because cuda-gdb exits with a floating point exception. SLES10 SP1 has X server rev 6.9.0, with a release date of 12 Dec 2005.

Thank you for your assistance; I am up and debugging. Good, timely, and accurate info.

Regards,

robullelk

The X server bug isn’t present in RHEL5, which is actually the only supported environment for using cuda-gdb anyway.