cuda-gdb hang and compiled program spewing nonsense

Hi, I’m fairly new to CUDA. I’ve been working on some code recently and I’m having serious problems with cuda-gdb and with my program in general.

I’m running this on:

Device 0:

Name: Tesla C1060

Compute capability: 1.3

Linux 2.6.18-194.11.1.el5 #1 SMP Tue Aug 10 16:39:28 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

NVRM version: NVIDIA UNIX x86_64 Kernel Module  260.19.26  Mon Nov 29 00:53:44 PST 2010

GCC version:  gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)

nvcc: NVIDIA (R) Cuda compiler driver

* The program runs to completion when executed normally, but under cuda-gdb it hangs in cudaMalloc() and never finishes.

If I pause execution in cuda-gdb while stepping over cudaMalloc(), I can run a backtrace and find:

#0  0x0000003e366cd1c3 in __select_nocancel () from /lib64/

#1  0x00002b24d0f07fd8 in ?? () from /usr/lib64/

#2  0x00002b24d0f059fe in ?? () from /usr/lib64/

#3  0x00002b24d0f05cb9 in ?? () from /usr/lib64/

#4  0x00002b24d0ea3faf in ?? () from /usr/lib64/

#5  0x00002b24d0e9a3ed in ?? () from /usr/lib64/

#6  0x00002b24d0f6efa4 in ?? () from /usr/lib64/

#7  0x00002b24d0bc066d in ?? () from /usr/local/cuda/lib64/

#8  0x00002b24d0bb7b1a in ?? () from /usr/local/cuda/lib64/

#9  0x00002b24d0bb1379 in ?? () from /usr/local/cuda/lib64/

#10 0x00002b24d0be683b in cudaMalloc () from /usr/local/cuda/lib64/

#11 0x0000000000405c89 in Lattice::initialiseCuda (this=0x7fff35849b10) at

Why is cudaMalloc() stuck in __select_nocancel () from /lib64/ ??

Because of this hang I can’t debug my kernel which I believe has problems too.

If I comment out the kernel completely then cudaMalloc() does not hang but then I can’t do any debugging in the kernel.

This is the first CUDA runtime call the program makes, which I believe performs initialisation, so maybe something is going wrong there?

* The application sometimes dumps to standard error or standard output (I’m not sure which) the following…

#.word 18, 130, 0x82aa8

#.word 18, 132, 0x82b58

#.word 18, 134, 0x82ba8

#.word 18, 136, 0x82bd0

#.word 18, 138, 0x82c20

#.word 18, 143, 0x82c48

and it continues like that for quite a while… Does anyone know what that output means?

My code is on GitHub if people want to see what I’m working with.


cuda-gdb hangs like that normally happen if you try debugging on a display GPU, which isn’t supported. If you have multiple devices you will need to make sure the process selects a non-display card to run on.
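
One way to do that without changing the program, assuming the installed CUDA toolkit honours the CUDA_VISIBLE_DEVICES environment variable, is to hide every card except the one you want from the process. The sketch below wraps that in a small shell helper. The name run_on_cuda_device is hypothetical, and the index to pass is the one CUDA assigns to a non-display card, which is not guaranteed to match the numbering nvidia-smi prints.

```shell
# Run a command with only one CUDA device visible to it.
# Usage: run_on_cuda_device <cuda-device-index> <command> [args...]
# Assumption: the CUDA runtime in use reads CUDA_VISIBLE_DEVICES.
run_on_cuda_device() {
    device="$1"
    shift
    # The variable is set only for this one command, not for the shell.
    CUDA_VISIBLE_DEVICES="$device" "$@"
}
```

For example, run_on_cuda_device 1 cuda-gdb ./test would start the debugger with only device 1 exposed. Inside the process that card then appears as device 0, so any cudaSetDevice() call has to use 0.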

Yeah, make sure X is not running if it’s a single-GPU system, i.e. use console mode, or use VNC or nxclient to debug the target system remotely.

Thanks for the replies. In this particular instance the server I was using was running X, and I was using the first card, which is probably the one X is using. Is there a way to check? nvidia-smi said a display was plugged in, but that doesn’t necessarily mean X is using it.

The machine has 4 cards in it. Presumably I’ll be able to use cuda-gdb with my program using one of those cards whilst X is running, right?

I think there is a bigger issue here than me picking the wrong card, however, as I have previously forced the program to pick a different card and it still hung in cuda-gdb. I need to check where in the stack this is happening.

Unfortunately the machine has been shut down over the weekend, so I’ll report back in a few days.

The easiest thing to do on a machine running X11 is to use nvidia-smi to put the display card(s) into compute prohibited mode. That will guarantee that cuda-gdb can’t try to establish a context on a card which won’t work.

Couldn’t I just put all the other cards apart from the one X is using into COMPUTE exclusive mode?

Unfortunately I can’t use nvidia-smi to set the compute mode, as I do not have root access on the computer I’m running on. I have 4 cards, and nvidia-smi -q -a tells me the following (with the unnecessary stuff stripped out):

Driver Version                  : 260.19.26

GPU 0:

        Product Name            : Quadro FX 3800

        PCI Device/Vendor ID    : 5ff10de

        PCI Location ID         : 0:1:0

        Display                 : Connected


            GPU                 : 0%

            Memory              : 3%

GPU 1:

        Product Name            : Tesla C1060

        PCI Device/Vendor ID    : 5e710de

        PCI Location ID         : 0:2:0

        Display                 : Not connected


            GPU                 : 0%

            Memory              : 0%

GPU 2:

        Product Name            : Tesla C1060

        PCI Device/Vendor ID    : 5e710de

        PCI Location ID         : 0:83:0

        Display                 : Not connected


            GPU                 : 0%

            Memory              : 0%

GPU 3:

        Product Name            : Tesla C1060

        PCI Device/Vendor ID    : 5e710de

        PCI Location ID         : 0:84:0

        Display                 : Not connected


            GPU                 : 0%

            Memory              : 0%

However, I have tried calling each of these before the first cudaMalloc(), and cudaMalloc() would still hang!

cudaSetDevice(1); //cudaMalloc() still hanged later!

cudaSetDevice(2); //cudaMalloc() still hanged later!

cudaSetDevice(3); //cudaMalloc() still hanged later!

I’ve produced self-contained code that reproduces the problems I’ve been having with cuda-gdb. Would anyone be willing to try to reproduce this on their system? The code is attached to this post.

To compile run…

nvcc -G -g -o test

Now running cuda-gdb

cuda-gdb test

(gdb) break 68

(gdb) run 0 6 3

(gdb) n

###IT WILL HANG HERE! Press CTRL+C to pause execution

Program received signal SIGINT, Interrupt.

0x0000003e366cd1c3 in __select_nocancel () from /lib64/

Thanks in advance.

I also just noticed Appendix B, Known Issues, in the cuda-gdb manual. It says the following:

So it shouldn’t matter if one of the cards is being used by X. cuda-gdb should ignore it (even if I don’t call cudaSetDevice(); - cudaMalloc() still hangs later!).

I’ve been in contact with someone from Nvidia and we eventually tracked down the problem.

The system I use is a multi-user system, and several of the users are working on CUDA projects right now.

I am aware that multiple users cannot use cuda-gdb at the same time, but that wasn’t the case here: I checked that no one else was running cuda-gdb with:

ps -AF | grep cuda

The hang in cudaMalloc() under cuda-gdb was caused by an old cuda-gdb session (run by another user) that had crashed and left a pipe behind in /tmp/.

Running the following command shows this:

ls -l /tmp/cuda*

prw-r----- 1 dl7749 stapdev    0 Feb 15 10:18 /tmp/cuda-gdb.pipe.2.3

-rw------- 1 aa7086 stapdev 2680 Feb 10 16:43 /tmp/cudagdb.rtobj.jrmNEx

-rw------- 1 dl7749 stapdev 2680 Feb 15 10:18 /tmp/cudagdb.rtobj.R0Nyw3

Deleting the old pipe (/tmp/cuda-gdb.pipe.2.3) fixed the problem. You might as well delete the temporary object files too.

The lesson here is: if cuda-gdb crashes, clean up its leftover files in /tmp/ so you can run cuda-gdb again.

It would be nice if cuda-gdb handled this situation more gracefully instead of giving the impression that the program being debugged is causing cudaMalloc() to hang, but at least there is a solution to the problem now.
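
For anyone who hits the same hang, the check described above can be sketched as a small shell helper. It only lists leftover named pipes and deletes nothing. The function name list_cuda_gdb_pipes is hypothetical, and the cuda-gdb.pipe.* pattern is taken from the listing in this thread, so other cuda-gdb versions may name their pipes differently.

```shell
# List named pipes that look like cuda-gdb leftovers in a directory.
# Usage: list_cuda_gdb_pipes [directory]   (defaults to /tmp)
# Assumption: the pipes follow the 'cuda-gdb.pipe.*' naming seen above.
list_cuda_gdb_pipes() {
    dir="${1:-/tmp}"
    # -type p matches FIFOs only, so regular files such as the
    # cudagdb.rtobj.* temporaries are not reported here.
    find "$dir" -maxdepth 1 -type p -name 'cuda-gdb.pipe.*' 2>/dev/null
}
```

Run list_cuda_gdb_pipes with no arguments to inspect /tmp, confirm that no cuda-gdb session is still alive, and then remove the reported files with rm. If /tmp has the sticky bit set, as it usually does, only the owner of a pipe or root can delete it, so a pipe left by another user may need that user or an administrator to remove it.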