Nsight Eclipse: unable to debug host code once the first device call is hit


After setting up my project under Nsight Eclipse, I was attempting to debug the host-side code.

All is well until the first CUDA device call is executed; Nsight then informs me that it cannot debug on the same GPU that is driving the screen. That is very reasonable, except I’m not attempting to debug any device code. In fact, I’m not even compiling with the -G flag, so no device-side debug symbols are generated!

I’m compiling with -g -O0 -w under Ubuntu 14.04 with CUDA toolkit 6.5.

There must be a simple workaround so that I will at least be able to debug the host-side code? Is there some odd Nsight setting that I’m missing here?

On Windows I can perform single-GPU debugging via Visual Studio and Nsight, but that’s a different topic.

Big thanks for your input!

eclipse constantly reminds me of software preemption debugging (single GPU debugging, etc.); it requires compute capability 3.5 and can be enabled in the preferences

otherwise, consider splitting your project in two, separating the host code from the device code by moving the device section to a shared or static library
then you should be able to apply slightly different project settings to the device section, since it still belongs to your project but lives in a separate project
it should then also be easy to jump over/ skip/ “short”/ simulate the device section, so that you can focus only on the host part

you should be able to reduce your device section to a single library/ function call; you can link in a release build of it, and hopefully eclipse would simply let you step over it…
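
to make the “simulate the device section” idea concrete, here is a rough sketch in plain C++. all names in it (scale_on_device, HOST_ONLY_STUB, run_host_pipeline) are made up for illustration; they are not part of CUDA or Nsight. the assumption is that the host code only ever calls one plain function, that the real CUDA implementation of that function lives in a separate .cu library, and that a dedicated host-debug build swaps in a CPU stand-in, so no device call is ever made while you step through the host part

```cpp
// Sketch only: every name here is hypothetical, not a CUDA or Nsight API.
#include <cstddef>
#include <vector>

// The only thing the host code knows about the device section.
// In a real project this declaration would sit in a shared header, and the
// CUDA implementation would be built separately (e.g. as a static library).
void scale_on_device(std::vector<float>& data, float factor);

// Assumption: a dedicated "host debug" build configuration defines this.
#ifndef HOST_ONLY_STUB
#define HOST_ONLY_STUB 1
#endif

#if HOST_ONLY_STUB
// CPU stand-in for the device section. It produces the same result as the
// (assumed) kernel would, but never touches the GPU, so the debugger has no
// device call to trip over.
void scale_on_device(std::vector<float>& data, float factor) {
    for (std::size_t i = 0; i < data.size(); ++i) {
        data[i] *= factor;
    }
}
#endif

// Host-side logic you actually want to step through.
float run_host_pipeline(std::vector<float> input) {
    scale_on_device(input, 2.0f);  // step over this line in the debugger
    float sum = 0.0f;
    for (float v : input) {
        sum += v;
    }
    return sum;
}
```

with the stub compiled in, a call such as run_host_pipeline({1.0f, 2.0f, 3.0f}) runs entirely on the CPU; switching HOST_ONLY_STUB to 0 and linking the real library should give you the device path back without touching the host code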

So on a single GPU with CC 3.5 I should be able to a) step over device calls on the host and b) debug the actual device code?

I’m currently on a CC 3.0 device (CC 5.2 arrives tomorrow :-). Can host-side debugging still be performed on this CC 3.0 device?

I’m afraid tearing up my project would be a huge undertaking; it would be simpler/cheaper to upgrade the hardware…

“So on a single GPU with CC 3.5 I should be able to a) step over device calls on the host b) debug the actual device code?”

I believe so…

for a cc 3.0, i think you are asking for too much

But on CC 3.5 I need to turn on preemption, right?


But being able to debug host-side code on a CC 3.0 device doesn’t seem like an unreasonable requirement. Can this really not be done in Nsight?

i doubt that whoever wrote the message box would put cc 3.5 when they really meant cc 3.0…

your very capable device arrives in the morrow, so perhaps just hold your breath for a moment

Yeah, I just noticed that there is no compilation option for CC 5.2 in Nsight, only up to 5.0. How do I configure this manually, xiaojimi?

If you want the compilation option for CC 5.2, update your CUDA toolkit to the CUDA 6.5 update for Maxwell Gen2:


The original production release was 6.5.14
The CC 5.2 update is 6.5.19

Ah, there is an additional toolkit 6.5 update! Thanks!

It still crashes in host code, now at random when inspecting variables in Nsight.

This is running SM 5.2 devices with the latest toolkit:

Ubuntu 14.04

Is there anything I need to set up before attempting single-GPU debugging on a Linux system? I’m assuming that Nsight uses GDB in the backend; does that need to be manually configured?


eclipse: window > preferences > nsight > enable cuda software preemption debugging…?

Yes, it’s already enabled…


your salvation now rests with txbob; he should be around shortly

but are you confident that single gpu debugging is indeed not working - it does not seem to crash at the first device code instance anymore…?

secondly, what happened to your old card; why not put grandpa next to the new kid on the block to drive the screen…?

I had mixed success: sometimes it crashed at the first CUDA call (a device synchronize), and other times I actually managed to enter a kernel by setting a breakpoint and running, though it then crashed inside the kernel…

The old card is at a different workplace…

i am not sure what “single gpu debugging is correctly enabled, but not working” would actually look like or manifest as…

perhaps write a very elementary host program with a very elementary gpu kernel, just to test the waters…

if you can successfully debug that…

Does anyone have any clue on this issue? Have there been any related updates to the SDK to make single-GPU debugging work as promised?