Cannot break in or step into CUDA __device__ functions

I cannot set breakpoints in __device__ functions or step into them. Everything works correctly in the __global__ kernel function: the debugger stops at breakpoints there and shows the values of variables. The device functions also seem to execute correctly; I just cannot debug them.

As for my system setup, I am using Nsight 2.2 (just downloaded today) with Visual Studio 2008 SP1 and a GeForce GTX 680 with driver R306.97 (branch: r306_41-13, dated 10-2-2012). I have also set nvcc to generate device debug info (-G) in the project properties.
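For illustration, here is a minimal sketch of the pattern in question: a __global__ kernel that calls a __device__ function. It is not the actual project code. The names square, squareKernel, and run_demo are hypothetical, and the __noinline__ qualifier is an added assumption to keep the device function as a separate call that a debugger could step into.

```cuda
// Build with device (-G) and host (-g) debug info, for example:
//   nvcc -g -G demo.cu -o demo
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical device function. __noinline__ is an assumption made here so
// the function remains a real call rather than being folded into the kernel.
__device__ __noinline__ float square(float x)
{
    float result = x * x;   // candidate line for a device-function breakpoint
    return result;
}

// Kernel that calls the device function once per element.
__global__ void squareKernel(const float* in, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = square(in[i]);   // candidate line for "step into"
}

// Runs the kernel on 8 values and returns true if every result is correct.
bool run_demo()
{
    const int n = 8;
    float h_in[n], h_out[n];
    for (int i = 0; i < n; ++i)
        h_in[i] = (float)i;

    float* d_in = NULL;
    float* d_out = NULL;
    if (cudaMalloc(&d_in, n * sizeof(float)) != cudaSuccess)
        return false;
    if (cudaMalloc(&d_out, n * sizeof(float)) != cudaSuccess) {
        cudaFree(d_in);
        return false;
    }

    cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);
    squareKernel<<<1, n>>>(d_in, d_out, n);
    // This blocking copy waits for the kernel and reports any earlier error.
    cudaError_t err = cudaMemcpy(h_out, d_out, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess)
        return false;

    for (int i = 0; i < n; ++i)
        if (h_out[i] != h_in[i] * h_in[i])
            return false;
    return true;
}

int main()
{
    bool ok = run_demo();
    printf("%s\n", ok ? "results correct" : "results wrong or CUDA error");
    return ok ? 0 : 1;
}
```

In a sketch like this, the symptom would be that a breakpoint inside squareKernel is hit while one inside square is not, even though the computed results are correct.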

Any help would be much appreciated.

Which CUDA toolkit version are you using?

I am not sure which version of the toolkit I was using. I am guessing 4.2, or whatever is bundled with the Nsight 2.2 debugger.

Actually, I got an email about Nsight 3.0 RC and installed that. With the new Nsight, I can debug device functions correctly. The only reason I was using 2.2 before is that I had just signed up for the developer program and hadn’t gotten any emails about Nsight 3.0 yet. I guess you guys fixed the problem already. I was actually quite confused that everyone was talking about 3.0 while I could find no download for it.

P.S. Is there any way to get emails sent about replies to threads? I didn’t see an option in the setup page for the forum.

Glad to hear that it’s working with 3.0 RC1, which was the first public release of 3.0.