I have tried to use NVIDIA Nsight Next-Gen in Visual Studio 2022 Community Edition. It works fine when looking at device parameters in the main() program, but stepping into a kernel does not work. If I put a breakpoint inside a kernel, I get the message “The breakpoint will not be hit. No executable part of the debuggers code is associated with this breakpoint.” I placed the breakpoint on a line that sets a float to zero. If I do not place a breakpoint in the kernel, stepping into it opens a window (see attached picture) that is trying to “find source.” Is there a way to make simple debugging actions work inside a kernel the way they do in the host debugger?
Hello @empeirikos2718 and thank you for reaching out. Just to check that we are on the same page: this is a debug build, with nothing special in the options that disables debug info, right?
This is definitely a debug build. How do I check that there is nothing in the options that disables debug info? Apologies, I am new to this.
Hi @empeirikos2718, OK, so this is a debug build. You can (double-)check that debug info generation is enabled by opening the project properties and looking under CUDA C/C++ > Device > Generate GPU Debug Information. The setting should be “Yes (-G)”.
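As a complement to checking the property page, here is a minimal sketch of a runtime check. It relies on the `__CUDACC_DEBUG__` macro, which nvcc defines when compiling in device-debug mode (-G). The kernel name `reportDeviceDebug` and the variable names are made up for illustration, and this is only one possible way to confirm the setting, not something Nsight requires.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper kernel: writes 1 if this file was compiled with -G.
__global__ void reportDeviceDebug(int *flag)
{
#ifdef __CUDACC_DEBUG__
    *flag = 1;   // nvcc defines __CUDACC_DEBUG__ in device-debug mode (-G)
#else
    *flag = 0;   // device code was built without -G
#endif
}

int main()
{
    int *dFlag = nullptr;
    int hFlag = -1;

    if (cudaMalloc(&dFlag, sizeof(int)) != cudaSuccess) {
        std::printf("cudaMalloc failed\n");
        return 1;
    }

    reportDeviceDebug<<<1, 1>>>(dFlag);

    // Wait for the kernel and surface any launch or execution error.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::printf("kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(dFlag);
        return 1;
    }

    cudaMemcpy(&hFlag, dFlag, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dFlag);

    std::printf("GPU debug information (-G): %s\n",
                hFlag == 1 ? "enabled" : "not enabled");
    return 0;
}
```

If this reports "not enabled" in a Debug configuration, the Generate GPU Debug Information setting is the first thing to revisit.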
Now zooming out a little, let’s check a couple more things. You mentioned “stepping into a kernel” and I want to make sure we are on the same page. Stepping into (i.e. F11) a kernel launch statement, i.e. the line containing the triple angle brackets call to the kernel, would cause undesired behavior not unlike what you have described. To break inside a kernel, you need to either put a breakpoint inside the kernel or enable “Break on launch” from the Nsight sub-menu under the Extensions menu. But breakpoints bring us to the second issue you are experiencing: the breakpoint not resolving properly. Can you please try adding 10 lines of space (like empty lines or placeholder comments) between your device (GPU) code and any host code and see if that helps? This might sound strange, but there is a quirk related to how CUDA sources are processed vs. how Visual Studio treats source lines that could be causing this. Please let me know if any of these suggestions work.
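To make the distinction concrete, here is a small self-contained sketch. The kernel name `zeroFill`, the array size, and the launch configuration are placeholders chosen for illustration, not taken from your project. The comments mark the line where a device breakpoint belongs and the launch line where F11 will not take you into device code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, similar in spirit to "a line that sets a float to zero".
__global__ void zeroFill(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float value = 0.0f;   // place the device breakpoint on a line like this
        data[i] = value;
    }
}

int main()
{
    const int n = 256;
    float *dData = nullptr;

    if (cudaMalloc(&dData, n * sizeof(float)) != cudaSuccess) {
        std::printf("cudaMalloc failed\n");
        return 1;
    }

    // This is the triple-angle-bracket launch statement. Pressing F11 here
    // steps over host-side launch code; it does not enter the kernel body.
    zeroFill<<<(n + 127) / 128, 128>>>(dData, n);

    // A kernel that never launches can never hit a breakpoint, so check both
    // the launch itself and the execution.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();
    }
    std::printf("kernel status: %s\n", cudaGetErrorString(err));

    cudaFree(dData);
    return err == cudaSuccess ? 0 : 1;
}
```

With “Break on launch” enabled, the debugger should stop at the start of `zeroFill` even without an explicit breakpoint.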
Thank you
Many thanks for your detailed answer. I have made sure the option “Generate GPU debug information” is set to “Yes.” On the subject of adding “10 lines of space … between your device (GPU) code and any host code”: do you mean that if I write something like
__global__ void mykernel( some inputs)
{
int i = threadIdx.x;
some more code
}
I should modify it to be more like
__global__ void mykernel( some inputs)
{
empty space (at least ten lines)
int i = threadIdx.x;
some more code
}
Or perhaps the space you are referring to is between the kernel launch
empty space
mykernel<<<Nblocks, Nthreads>>>(some inputs);
empty space
and the rest of the code in the main() program?
The former 😉 But not inside the function – before and after the function:
// 10 lines of comments
__global__ void mykernel( some inputs)
{
int i = threadIdx.x;
some more code
}
// 10 lines of comments
By the way, the “break on launch” option (Extensions > Nsight > Break On Launch) might be of help here too.
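For completeness, a compilable sketch of that layout is below. The kernel `scaleKernel`, its parameters, and the sizes are invented for illustration; the only point is the ten placeholder comment lines separating the device function from the host code around it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// padding line 1
// padding line 2
// padding line 3
// padding line 4
// padding line 5
// padding line 6
// padding line 7
// padding line 8
// padding line 9
// padding line 10
__global__ void scaleKernel(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;   // candidate line for a device breakpoint
    }
}
// padding line 1
// padding line 2
// padding line 3
// padding line 4
// padding line 5
// padding line 6
// padding line 7
// padding line 8
// padding line 9
// padding line 10

int main()
{
    const int n = 128;
    float *dData = nullptr;

    if (cudaMalloc(&dData, n * sizeof(float)) != cudaSuccess) {
        std::printf("cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(dData, 0, n * sizeof(float));

    scaleKernel<<<1, n>>>(dData, 2.0f, n);
    cudaError_t err = cudaDeviceSynchronize();
    std::printf("kernel status: %s\n", cudaGetErrorString(err));

    cudaFree(dData);
    return err == cudaSuccess ? 0 : 1;
}
```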
Thank you.

