I’m relatively new to CUDA programming. When debugging with the Nsight Compute extension in Visual Studio, the breakpoint in my kernel function is only ever hit for even threadIds. Is it possible that the debugger ignores odd threadIds?
Thanks for using our tool. However, I do not quite understand your problem.
Can you clarify the steps you took in Nsight Compute and the output you got that you think is incorrect? (A screenshot might help.)