I think I have found a bug while using Nsight Systems to profile a CUDA program. The relevant code looks like this:
if (condition check)
{
    return;
}
cudaKernel<<<>>>();
cudaDeviceSynchronize();
The program works properly, but Nsight Systems shows no CUDA HW information, and no streams or kernels either. Once I removed the return line, Nsight Systems worked fine.
Does the code flow always execute cudaKernel? Is it possible that the ‘if’ condition is true under certain conditions that are more likely to occur when you profile the code with Nsight Systems? Does Nsight Systems profile the kernels launched before the ‘if’ condition?
It would help us debug the issue if you could provide a minimal reproducer and the nsys command line options you used.
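As a starting point, a minimal reproducer might look roughly like the sketch below. It is only an illustration of the pattern described above, not the original program: the kernel `dummyKernel`, the `SKIP_KERNEL` environment variable standing in for the condition check, and the error-handling helper `check` are all hypothetical. The message on stderr makes it visible whether the early return is taken during a profiled run.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real kernel: writes a single value.
__global__ void dummyKernel(int *out) { *out = 42; }

// Hypothetical helper: report a CUDA error and exit.
static void check(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

int main() {
    // Assumption: an environment variable replaces the original condition check.
    if (std::getenv("SKIP_KERNEL") != nullptr) {
        std::fprintf(stderr, "early return taken, no kernel launched\n");
        return 0;
    }

    int *d_out = nullptr;
    check(cudaMalloc(&d_out, sizeof(int)), "cudaMalloc");

    dummyKernel<<<1, 1>>>(d_out);
    check(cudaGetLastError(), "kernel launch");          // launch errors
    check(cudaDeviceSynchronize(), "cudaDeviceSynchronize"); // execution errors

    int h_out = 0;
    check(cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost),
          "cudaMemcpy");
    std::printf("kernel result: %d\n", h_out);

    check(cudaFree(d_out), "cudaFree");
    return 0;
}
```

Assuming the file is saved as repro.cu, it could be built with `nvcc -o repro repro.cu` and profiled with something like `nsys profile --trace=cuda -o report ./repro`. If the "early return taken" message appears only under the profiler, the condition is behaving differently in that environment; if it never appears and the report still lacks CUDA data, that points more toward the tool.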