Hi, I want to use the “Source Counters” section of Nsight Compute to see how a CUDA kernel's source lines correlate with its SASS instructions. I have rebuilt PyTorch with additional flags:
USE_CUDA=1 DEBUG_CUDA=1 python setup.py develop ===> PyTorch builds successfully
In PyTorch's CMakeLists.txt, DEBUG_CUDA enables “-lineinfo”. But when I profile torch.mm or torch.softmax again, I still cannot see any CUDA C source correlated with SASS in “Source Counters”. Is something wrong with my setup, or is this simply not possible?
I also tried adding “-lineinfo” to CUDA_NVCC_FLAGS, but it still does not work.
Resolved: I updated to the newest Nsight Compute, and it works now.
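For anyone landing here later, this is a rough sketch of the kind of invocation involved, not the exact command used above. It only builds the command line and does not run anything. The script name profile_mm.py and the report name are hypothetical, and the flag spellings are assumed from recent Nsight Compute releases, so check them against `ncu --help` for your version.

```python
import shlex


def build_ncu_command(script, report="mm_softmax_report"):
    """Return an argv list that would profile `script` under Nsight Compute.

    Assumptions (not from the original post): the section identifier
    "SourceCounters" and the "--import-source" option match your ncu
    version, and `script` is a small file calling torch.mm / torch.softmax.
    """
    return [
        "ncu",
        "--section", "SourceCounters",  # collect only the source-level counters
        "--import-source", "yes",       # embed source files in the report, if found
        "-o", report,                   # output report name
        "python", script,               # the workload to profile
    ]


if __name__ == "__main__":
    # Print the command so it can be pasted into a shell.
    print(shlex.join(build_ncu_command("profile_mm.py")))
```

The report can then be opened in the Nsight Compute UI. Source correlation still depends on the kernels having been compiled with -lineinfo, as in the build above.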