How to solve a deadlock in Nsight Compute

My device is an A100 with CUDA 11.7. I compile the *.cu file with nvcc -arch=sm_80 …. Everything is fine when I run the executable directly (it runs to completion). But when I use ncu to profile the kernel, it never finishes. I guess it hits a deadlock, because I use atomicCAS in my kernel. When I compiled with compute capability sm_52 instead, the deadlock also happened in the executable itself. So how can I set the compute capability when using ncu to avoid the deadlock?

Hi, @lingjiayao_cn

Sorry for the issue you ran into.
Have you tried the latest NCU version, 2024.3? Which OS are you using?
Also, can you provide the source code so we can reproduce it?
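
Since the original kernel has not been posted, here is a minimal, hypothetical sketch of the kind of atomicCAS lock being discussed; it is not the poster's code. On GPUs before Volta (compute capability below 7.0), the threads of a warp execute in lockstep, so a plain `while (atomicCAS(mutex, 0, 1) != 0);` spin loop can hang when two threads of the same warp contend for the lock. That may be why the sm_52 build hangs; whether it is related to the hang under ncu cannot be told from this thread. The sketch below uses the commonly cited workaround of keeping the acquire, the critical section, and the release inside one branch. The names `incrementWithLock`, `mutex`, and `counter` are made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread increments a shared counter under a
// global lock. Acquire, critical section, and release all sit inside the
// same branch, so the lock holder does not wait on spinning warp-mates
// before releasing.
__global__ void incrementWithLock(int *mutex, volatile int *counter)
{
    bool done = false;
    while (!done) {
        if (atomicCAS(mutex, 0, 1) == 0) {   // try to take the lock (0 -> 1)
            __threadfence();                 // order the lock before the update
            *counter = *counter + 1;         // critical section
            __threadfence();                 // make the update visible first
            atomicExch(mutex, 0);            // release the lock
            done = true;
        }
    }
}

int main()
{
    int *mutex = nullptr;
    int *counter = nullptr;
    cudaMalloc(&mutex, sizeof(int));
    cudaMalloc(&counter, sizeof(int));
    cudaMemset(mutex, 0, sizeof(int));
    cudaMemset(counter, 0, sizeof(int));

    const int blocks = 4;
    const int threads = 64;
    incrementWithLock<<<blocks, threads>>>(mutex, counter);

    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    int result = 0;
    cudaMemcpy(&result, counter, sizeof(int), cudaMemcpyDeviceToHost);
    printf("counter = %d (launched %d threads)\n", result, blocks * threads);

    cudaFree(mutex);
    cudaFree(counter);
    return result == blocks * threads ? 0 : 1;
}
```

Assuming the file is saved as lock_sketch.cu (a made-up name), it can be built with `nvcc -arch=sm_80 lock_sketch.cu -o lock_sketch` and then profiled with `ncu ./lock_sketch`. Note that ncu profiles the binary as it was built, so the target architecture is chosen at compile time with nvcc.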