Question about ncu profiling
I am trying to profile kernels from cuda-samples. One sample can be profiled, but the other cannot. Is there a reason for this? The failing case prints the following:
==WARNING== No kernels were profiled.
/opt/nvidia/NVIDIA-Nsight-Compute-2021.2/ncu --target-processes all --set default matrixMul
/opt/nvidia/NVIDIA-Nsight-Compute-2021.2/ncu --target-processes all --set default bandwidthTest
Thanks for your help.
Assuming you’re running this test: cuda-samples/Samples/1_Utilities/bandwidthTest at master · NVIDIA/cuda-samples · GitHub. I don’t think there are any CUDA kernels in that test. It is just a series of memory copies to measure memory performance. Nsight Compute is mainly for profiling CUDA kernels as they run on the device. For example, this is a kernel launch in matrixMul:
<<<grid, threads, 0, stream>>>(d_C, d_A, d_B, dimsA.x, dimsB.x);
Nsight Systems could help you visualize memory copy performance. Please let me know if this answers your question or if there are any additional details I could provide.
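As a rough sketch of the Nsight Systems route, assuming `nsys` is on your PATH and the bandwidthTest binary is in the current directory (both are assumptions, adjust for your setup), a run like this should record the memory copies. The `--stats=true` option prints summary tables after the run, which should include the CUDA memory operations:

```shell
# Assumed location of the sample binary; adjust for your build.
APP=./bandwidthTest

if command -v nsys >/dev/null 2>&1 && [ -x "$APP" ]; then
    # Record a timeline of the run and print summary statistics afterwards.
    nsys profile --stats=true "$APP"
else
    echo "nsys or $APP not found; skipping"
fi
```

The generated report file can also be opened in the Nsight Systems GUI to view the copies on a timeline.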
Thank you for your suggestion. I am trying to dig into the problem.
It seems ncu is too heavyweight for profiling in my case (nsys profile runs successfully).
I am considering profiling each CUDA kernel with ncu separately.
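A minimal sketch of what profiling one kernel at a time could look like, assuming `ncu` is on your PATH, the matrixMul binary is in the current directory, and the kernel's function name is `MatrixMulCUDA` (the name used in the cuda-samples source; check the names in your own build). `--kernel-name` restricts collection to kernels with that name and `--launch-count` limits how many launches are profiled, which should reduce overhead compared with profiling every launch:

```shell
# Assumed values: adjust the binary path and kernel name for your setup.
APP=./matrixMul
KERNEL=MatrixMulCUDA

if command -v ncu >/dev/null 2>&1 && [ -x "$APP" ]; then
    # Profile only the first launch of the named kernel.
    ncu --kernel-name "$KERNEL" --launch-count 1 "$APP"
else
    echo "ncu or $APP not found; skipping"
fi
```

Repeating the command with a different `KERNEL` value would collect each kernel in its own run.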