It looks like there are two profilers at this point, nsys and ncu. Which should I be using? What are the benefits of each? I’m on CUDA 11.4, and I don’t expect to upgrade to 12.x any time soon. If it matters, I’ll be profiling within a docker container, so I do need a way to generate profile data via the command line. I can view via GUI outside the container. I’ll be profiling on both x86-64 and aarch64 (Xavier Orin).
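This blog series (parts 1, 2, 3) is designed to answer these questions. You would generally start with Nsight Systems (nsys), which gives you a system-wide timeline of CPU and GPU activity. Once you have narrowed your focus down to the behavior of a kernel or kernels, you would shift to Nsight Compute (ncu), which collects detailed metrics for individual kernels.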
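Since you need to generate profile data from the command line inside a container, here is a rough sketch of that two-step workflow. The application path `./my_app`, the kernel name `my_kernel`, and the output names `timeline` and `kernel_report` are placeholders I made up, not anything from your setup. By default the script only prints the commands; set `RUN=1` to actually execute them where nsys and ncu are installed.

```shell
#!/bin/sh
# Sketch of an nsys-then-ncu workflow. All names below are hypothetical
# placeholders: replace ./my_app and my_kernel with your own.
set -eu

APP="${APP:-./my_app}"        # assumed application binary
KERNEL="${KERNEL:-my_kernel}" # assumed name of the kernel to inspect
RUN="${RUN:-0}"               # RUN=1 executes; otherwise commands are only printed

# Execute the given command when RUN=1, else print it (dry run).
run_or_print() {
  if [ "$RUN" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

# Step 1: Nsight Systems, system-wide timeline with CUDA, NVTX and OS runtime traces.
profile_timeline() {
  run_or_print nsys profile --trace=cuda,nvtx,osrt --output=timeline "$APP"
}

# Step 2: Nsight Compute, detailed metrics for one launch of one kernel.
profile_kernel() {
  run_or_print ncu --kernel-name "$KERNEL" --launch-count 1 --set full \
    --export kernel_report "$APP"
}

profile_timeline
profile_kernel
```

Both tools write report files that you can copy out of the container and open in the corresponding GUI on your host. Depending on how the container is configured, ncu may need elevated privileges to read GPU performance counters, so treat that as something to check rather than a given.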