I'm trying to see the overlap between CPU and GPU work, and what the CPU is doing while the GPU is idle. Can Nsight show which functions the CPU is executing, or when it is idle, without my having to instrument the code with NVTX?
I imagine the resolution would be fairly coarse (say, one sample every 1 ms), but that would be good enough.
Here’s what I’ve tried so far:
- Enabled CPU profiling in Nsight: it doesn’t do anything
  (same problem this guy had)
- Ran nvprof with --cpu-profiling on, after compiling with -g (good profilers use DWARF info to unwind the call stack) and -fno-omit-frame-pointer (a simpler way to unwind the call stack)
  This does show some call stacks, but only as numeric addresses
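
For reference, the second attempt looks roughly like the sketch below. The names app.cu and ./app are placeholders, not my real project, and passing the frame-pointer flag through -Xcompiler is my assumption about how to get it to the host compiler when building with nvcc. The script skips the build if the tools or the source file are missing.

```shell
# Placeholder names: app.cu and ./app stand in for the real source and binary.
SRC=app.cu
BIN=./app

# -g keeps host debug info (DWARF); -Xcompiler forwards the next option
# to the host compiler, here to keep frame pointers for stack unwinding.
NVCC_FLAGS="-g -Xcompiler -fno-omit-frame-pointer"

if command -v nvcc >/dev/null 2>&1 && command -v nvprof >/dev/null 2>&1 && [ -f "$SRC" ]; then
    # Build with debug info and frame pointers.
    nvcc $NVCC_FLAGS -o "$BIN" "$SRC"
    # Run under nvprof with CPU sampling turned on.
    nvprof --cpu-profiling on "$BIN"
else
    echo "nvcc, nvprof, or $SRC not found; skipping build and profile"
fi
```

With this setup the CPU profile prints, but the call stack entries come out as raw addresses.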