I’m profiling a kernel using nvprof and ncu.
The gld_efficiency metric reported by nvprof shows this:
But the corresponding metric in Nsight Compute shows this:
The manual says they are the same metric, and both indicate whether any bandwidth is being wasted. So why do they differ so much?
Which one is right? Thank you!