OpenCL Profiling

As far as I can see, there’s no way to profile or even view a timeline of OpenCL applications in Nsight, correct? Aside from using event callbacks to get the raw start and stop times of kernels, what other options do I have for profiling or timelining kernels in an OpenCL application?

(Note: I’m using OpenCL because the GPU code needs to be cross-vendor.)
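
For reference, the raw event timings mentioned above can be turned into a basic timeline by hand. The sketch below assumes the start and end timestamps (in nanoseconds) have already been read with clGetEventProfilingInfo using CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END, on a command queue created with CL_QUEUE_PROFILING_ENABLE. The kernel_span type and both function names are hypothetical, made up for this illustration. The output uses the Chrome Trace Event JSON layout, which trace viewers such as chrome://tracing can load. It is a minimal sketch and not a replacement for a real profiler.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* One kernel execution interval (hypothetical type for this sketch). */
typedef struct {
    const char *name;   /* label chosen by the application, not escaped here */
    uint64_t start_ns;  /* value read for CL_PROFILING_COMMAND_START */
    uint64_t end_ns;    /* value read for CL_PROFILING_COMMAND_END */
} kernel_span;

/* Duration of one span in microseconds. */
double span_duration_us(const kernel_span *s) {
    assert(s != NULL && s->end_ns >= s->start_ns);
    return (double)(s->end_ns - s->start_ns) / 1000.0;
}

/* Write the spans into buf as a JSON array of "complete" (ph = "X") trace
 * events. Timestamps are made relative to the earliest start and expressed
 * in microseconds. Returns the number of bytes written (excluding the
 * terminating NUL), or -1 if buf is too small. */
int write_trace_json(const kernel_span *spans, size_t n, char *buf, size_t cap) {
    assert(buf != NULL && (spans != NULL || n == 0));

    /* Find the earliest start so the timeline begins at zero. */
    uint64_t t0 = n ? spans[0].start_ns : 0;
    for (size_t i = 1; i < n; ++i)
        if (spans[i].start_ns < t0) t0 = spans[i].start_ns;

    int w = snprintf(buf, cap, "[");
    if (w < 0 || (size_t)w >= cap) return -1;
    size_t used = (size_t)w;

    for (size_t i = 0; i < n; ++i) {
        w = snprintf(buf + used, cap - used,
                     "%s{\"name\":\"%s\",\"ph\":\"X\",\"pid\":0,\"tid\":0,"
                     "\"ts\":%.3f,\"dur\":%.3f}",
                     i ? "," : "", spans[i].name,
                     (double)(spans[i].start_ns - t0) / 1000.0,
                     span_duration_us(&spans[i]));
        if (w < 0 || (size_t)w >= cap - used) return -1;
        used += (size_t)w;
    }

    w = snprintf(buf + used, cap - used, "]");
    if (w < 0 || (size_t)w >= cap - used) return -1;
    used += (size_t)w;
    return (int)used;
}
```

In an application, one kernel_span would be filled in per kernel event after the event completes, and the resulting buffer written to a .json file. All kernels land on a single track here (pid 0, tid 0); using one tid per command queue would be a natural extension.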

Correct, OpenCL is not supported by Nsight Compute or Nsight Systems.

That’s fine. Are there any tools that anyone can recommend as an alternative?

Also, no problem if there’s no comment on this, but what was the reasoning behind deprecating OpenCL profiling?

You could consider evaluating Score-P or TAU, for example.

Great, thanks! I’ll take a look at those.