CUDA Pro Tip: Generate Custom Application Profile Timelines with NVTX

Originally published at:

The last time you used the timeline feature in the NVIDIA Visual Profiler, Nsight VSE or the new Nsight Systems to analyze a complex application, you might have wished to see a bit more than just CUDA API calls and GPU kernels. In this post I will show you how you can use the NVIDIA…

Is it just me, or are all the colors in your example transparent?

Hi Elad, sorry for the long delay in responding. The colors were all transparent because the alpha channels were set to 00 in the initial version of this post. The post and the code examples have been fixed. Thanks, Jiri
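For anyone hitting the same transparency issue: with `NVTX_COLOR_ARGB`, the color is a 32-bit value laid out as 0xAARRGGBB, so an alpha byte of 00 produces a fully transparent range. The following is only a minimal sketch, not the code from the post; `opaque_argb` is a hypothetical helper name introduced here for illustration.

```cpp
#include <cstdint>

// With colorType = NVTX_COLOR_ARGB, NVTX reads the color as 0xAARRGGBB.
// Hypothetical helper (not part of NVTX): force the alpha byte to 0xFF
// so the range is drawn fully opaque in the timeline.
constexpr std::uint32_t opaque_argb(std::uint32_t argb) {
    return argb | 0xFF000000u;
}

// Alpha 00: this green would be rendered transparent.
constexpr std::uint32_t transparent_green = 0x0000FF00u;
// Alpha FF: same green, now opaque.
constexpr std::uint32_t green = opaque_argb(transparent_green);

static_assert(green == 0xFF00FF00u, "alpha byte should be 0xFF");

// With the NVTX header included, the value would be assigned like this:
//   nvtxEventAttributes_t attr = {};
//   attr.colorType = NVTX_COLOR_ARGB;
//   attr.color     = green;
```

The RGB bytes are left untouched, so a color that already has alpha FF passes through unchanged.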


Is there a way to find the duration of an NVTX range? I have a function that contains a mix of CPU and GPU activity. Nsight Systems would give me the runtime of just the kernels, but I was wondering whether there is any functionality in the NVTX API that would let me get the duration of the NVTX range around this function.

Hi, Nsight Systems displays NVTX ranges. You might need to expand some rows to see them. In addition, you can get some statistics for NVTX ranges with --stats. NVTX does not provide an API to query the runtime of a range that has already ended. Hope this helps, Jiri
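Since NVTX itself cannot report a range's duration, one possible workaround is to take host timestamps at the same points where the range is pushed and popped. The following is a rough sketch under stated assumptions, not an NVTX feature: `timed_range_ms` and `HAVE_NVTX` are hypothetical names, and the stub functions exist only so the sketch builds on machines without the NVTX headers. `nvtxRangePushA` and `nvtxRangePop` are the real NVTX calls.

```cpp
#include <chrono>
#include <utility>

// Use the header-only NVTX v3 API when it is available.
#if defined(__has_include)
#  if __has_include(<nvtx3/nvToolsExt.h>)
#    include <nvtx3/nvToolsExt.h>
#    define HAVE_NVTX 1
#  endif
#endif
#ifndef HAVE_NVTX
// Fallback stubs (assumption: no NVTX headers installed), so the sketch
// still compiles; they do nothing and no range appears in a profiler.
static int nvtxRangePushA(const char*) { return 0; }
static int nvtxRangePop(void) { return 0; }
#endif

// Hypothetical helper: opens an NVTX range, runs fn, closes the range,
// and returns the elapsed host wall-clock time in milliseconds.
// Note: kernel launches are asynchronous. If fn launches GPU work, it
// should synchronize (for example with cudaDeviceSynchronize()) before
// returning, otherwise the GPU time is not included in the measurement.
template <typename Fn>
double timed_range_ms(const char* name, Fn&& fn) {
    nvtxRangePushA(name);
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();
    const auto stop = std::chrono::steady_clock::now();
    nvtxRangePop();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A call such as `double ms = timed_range_ms("my_function", [] { /* CPU and GPU work */ });` would then show the range in the Nsight Systems timeline and also hand the measured duration back to the application. On Linux, NVTX v3 may additionally require linking with `-ldl`.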