I’m a little bit lost. I have used CUPTI many times, but a contact suggested that I try PerfWorks. I didn’t know PerfWorks, and when I visited its webpage it seemed to be outdated:
PerfWorks has now been replaced with an updated C++ API and a new name - NVIDIA® Nsight™ Perf SDK
Then I visited the NVIDIA Nsight Perf SDK webpage and downloaded the package. I noticed that some of its include files (all the nvperf_* headers) are also present in the CUPTI include folder, and so are some of the libraries.
Then I visited the CUPTI documentation and found PerfWorks information there! But it is not in the module category, so there is no explanation of how to use it.
But I saw that section 2.7, CUPTI Profiling API, is very similar to the Range Profiler that the NVIDIA Nsight Perf SDK Getting Started Guide refers to…
So, is it all the same thing under different names? Which documentation should I tackle first? I wanted to test the PerfWorks metrics because CUPTI Metrics seemed to be obsolete and added a lot of overhead.
(Moved this post to the CUDA Profiler Tools Interface (CUPTI) category)
Nvidia Nsight PerfSDK supports graphics APIs (i.e. DirectX, Vulkan, OpenGL), allowing collection of GPU performance metrics at graphics device, context and queue levels.
CUPTI Profiling API supports profiling of CUDA kernels and it allows collection of GPU performance metrics for a particular kernel or range of kernels at the CUDA context level.
Both Nvidia Nsight PerfSDK and CUPTI Profiling API share the host APIs (i.e. metrics enumeration, configuration and evaluation) but differ in which GPU APIs they target on the device.
I suppose you are asking about the difference in overhead between Nsight PerfSDK and the CUPTI Profiling API. The CUPTI Profiling API should not add any significant additional overhead compared to Nsight PerfSDK.