PerfWorks documentation

Hello,

I’m trying to add PerfWorks as a backend to my measurement application (GitHub - RRZE-HPC/likwid: Performance monitoring and benchmarking suite), but where is the documentation? CUPTI has good online documentation, but the part I’m interested in (Events & Metrics) is no longer supported on Turing chips. So I have to use PerfWorks, but there is no documentation for it. The samples in the CUPTI folders are not sufficient.

The small snippet:
NVPW_Device_GetNames_Params getNamesParams = {NVPW_Device_GetNames_Params_STRUCT_SIZE};
getNamesParams.deviceIndex = 0;
NVPA_Status err = NVPW_Device_GetNames(&getNamesParams);
It always returns NVPA_STATUS_INVALID_ARGUMENT, but without documentation it’s impossible to tell why.

Hi Thomas,
We have not made PerfWorks publicly available at this time. Some NDA customers have a legacy version (no Turing support), and documentation is available with that version. The updated PerfWorks is the backend to other Nsight tools with support for Turing, but it is not available standalone as an SDK at this time. We plan to update the API and make it available to NDA customers as we migrate to new features and architectures over the next year, and to public customers soon after that.

Thanks,

Scott

Hi,

How is one supposed to measure anything on Turing, given that CUPTI (Events & Metrics) is no longer supported for this chip generation? The CUDA 10.1 package even contains samples, headers and libraries for the PerfWorks API.

There is a bit of documentation at https://docs.nvidia.com/cupti/Cupti/r_main.html#r_host_metrics_api

At the bottom of this page:
“The CUPTI event APIs from the header cupti_events.h and metric APIs from the header cupti_metrics.h will be deprecated in a future CUDA release. The NVIDIA Volta platform is the last architecture on which these APIs are supported. These are being replaced by a new set of Profiling API in the header cupti_profiler_target.h and Perfworks metrics API in the headers nvperf_host.h and nvperf_target.h. These provide low and deterministic profiling overhead on the target system.”

So for Volta and Turing I should use the CUPTI Profiling API (CUPTI :: CUPTI Documentation)? But I cannot measure any events or metrics because they are no longer supported by CC 7.5?

Yes, the public API for profiling on Turing is the CUPTI Profiling API. The former CUPTI events and metrics are superseded by the new metric names listed in this table: CUPTI :: CUPTI Documentation

Thomas, were you able to collect metrics on Turing using the CUPTI Profiling APIs?
Let us know if you faced any issues.

It’s more a question of whether to put my resources into that task. I don’t want to put effort into adding the CUPTI Profiling API only to have it change again (pure CUPTI → CUPTI Profiling API → PerfWorks).

We understand your concern. The new CUPTI Profiling APIs will be supported for Volta and later GPU architectures. This change was required because we have migrated CUPTI to the new PerfWorks-based backend for GPU metric collection.