Thank you for the explanation of pc-sampling-period; that helps.
Additionally, if I want to obtain GPU event counts for events such as inst_issued0, inst_executed, and local_load every ‘n’ milliseconds while a CUDA program (say ./transpose) is running, how can I do that?
nvprof doesn’t support sampling of events and metrics, except for NVLink metrics. The command-line option “--event-collection-mode” can be set to “continuous” to enable sampling of NVLink metrics. See the profiler documentation (Profiler :: CUDA Toolkit Documentation) for more details.
On the other hand, the CUPTI APIs support continuous mode for a larger set of events and metrics. The CUPTI sample event_sampling shows how to use the event APIs to sample events from a separate host thread. Useful links:
Overview: CUPTI :: CUPTI Documentation
Samples: CUPTI :: CUPTI Documentation
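
As a rough illustration of the separate-host-thread pattern mentioned above, here is a minimal C++ sketch. It is not the event_sampling sample itself and it makes no CUPTI calls: the helper names sample_periodically and run_with_sampling are hypothetical, and read_counter is a stand-in for whatever reads the event value. In a real CUPTI program, my understanding is that the read would be a call such as cuptiEventGroupReadEvent on an event group enabled in continuous collection mode, so please check the sample and the CUPTI documentation for the exact setup.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: reads the counter once per `period` until `done`
// is set, and returns every value it read. At least one sample is taken.
// `read_counter` is an assumption standing in for the real event read.
std::vector<uint64_t> sample_periodically(
    const std::function<uint64_t()>& read_counter,
    std::chrono::milliseconds period,
    const std::atomic<bool>& done)
{
    assert(period.count() > 0);
    std::vector<uint64_t> samples;
    auto next = std::chrono::steady_clock::now();
    do {
        samples.push_back(read_counter());
        next += period;                       // fixed schedule, avoids drift
        std::this_thread::sleep_until(next);
    } while (!done.load());
    return samples;
}

// Hypothetical helper: runs `workload` on the calling thread while a second
// host thread samples the counter every `period`.
std::vector<uint64_t> run_with_sampling(
    const std::function<void()>& workload,
    const std::function<uint64_t()>& read_counter,
    std::chrono::milliseconds period)
{
    std::atomic<bool> done{false};
    std::vector<uint64_t> samples;
    std::thread sampler([&] {
        samples = sample_periodically(read_counter, period, done);
    });
    workload();        // in a CUDA program: launch kernels, then synchronize
    done.store(true);  // tell the sampler to stop after its current sleep
    sampler.join();    // `samples` is safe to read only after the join
    return samples;
}
```

Here the period plays the role of the ‘n’ milliseconds in the question: run_with_sampling(workload, read_counter, std::chrono::milliseconds(n)) would return one value per interval for as long as the workload runs.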