Question about OpenCL Profiling

thanh_tuan · April 19, 2011, 12:01pm

Hi,
I’m trying to get profiling information for my OpenCL kernels. I set OPENCL_PROFILE=1, created a config.txt file listing the performance counters I want to see, and set OPENCL_PROFILE_CONFIG=config.txt.
I also set the build option “-cl-nv-verbose”.
But only a few items are profiled, such as occupancy, timestamp, and divergent_branch.
The following are not profiled. Could anyone tell me how to make them work in OpenCL without using the Visual Profiler? I only want to see the results in text files.

NV_Warning: Ignoring the invalid profiler config option: regperworkitem
NV_Warning: Ignoring the invalid profiler config option: workgroupsize
NV_Warning: Ignoring the invalid profiler config option: regperworkitem
NV_Warning: Can’t monitor multi bus-width signal branch in this run
NV_Warning: Signal branch can not be profiled in this run.
NV_Warning: Signal gld_request can not be profiled in this run.
NV_Warning: Signal gst_request can not be profiled in this run.
NV_Warning: Ignoring the invalid profiler config option: instructions
NV_Warning: Ignoring the invalid profiler config option: warp_serialize

philipjfry · April 19, 2011, 12:41pm

Where did you get your keywords from? Unfortunately, the column titles in the Visual Profiler differ from the keywords in the configuration file, and even the documentation in /usr/local/cuda/doc/Compute_Profiler.txt has at least two errors. If unsure, you can use the Visual Profiler to export a CSV file of a run and look at the column titles there; those match the configuration file keywords.

Moreover, you cannot measure arbitrary combinations of events at the same time, so you will usually have to run your application several times, measuring a limited set of events in each run (that’s one of the reasons the Visual Profiler performs up to 12 runs!). The documentation has some comments on which events can be combined, but for compute capability 2.x it is especially vague (“The number of counters that can be profiled in a single run depends on the specific counters selected on GPUs with Compute Capability 2.0 or higher.”).
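As a rough illustration of the multi-run approach, here is a minimal Python sketch that splits a list of counter keywords into small groups and writes one config file per group. The group size, the file names `config_run0.txt` etc., and the one-keyword-per-line layout are assumptions for illustration; only the `OPENCL_PROFILE` and `OPENCL_PROFILE_CONFIG` variables come from this thread. Check the profiler documentation for the real limits on your card.

```python
import os
from pathlib import Path


def split_counters(counters, per_run):
    """Split a counter list into consecutive groups of at most per_run names."""
    if per_run < 1:
        raise ValueError("per_run must be at least 1")
    return [counters[i:i + per_run] for i in range(0, len(counters), per_run)]


def write_configs(counters, per_run, out_dir="."):
    """Write one config file per group (one keyword per line, an assumed layout).

    Returns the list of written paths, in run order.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for n, group in enumerate(split_counters(counters, per_run)):
        path = out / f"config_run{n}.txt"  # hypothetical file name
        path.write_text("\n".join(group) + "\n")
        paths.append(path)
    return paths


def profile_env(config_path):
    """Copy the current environment and enable profiling for one run."""
    env = dict(os.environ)
    env["OPENCL_PROFILE"] = "1"
    env["OPENCL_PROFILE_CONFIG"] = str(config_path)
    return env
```

Each run could then be launched with something like `subprocess.run(["./my_app"], env=profile_env(path))`, where `./my_app` is a placeholder for your own binary. The profiler output of each run should be saved or renamed before the next run starts, in case a later run overwrites it.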

Finally, not all events are supported on all cards (that is, on all compute capabilities; see /usr/local/cuda/computeprof/doc/computeprof.html).
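To see which of these cases applies to each keyword, one option is to sort the warnings by type. The sketch below is only an assumption-laden helper: it matches the two message wordings quoted in the first post, and the reading of "invalid option" as a wrong or unsupported keyword and "can not be profiled in this run" as a combination conflict is my interpretation, not something the profiler states.

```python
import re

# Patterns copied from the warning text quoted in this thread.
INVALID = re.compile(r"Ignoring the invalid profiler config option: (\S+)")
NOT_THIS_RUN = re.compile(r"Signal (\S+) can ?not be profiled in this run")


def classify_warnings(log_text):
    """Return (invalid, deferred) keyword lists, deduplicated, in first-seen order.

    invalid:  keywords the profiler rejected outright (check the spelling
              or whether the card supports them).
    deferred: signals rejected only for this run (candidates for a
              separate run with fewer counters).
    """
    invalid, deferred = [], []
    for line in log_text.splitlines():
        match = INVALID.search(line)
        if match:
            if match.group(1) not in invalid:
                invalid.append(match.group(1))
            continue
        match = NOT_THIS_RUN.search(line)
        if match and match.group(1) not in deferred:
            deferred.append(match.group(1))
    return invalid, deferred
```

Keywords in the first list are worth checking against the CSV column titles; signals in the second list are worth retrying in a run of their own.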

Regards,
Markus

thanh_tuan · April 26, 2011, 1:14pm

Hi Philip,
Thanks for your reply. That really helps.

The Visual Profiler can calculate more meaningful statistics from the raw performance counters, such as active warps and L1 cache misses. Are there documents that explain how to turn the raw counter values into this kind of high-level information?

By the way, are there real performance counters in the GPU? Or is it just a simulation technique from Nvidia?

Thanks,
Tuan

philipjfry · April 26, 2011, 1:52pm

Search for “Supported derived statistics” in “/usr/local/cuda/computeprof/doc/computeprof.html”. Most of the derived metrics are explained there.

thanh_tuan · April 27, 2011, 6:23am

Thanks a lot,
Tuan

Melinda23 · August 24, 2011, 9:23am

Hello)

I am trying to optimize my OpenCL kernels, and all I have right now is the NVIDIA Visual Profiler, which seems rather limited. I would like to see a line-by-line profile of my kernels to better understand issues with coalescing, etc. Is there a way to get more thorough profiling data than what the Visual Profiler provides?
