Register count for an OpenCL kernel

I’m trying to use NVIDIA’s OpenCL profiler from the command line, but I can’t seem to get the register count.

I use

and my config file contains regperworkitem.

When I look at the *.log files, however, I get this:
“NV_Warning: Ignoring the invalid profiler config option: regperworkitem”
So I only get the values for the default counters (method, gputime, cputime, occupancy).

Does anyone know what I’m doing wrong?


Ciao cconti,

You are right.
It seems that version 3.0 of the NVIDIA OpenCL profiler does not accept “regperworkitem” as a config option.

Fix: put “regperthread” in the config file instead of “regperworkitem”. The profiler output will then show a field named “regperworkitem”:


OPENCL_DEVICE 0 Tesla T10 Processor

TIMESTAMPFACTOR fffff711e90e7498

timestamp=[ 2125626.000 ] method=[ memcpyHtoAasync ] gputime=[ 373.344 ] cputime=[ 1238.000 ] streamid=[ 1 ]
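
If you want to pull the register count out of the log automatically, here is a minimal Python sketch. It assumes every record uses the `name=[ value ]` layout of the line above and that kernel records carry an integer `regperworkitem` field. The line shown above is a memory copy, so it has no register field, and the kernel line in the usage note below is made up for illustration. The helper names `parse_profiler_line` and `register_counts` are my own, not part of any NVIDIA tool.

```python
import re

# Matches "name=[ value ]" pairs as they appear in the profiler log lines.
FIELD_RE = re.compile(r"(\w+)=\[\s*(.*?)\s*\]")


def parse_profiler_line(line):
    """Return a dict mapping field name to its string value for one log line."""
    return dict(FIELD_RE.findall(line))


def register_counts(lines):
    """Yield (method, register count) for every line that has a regperworkitem field.

    Assumption: the field value is a plain integer.
    """
    for line in lines:
        fields = parse_profiler_line(line)
        if "regperworkitem" in fields:
            yield fields.get("method"), int(fields["regperworkitem"])


# The memcpy record from the log above: parsed fine, but no register field.
sample = ("timestamp=[ 2125626.000 ] method=[ memcpyHtoAasync ] "
          "gputime=[ 373.344 ] cputime=[ 1238.000 ] streamid=[ 1 ]")
fields = parse_profiler_line(sample)
```

With a hypothetical kernel record such as `method=[ myKernel ] gputime=[ 10.0 ] regperworkitem=[ 12 ]`, `list(register_counts([...]))` would give `[("myKernel", 12)]`; memcpy records are simply skipped.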

In my opinion, this is a bug.


Wouldn’t it be great if OpenCL used CUDA terminology… I find it confusing when reading CUDA and OpenCL stuff as I always forget which terms mean the same thing.

Did you log this in the NVDeveloper bug report system?