register count for an OpenCL kernel

Hi,
I’m trying to use NVIDIA’s OpenCL profiler from the command line, but I can’t get it to report the register count.

I set
export OPENCL_PROFILE_CONFIG=config

and the file named config contains regperworkitem

However, when I look at the *.log files I get this:
“NV_Warning: Ignoring the invalid profiler config option: regperworkitem”
so I only get values for the default counters (method, gputime, cputime, occupancy).

Does anyone know what I’m doing wrong?

Thanks

Ciao cconti,

You are right.
It seems that version 3.0 of the NVIDIA OpenCL profiler does not accept “regperworkitem” as a config option.

Fix: put “regperthread” in the config file instead of “regperworkitem”.
The profiler output will then contain a field named “regperworkitem”:

OPENCL_PROFILE_LOG_VERSION 2.0

OPENCL_DEVICE 0 Tesla T10 Processor

TIMESTAMPFACTOR fffff711e90e7498

timestamp,method,gputime,cputime,regperworkitem,occupancy,streamid
timestamp=[ 2125626.000 ] method=[ memcpyHtoAasync ] gputime=[ 373.344 ] cputime=[ 1238.000 ] streamid=[ 1 ]

In my opinion, this is a bug.
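
If you want to pull the register count out of the log afterwards, here is a minimal Python sketch. It assumes every data row uses the key=[ value ] layout shown in the log above and that regperworkitem only appears on kernel rows (the memcpy row above has none). The function names and the my_kernel row are made up for illustration; they are not real profiler output.

```python
import re

# Matches one "key=[ value ]" pair as it appears in the profiler log rows.
FIELD_RE = re.compile(r"(\w+)=\[\s*(.*?)\s*\]")


def parse_profile_line(line):
    """Return a dict of the key=[ value ] pairs found on one log line.

    Header lines (no key=[ value ] pairs) give an empty dict.
    """
    return {key: value for key, value in FIELD_RE.findall(line)}


def register_counts(lines):
    """Yield (method, registers) for rows that report regperworkitem."""
    for line in lines:
        fields = parse_profile_line(line)
        if "regperworkitem" in fields:
            yield fields.get("method", "?"), int(fields["regperworkitem"])


sample = [
    "timestamp,method,gputime,cputime,regperworkitem,occupancy,streamid",
    # Row copied from the log above: a memcpy, so no register count.
    "timestamp=[ 2125626.000 ] method=[ memcpyHtoAasync ] gputime=[ 373.344 ] "
    "cputime=[ 1238.000 ] streamid=[ 1 ]",
    # Hypothetical kernel row, invented for this example only.
    "timestamp=[ 2125700.000 ] method=[ my_kernel ] gputime=[ 12.000 ] "
    "cputime=[ 30.000 ] regperworkitem=[ 16 ] occupancy=[ 0.500 ] streamid=[ 1 ]",
]

for method, regs in register_counts(sample):
    print(method, regs)
```

To run it on a real log, pass the open file instead of the sample list, e.g. register_counts(open(path)) with path pointing at one of your *.log files.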

diegor

Wouldn’t it be great if OpenCL used CUDA terminology… I find it confusing when reading CUDA and OpenCL stuff as I always forget which terms mean the same thing.

Did you log this in the NVDeveloper bug report system?