Profiling OpenCL with the Visual Profiler

What are people’s experiences with using the visual profiler on OpenCL code? Does it work as expected? I’m using version 4.0.10.

I’m getting profiling information out, but I’m suspicious of it.

  1. At the end of a series of profiling runs I get the message “Profiler data file ‘temp_compute_profiler_0_1.csv’ does not contain profiler output.” I think I’m releasing all resources.
  2. I have a kernel that almost certainly has coalesced memory transfers, but the profiler says that the memory accesses are not coalesced.

It is quite possible, maybe even likely, that I’m wrong about my assumptions in 1) and 2), but I would be much more comfortable knowing whether others see the same behavior or whether everything works perfectly for them.

Thanks much.


I haven’t tried it myself, but I believe it should work. I’m not sure about 1), but I have seen this error message with CUDA as well. For 2), it is possible that the uncoalesced accesses are coming from register spills.

Thanks for the reply. How would I determine if there are register spills?

Look at the number of registers used; if it is maxed out (63 on Fermi), you are likely spilling. You can see this in the occupancy analysis tab.
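For a sense of why a maxed-out register count shows up in the occupancy analysis, here is a rough sketch in C. It assumes the Fermi (compute capability 2.x) limits of 32768 registers and 48 resident warps per multiprocessor. The function name is made up for illustration, and the calculation ignores register allocation granularity, work-group size, and shared memory limits, so the profiler's own figure may be lower.

```c
#include <assert.h>

/* Assumed Fermi (compute capability 2.x) per-multiprocessor limits. */
#define REGS_PER_SM       32768  /* 32-bit registers per multiprocessor */
#define MAX_WARPS_PER_SM  48     /* 1536 resident threads / 32 */
#define WARP_SIZE         32

/* Hypothetical helper: upper bound on resident warps per multiprocessor
   imposed by register usage alone. Allocation granularity and all other
   occupancy limits are deliberately ignored. */
static int max_warps_by_registers(int regs_per_thread)
{
    assert(regs_per_thread > 0);
    int warps = REGS_PER_SM / (regs_per_thread * WARP_SIZE);
    return warps > MAX_WARPS_PER_SM ? MAX_WARPS_PER_SM : warps;
}
```

At 63 registers per work-item this gives 16 warps out of a possible 48, while at 20 registers per work-item the register file is no longer the limiting factor.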

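You will also want to look at the local memory counters (look for counters that have “local” in the name). They will tell you whether there is any local memory traffic due to automatic spilling. Note that “local” here means CUDA’s per-thread local memory, not OpenCL’s __local memory.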

Finally, another trick is to add code like this


//some code

around some of the accesses and then profile and see if the uncoalesced memory accesses go away.
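As a cross-check on the profiler's counters, the number of memory segments a warp should touch for a given indexing pattern can be worked out by hand. Below is a minimal sketch in C, assuming 32 work-items per warp, 128-byte segments (a Fermi L1 cache line), and a buffer whose base address is 128-byte aligned. The function name and parameters are hypothetical and not part of any profiler or OpenCL API.

```c
#include <assert.h>
#include <stddef.h>

#define WARP_SIZE      32    /* work-items issued together on NVIDIA hardware */
#define SEGMENT_BYTES  128   /* assumed transaction size (Fermi L1 cache line) */

/* Hypothetical helper: count the distinct 128-byte segments touched when each
   of the 32 work-items in a warp reads one element of elem_bytes at index
   base_index + lane * stride. Assumes the buffer starts on a segment boundary. */
static int segments_touched(size_t base_index, size_t stride, size_t elem_bytes)
{
    size_t seen[WARP_SIZE];
    int count = 0;

    /* Elements must not straddle a segment boundary. */
    assert(elem_bytes > 0 && SEGMENT_BYTES % elem_bytes == 0);

    for (int lane = 0; lane < WARP_SIZE; ++lane) {
        size_t seg = (base_index + lane * stride) * elem_bytes / SEGMENT_BYTES;
        int found = 0;
        for (int i = 0; i < count; ++i) {
            if (seen[i] == seg) { found = 1; break; }
        }
        if (!found)
            seen[count++] = seg;
    }
    return count;
}
```

For 4-byte elements, stride 1 from an aligned start touches a single segment, while stride 32 touches 32 segments, which is the kind of pattern one would expect to be reported as uncoalesced. If the kernel's own indexing comes out at one segment per warp, spills become the more likely suspect.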