CUDA_PROFILE=1 only logs some of the counters

Hi all,

I’m setting CUDA_PROFILE=1 to use the CUDA profiler from the command line. The only counters logged for my kernel are gpuTime, cpuTime and occupancy.

Do I need to do anything else to get a full log?

Thanks much.

You need to set COMPUTE_PROFILE_CONFIG to the name of a profiler configuration file that enables the counters as desired. Check the profiler documentation in the doc directory for details.
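
As a rough illustration of that setup, here is a minimal Python sketch that writes a configuration file and builds the environment for a profiled run. It assumes the configuration file simply lists one counter name per line, and it uses the two texture counter names mentioned later in this thread. The file name profile_config.txt and the binary ./my_app are hypothetical placeholders. Check the profiler documentation for the exact file format and for how many counters can be collected in one run.

```python
import os
import tempfile

# Counter names taken from later in this thread (Tesla C2050).
COUNTERS = ["tex0_cache_sector_queries", "tex0_cache_sector_misses"]

def write_profile_config(path, counters):
    """Write one counter name per line (assumed config file layout)."""
    with open(path, "w") as f:
        for name in counters:
            f.write(name + "\n")

def profiler_env(config_path, base_env=None):
    """Return a copy of the environment with the profiler variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_PROFILE"] = "1"                    # enable the command-line profiler
    env["COMPUTE_PROFILE_CONFIG"] = config_path  # point it at the counter list
    return env

# profile_config.txt is a hypothetical file name.
config_path = os.path.join(tempfile.gettempdir(), "profile_config.txt")
write_profile_config(config_path, COUNTERS)
env = profiler_env(config_path)

# To launch the application under the profiler (./my_app is hypothetical):
#   import subprocess
#   subprocess.run(["./my_app"], env=env)
```

Building the environment as a separate dictionary keeps the profiler settings out of the parent shell, so only the launched process is profiled.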

@tera
What is the counter for the texture-cache hit/miss rate? I can’t find it in the documentation.

There is no counter for texture-cache hit / miss rate.

What you can do is profile the texture cache requests and the texture cache misses and calculate the hit / miss rate from them.

For requests, the counter name is tex0_cache_sector_queries.

For misses, the counter name is tex0_cache_sector_misses.

Note that the above counters are for the Tesla C2050 (compute capability 2.0). For compute capability 2.1 devices you also need tex1_cache_sector_queries and tex1_cache_sector_misses.

The hit rate is then (texture_requests - texture_misses) / texture_requests.
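
The same arithmetic as a small Python sketch. The counter values below are made up for illustration, not profiler output. Summing the tex0 and tex1 counters for compute capability 2.1 is an assumption about how the two sets of counters combine, so verify it against the profiler documentation.

```python
def texture_hit_rate(queries, misses):
    """Return (queries - misses) / queries, or None if there were no queries."""
    if queries == 0:
        return None
    return (queries - misses) / queries

# Hypothetical counter values read from a profiler log.
counters = {
    "tex0_cache_sector_queries": 1000,
    "tex0_cache_sector_misses": 250,
}

hit_rate = texture_hit_rate(
    counters["tex0_cache_sector_queries"],
    counters["tex0_cache_sector_misses"],
)
print(hit_rate)  # 0.75

def combined_hit_rate(c):
    """Assumed combination for compute capability 2.1: sum tex0 and tex1."""
    queries = c["tex0_cache_sector_queries"] + c.get("tex1_cache_sector_queries", 0)
    misses = c["tex0_cache_sector_misses"] + c.get("tex1_cache_sector_misses", 0)
    return texture_hit_rate(queries, misses)
```

The miss rate is 1 minus the hit rate, or equivalently misses / queries.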

This information is in the Compute Visual Profiler Document (Page 59 onwards)