Hi all,
I’m setting CUDA_PROFILE=1 to use the CUDA profiler from the command line. The only counters logged for my kernel are gpuTime, cpuTime and occupancy.
Do I need to do anything else to get a full log?
Thanks much.
You need to set COMPUTE_PROFILE_CONFIG to the name of a profiler configuration file that enables the counters as desired. Check the profiler documentation in the doc directory for details.
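As a rough sketch of that setup, the Python helper below writes a configuration file and builds the environment for a profiled run. It assumes the configuration file is plain text with one counter name per line, which should be checked against the profiler documentation; the function names, the file name profile_config.txt and the binary ./my_app are hypothetical.

```python
import os


def write_profile_config(path, counters):
    """Write one counter name per line (assumed config file format)."""
    with open(path, "w") as f:
        for name in counters:
            f.write(name + "\n")


def profiler_env(config_path, base_env=None):
    """Return a copy of the environment with the profiler variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_PROFILE"] = "1"                      # turn the profiler on
    env["COMPUTE_PROFILE_CONFIG"] = config_path    # point it at the config file
    return env


# Example usage (./my_app is a hypothetical CUDA binary):
#   import subprocess
#   write_profile_config("profile_config.txt",
#                        ["tex0_cache_sector_queries", "tex0_cache_sector_misses"])
#   subprocess.run(["./my_app"], env=profiler_env("profile_config.txt"))
```

Building the environment as a copy keeps the profiler settings local to the profiled process instead of changing the calling shell or script.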
There is no counter for the texture cache hit/miss rate.
What you can do is profile the texture cache requests and the texture cache misses, then calculate the hit/miss rate from them.
For requests, the counter name is tex0_cache_sector_queries.
For misses, the counter name is tex0_cache_sector_misses.
Note that these counters are for the Tesla C2050 (compute capability 2.0). For compute capability 2.1 you also need tex1_cache_sector_queries and tex1_cache_sector_misses.
The hit rate is (texture_requests - texture_misses) / texture_requests.
This information is in the Compute Visual Profiler document (page 59 onwards).
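The hit rate calculation can be sketched in a few lines of Python. The function name and the sample counter values are made up for illustration. Summing the tex0 and tex1 counters on compute capability 2.1 is an assumption about how the two sets combine, so treat that part with care.

```python
def texture_hit_rate(queries, misses):
    """Hit rate = (requests - misses) / requests, or None if nothing was requested."""
    if queries == 0:
        return None
    return (queries - misses) / queries


# Compute capability 2.0: only the tex0 counters (made-up sample values).
rate_20 = texture_hit_rate(queries=1000, misses=250)

# Compute capability 2.1: tex0 and tex1 counters, summed here as an assumption.
tex0_queries, tex0_misses = 1000, 250
tex1_queries, tex1_misses = 1000, 150
rate_21 = texture_hit_rate(tex0_queries + tex1_queries,
                           tex0_misses + tex1_misses)

print(rate_20, rate_21)
```

The miss rate is one minus the returned value.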