Can someone help me understand the profiling metric "Compute Utilization"?

MutantJohn · May 3, 2015, 6:26pm

So, what does the stat “Compute Utilization” mean? What’s considered a “good” compute utilization? Or rather, what value is considered performant?

I tried looking this up on my own and the gist of what I got is that it’s basically just the ratio of the total number of instructions of the process vs the total number of cycles the process took. Is this accurate?

MutantJohn · May 4, 2015, 4:10am

Is your post mistakenly empty?

Robert_Crovella · May 4, 2015, 4:33am

I wrote something, and decided it didn’t make sense, so I deleted it. I don’t know how to delete a posting entirely.

Seeky · May 4, 2015, 5:44am

If you mean the cycles during which actual work was done by the process, vs the maximum number of cycles available during the execution time, I agree.
Counting instructions might be a bit misleading, since it would mean you are not able to achieve 100% compute utilization with multi-cycle instructions.
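
To make the difference between the two readings concrete, here is a minimal sketch in plain C++. It is host-side arithmetic only, not a profiler API, and the function names and example numbers are made up for illustration.

```cpp
#include <cstdint>

// Reading 1: cycles in which work was done / all cycles in the time window.
double busy_cycle_ratio(std::uint64_t busy_cycles, std::uint64_t total_cycles) {
    if (total_cycles == 0) return 0.0;
    return static_cast<double>(busy_cycles) / static_cast<double>(total_cycles);
}

// Reading 2: instructions completed / all cycles in the time window.
double instructions_per_cycle(std::uint64_t instructions, std::uint64_t total_cycles) {
    if (total_cycles == 0) return 0.0;
    return static_cast<double>(instructions) / static_cast<double>(total_cycles);
}
```

With 100 back-to-back instructions that each keep the unit busy for 4 cycles (a hypothetical latency), `busy_cycle_ratio(400, 400)` is 1.0 while `instructions_per_cycle(100, 400)` is 0.25, so the instruction-based reading could never report 100% for that workload.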

I think, as so often, the answer to your question is: It depends.

If you have no divergent branches the compute utilization should converge to 100%.
In a kernel with one divergent branch the theoretical compute utilization will be 50%, if I remember correctly. The profiler assumes each side is taken 50% of the time. In reality you will sometimes see a higher compute utilization there. This is because real data often has similar values in consecutive memory regions, so full warps end up taking the same branch.
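
A toy model of where the 50% figure could come from, written as plain C++ rather than a real kernel. It assumes both sides of a single if/else cost the same, and that a warp whose threads disagree runs both sides one after the other with part of its lanes masked off. `lane_efficiency` and `kWarpSize` are names invented for this sketch, and it does not claim to reproduce what the profiler actually measures.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

constexpr std::size_t kWarpSize = 32;

// takes_branch[i] is true if thread i takes the "if" side.
// Returns useful lane-slots / issued lane-slots over all warps.
double lane_efficiency(const std::vector<bool>& takes_branch) {
    std::size_t useful_slots = 0;  // lanes that actually did work
    std::size_t issued_slots = 0;  // lanes reserved by every issued pass
    for (std::size_t base = 0; base < takes_branch.size(); base += kWarpSize) {
        const std::size_t end = std::min(base + kWarpSize, takes_branch.size());
        std::size_t taken = 0;
        for (std::size_t i = base; i < end; ++i) {
            if (takes_branch[i]) ++taken;
        }
        const std::size_t lanes = end - base;
        // A warp diverges only if some, but not all, of its threads take the branch.
        const bool divergent = (taken != 0) && (taken != lanes);
        const std::size_t passes = divergent ? 2 : 1;
        useful_slots += lanes;               // each thread runs exactly one side
        issued_slots += passes * kWarpSize;  // each pass occupies the whole warp
    }
    if (issued_slots == 0) return 0.0;
    return static_cast<double>(useful_slots) / static_cast<double>(issued_slots);
}
```

In this model, 64 threads that alternate between the two sides give 0.5, while the same 64 threads with the first 32 on one side and the last 32 on the other give 1.0. That is the clustering effect described above: the branch is still there, but no single warp has to execute both sides.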

The major problem with saying what a good compute utilization is, is that it most likely depends on the type of kernel you are building and the amount of effort you are putting into it. Some applications might not allow you to create a kernel that processes efficiently on a GPU. Other applications can become faster if you optimize the processing flow of your data.

Additionally, if your task is very load/store intensive and you do little arithmetic with the data you read, your compute utilization won’t look very good either. If there is nothing else you need to do with the data, that is okay; it’s just a limit you will need to accept.

I would suggest critically reviewing the compute utilization percentage the profiler shows you, and always comparing it to what you would expect from your specific application.

MutantJohn · May 4, 2015, 3:53pm

That’s interesting.

I only ask because it’s roughly 90% for an application I’ve been working on.

To me, that’s amazing! I literally went from like 10% utilization to something not that far away from 100%.

I will always strive towards 100% but I think 90% is good enough (for now). I hope this means the code scales to other systems as well… It’d be neat to see the times on a better GPU.

Seeky · May 5, 2015, 5:43am

90% sounds pretty good to me. Just bear in mind that this doesn’t automatically mean that you wrote efficient code.

By the way, Nsight gave me the answer to your question today. Occupancy/Utilization is defined as follows: the ratio of the average active warps per active cycle to the maximum number of warps supported on a multiprocessor.
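
Spelled out as arithmetic, that definition could look like the sketch below. This is plain C++ with invented names, not an Nsight or CUPTI call; it assumes you already have the sum of active warps over all active cycles and the number of active cycles for one multiprocessor.

```cpp
#include <cstdint>

// (average active warps per active cycle) / (maximum warps per multiprocessor).
// active_warp_cycles: number of active warps, summed over every active cycle.
double achieved_occupancy(std::uint64_t active_warp_cycles,
                          std::uint64_t active_cycles,
                          unsigned max_warps_per_sm) {
    if (active_cycles == 0 || max_warps_per_sm == 0) return 0.0;
    const double avg_active_warps =
        static_cast<double>(active_warp_cycles) / static_cast<double>(active_cycles);
    return avg_active_warps / static_cast<double>(max_warps_per_sm);
}
```

With an assumed limit of 64 warps per multiprocessor, an average of 48 active warps over 1000 active cycles gives `achieved_occupancy(48000, 1000, 64)`, which is 0.75.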

MutantJohn · May 5, 2015, 3:35pm

Thanks!
