Memory SOL Throughput %

Hello,

I profiled two kernels with the same inputs on the same graphics card and got the results shown in the attached screenshots. The two quantities I am most concerned with are the Memory Throughput [%] with respect to SOL (Speed of Light) and the Memory Throughput [Gbyte/second]. In the first attachment I get 1.46% of SOL at 2.66 Gbyte/second, and in the second I get 0.61% of SOL at 5.75 Gbyte/second. My question is: shouldn't the program with the higher % of SOL also have the higher throughput in Gbyte/second?

My first thought was that the large number of passes the profiler makes over my application might give inconsistent results from pass to pass, so I profiled the same program with the same inputs again, but I saw the same discrepancy.

Any thoughts on what I might be getting wrong or misunderstanding?

Thanks!


Memory Throughput [%] is the maximum of many memory subsystem metrics. DRAM Throughput (dram__cycles_active.avg.pct_of_peak_sustained_elapsed) is one of those metrics, but in your case it is unlikely to be the highest value. Because the percentage reports whichever unit is busiest, it does not have to move together with a single bandwidth figure such as the Gbyte/second value. You can see the Memory Throughput breakdown by clicking the right arrow next to GPU Speed of Light Throughput.
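To make the "max of many metrics" point concrete, here is a minimal sketch in Python. The unit names and percentages are made up for illustration; they are not read from the screenshots in this thread, and the real breakdown contains more entries than the three shown here.

```python
# Illustrative only: hypothetical unit names and percentages,
# not values taken from the profiler reports in this thread.

def memory_sol_pct(breakdown):
    """Return (busiest unit, its % of peak) from a dict of unit -> % of peak."""
    unit = max(breakdown, key=breakdown.get)
    return unit, breakdown[unit]

# Kernel A: little DRAM traffic, but a cache unit is comparatively busy.
kernel_a = {"dram": 0.3, "l2": 0.8, "l1tex": 1.5}
# Kernel B: more DRAM traffic, caches comparatively idle.
kernel_b = {"dram": 0.6, "l2": 0.4, "l1tex": 0.2}

unit_a, pct_a = memory_sol_pct(kernel_a)
unit_b, pct_b = memory_sol_pct(kernel_b)

print(f"A: {pct_a}% (limited by {unit_a}), DRAM at {kernel_a['dram']}%")
print(f"B: {pct_b}% (limited by {unit_b}), DRAM at {kernel_b['dram']}%")
```

In this made-up case kernel A reports the higher overall percentage while kernel B moves more data through DRAM, which is the same shape as the two results in the question.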

The grid being analyzed fills only 1.93% of the SMs over the capture period and is less than 0.1 of a wave. It is very hard for the tool to provide useful information when the grid/range does not fill the GPU. A workload this small cannot saturate the memory system, so expect very low percentages.
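As a rough sketch of what "less than 0.1 of a wave" means, the snippet below divides the number of blocks in the grid by the number of blocks the whole GPU can hold at once. The SM count, the blocks-per-SM limit and the grid size are hypothetical placeholders, not the values from this report; the real per-SM limit depends on the kernel's register, shared memory and thread usage.

```python
def waves(grid_blocks, num_sms, max_blocks_per_sm):
    """Fraction of a full wave: blocks launched / blocks resident GPU-wide."""
    return grid_blocks / (num_sms * max_blocks_per_sm)

# Hypothetical numbers, chosen only to illustrate a very small launch.
grid_blocks = 2          # blocks in the launched grid (assumed)
num_sms = 108            # SMs on the device (assumed)
max_blocks_per_sm = 16   # occupancy limit per SM for this kernel (assumed)

w = waves(grid_blocks, num_sms, max_blocks_per_sm)
print(f"waves = {w:.4f}")
```

With a launch like this, most SMs never receive a block, so no memory unit gets close to its peak regardless of how the kernel itself is written.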
