How to calculate the theoretical memory bandwidth?

jimurk · August 8, 2019, 9:13am

Hi, everyone. In the NVIDIA Turing architecture whitepaper, the theoretical memory bandwidth is calculated as Memory interface * Memory Clock rate / 8 (GB/s). But in an older document, reduction.ppt in the CUDA samples, it is Memory interface * Memory Clock rate * 2 / 8 (GB/s). Which one should be used?

Thanks in advance!

Robert_Crovella · August 8, 2019, 1:29pm

It’s going to depend on how the memory clock rate is indicated/specified. Specifically, it’s going to depend on how the memory clock rate translates to transfers per second. There is not one single answer.

jimurk · August 9, 2019, 12:24am

Do you mean that the given memory clock rate is a theoretical peak value and the actual memory clock rate varies during data transfer? Since Memory interface * Memory Clock rate is doubled in reduction.ppt, do NVIDIA GPUs now support simultaneous bidirectional data transfer?

Robert_Crovella · August 9, 2019, 3:01am

No, I don’t mean any of those things. None of those things are true.

The peak theoretical memory bandwidth is: transfers/second * bits/transfer * 1 byte/8 bits

Note that memory clock rate doesn’t appear in the above formula.

Depending on what source you are using for memory clock rate, you will have to determine how that relates to transfers/second.
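As a minimal sketch of the transfers-per-second formula above, the Python snippet below computes the peak figure from a transfer rate and a bus width. The function name `peak_bandwidth_gbs` is made up for illustration, and the inputs (11 Gbps per pin, 352-bit interface) are the GTX 1080Ti figures quoted in this thread; 1 GB is assumed to mean 10^9 bytes.

```python
def peak_bandwidth_gbs(transfers_per_second: float, bus_width_bits: int) -> float:
    """Peak theoretical bandwidth in GB/s, assuming 1 GB = 1e9 bytes.

    transfers_per_second: transfers per second on each data pin
    bus_width_bits:       bits moved per transfer (memory interface width)
    """
    bytes_per_transfer = bus_width_bits / 8  # 1 byte / 8 bits
    return transfers_per_second * bytes_per_transfer / 1e9


# Figures quoted in this thread for the GTX 1080Ti:
# 11 Gbps data rate per pin, 352-bit memory interface.
gtx_1080ti_gbs = peak_bandwidth_gbs(transfers_per_second=11e9, bus_width_bits=352)
print(f"{gtx_1080ti_gbs:.0f} GB/s")
```

Note that no clock rate appears as a parameter; any clock value has to be converted to transfers per second before it is passed in.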

jimurk · August 9, 2019, 4:05am

But according to the NVIDIA Turing architecture whitepaper, Memory Bandwidth (GB/sec) is Memory Interface * Memory Clock (Data Rate) / 8 bits. Given that the Memory Interface is 352 bits and the Memory Clock (Data Rate) is 11 Gbps for the GTX 1080Ti, can the Memory Bandwidth of 484 GB/sec be treated as its theoretical memory bandwidth?

What do you mean by the source for the memory clock rate?

Robert_Crovella · August 9, 2019, 3:05pm

And if I responded to that, suggesting that is correct, you would immediately point out the example from the reduction paper, which you already pointed out.

I’m not going to go back and forth on this. Both are correct. There isn’t just one formula that applies to every case, for every situation. If you want to come up with one formula, I personally would do exactly what I already suggested. Create a formula that is based on transfers per second. Then for each situation you come across, see what relationship memory clock rate must have to transfers per second, in order to make that single equation correct.

The peak theoretical memory bandwidth of GTX 1080Ti is published:

https://arstechnica.com/gadgets/2017/03/nvidia-gtx-1080-ti-review/

You can decide for yourself what formula and memory clock rate you wish to use, to acknowledge the correctness of that published number.
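One way to carry out that check, sketched in Python under stated assumptions: start from a published bandwidth and interface width, derive the transfers per second, and see what multiplier a given clock figure needs. The helper name `transfers_per_clock` is hypothetical. The 484 GB/s and 352-bit values are the ones quoted in this thread, and the 5.5 GHz clock is an assumed example of a source that reports half the data rate.

```python
def transfers_per_clock(published_gbs: float, bus_width_bits: int, clock_hz: float) -> float:
    """Multiplier that a quoted clock needs in order to match a published bandwidth.

    Assumes 1 GB = 1e9 bytes. A result of 1 means the quoted "clock" is already
    the data rate; a result of 2 means it must be doubled, and so on.
    """
    transfers_per_second = published_gbs * 1e9 * 8 / bus_width_bits
    return transfers_per_second / clock_hz


# Source A quotes the data rate directly (11 Gbps): no extra factor needed.
factor_a = transfers_per_clock(published_gbs=484, bus_width_bits=352, clock_hz=11e9)

# Source B (assumed for illustration) quotes 5.5 GHz: a factor of 2 is needed.
factor_b = transfers_per_clock(published_gbs=484, bus_width_bits=352, clock_hz=5.5e9)

print(round(factor_a, 3), round(factor_b, 3))
```

Under these assumptions both the "/ 8" and the "* 2 / 8" formulas give the same bandwidth; they differ only in which clock figure they expect as input.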

I probably won’t be able to respond to further requests for clarification on this topic. The topic of what is the GPU memory bandwidth formula has been discussed in many places. If you want to read some of those, you may reach some conclusions that are useful for you.

jimurk · August 10, 2019, 1:00am

Hi, Robert_Crovella. Thanks for your patient answer. It is an interesting topic; I will read more about it, as you suggested.

tushar210610 · December 18, 2024, 7:02am

The *2 factor could be due to half-duplex vs. full-duplex bandwidth. Can you confirm whether the whitepapers mention anything about the bidirectional transfer rate?

Curefab · December 18, 2024, 12:20pm

Normally DRAM is not full-duplex. The difference more likely comes from DDR (double data rate) memory, which transfers data on both the rising and falling edges of the clock signal.

One open question is whether (with newer GPUs) the memory clock is an actual physical memory clock or just the transfer rate. See, for example, this description of GDDR6:

"It is a type of GDDR SDRAM (graphics DDR SDRAM), and is the successor to GDDR5. Just like GDDR5X it uses QDR (quad data rate) in reference to the write command clock (WCK) and ODR (Octal Data Rate) in reference to the command clock (CK)."

So there are different clock speeds at the same time. For easier product comparison or tuning (overclocking, etc.) Nvidia probably only talks about a single frequency.
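To illustrate how several clocks can coexist behind one data rate, here is a small Python sketch. The multiplier table follows the usual meaning of the names (single, double, quad, octal data rate). The 11 GT/s input is the GTX 1080Ti data rate quoted earlier in the thread, and the resulting clock values are plain arithmetic, not datasheet figures. The names `TRANSFERS_PER_CLOCK` and `implied_clock_hz` are hypothetical.

```python
# Transfers per clock cycle for each signalling scheme.
TRANSFERS_PER_CLOCK = {"SDR": 1, "DDR": 2, "QDR": 4, "ODR": 8}


def implied_clock_hz(data_rate_per_pin: float, signalling: str) -> float:
    """Clock frequency implied by a per-pin data rate and a signalling scheme."""
    return data_rate_per_pin / TRANSFERS_PER_CLOCK[signalling]


data_rate = 11e9  # 11 GT/s per pin, as quoted in this thread
for scheme in TRANSFERS_PER_CLOCK:
    clock_ghz = implied_clock_hz(data_rate, scheme) / 1e9
    print(f"{scheme}: {clock_ghz:.3f} GHz")
```

Which of these clocks a given specification sheet or API reports is exactly the ambiguity discussed above, so the multiplier has to be chosen to match the source of the clock figure.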
