According to the deviceQuery output, the memory clock rate for the 2080Ti is 7000 MHz:
CUDA Driver Version / Runtime Version 10.0 / 10.0
CUDA Capability Major/Minor version number: 7.5
Total amount of global memory: 10989 MBytes (11523260416 bytes)
(68) Multiprocessors, ( 64) CUDA Cores/MP: 4352 CUDA Cores
GPU Max Clock rate: 1545 MHz (1.54 GHz)
Memory Clock rate: 7000 MHz
Memory Bus Width: 352-bit
However, on websites such as https://www.techpowerup.com/gpu-specs/geforce-gtx-1080-ti.c2877 the memory clock rate is listed as
Memory Clock
1376 MHz
11008 MHz effective
May I know how deviceQuery calculates that rate (or fetches from the device)?
(note that you are comparing two different GPUs here)
7000 MHz (for the 2080Ti) is the equivalent double-pumped (DDR) rate; double-pumped means two bit transfers happen per lane/wire, per clock.
11008 MHz (for the 1080Ti) is the equivalent single-pumped (SDR) rate. Converting that to a double-pumped rate involves dividing by 2: 11008 / 2 = 5504, approximately 5500 MHz.
This suggests that the 2080Ti memory bandwidth is about 7000/5500 times the memory bandwidth of the 1080Ti, since both use a 352-bit bus width.
2080Ti:
https://www.techpowerup.com/gpu-specs/geforce-rtx-2080-ti.c3305
1080Ti:
https://www.techpowerup.com/gpu-specs/geforce-gtx-1080-ti.c2877
616/484 = 1.27 (memory bandwidth ratio)
7000/5500 = 1.27 (clock ratio)
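A quick arithmetic check of those two ratios (a sketch in Python; the bandwidth and clock figures are taken from the techpowerup pages linked above):

```python
# Published memory bandwidths (GB/s), from the techpowerup pages above
bw_2080ti = 616
bw_1080ti = 484

# Double-pumped (DDR) equivalent memory clocks (MHz)
clk_2080ti = 7000
clk_1080ti = 11008 / 2  # single-pumped 11008 MHz -> double-pumped 5504 MHz

print(f"bandwidth ratio: {bw_2080ti / bw_1080ti:.2f}")  # ~1.27
print(f"clock ratio:     {clk_2080ti / clk_1080ti:.2f}")  # ~1.27
```

Both ratios come out to about 1.27, as expected.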
The 1376 number is just the 11008 number divided by 8. All modern clocking systems involve a base frequency that is multiplied up to give the actual clocking observed on the bus. The multiplier here (8) isn’t that important, and neither is the base frequency. The effective frequency is what is most conveniently used to compute available bandwidth. Make sure to scale correctly for DDR or SDR effective, and if comparing clocks, make sure to compare the same clocks (DDR to DDR or SDR to SDR).
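For example, the relationship between the base and effective numbers on the 1080Ti page:

```python
base_mhz = 1376     # base memory clock shown on the techpowerup page
multiplier = 8      # the multiplier itself isn't important
effective_mhz = base_mhz * multiplier
print(effective_mhz)  # 11008, the "effective" figure on the page
```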
Bandwidth calculation (2080Ti example):
7000 (effective DDR MHz) * 352 (bus width in bits) * 2 (bits per DDR clock) / 8 (bits per byte) = 616 GB/s (the published 2080Ti memory bandwidth)
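The same calculation as a sketch in Python, applied to both cards (using each card's effective rate with the matching transfers-per-clock factor):

```python
def bandwidth_gb_s(effective_mhz, bus_width_bits, transfers_per_clock):
    """Memory bandwidth in GB/s from an effective memory clock in MHz."""
    return effective_mhz * 1e6 * bus_width_bits * transfers_per_clock / 8 / 1e9

# 2080Ti: 7000 MHz is a double-pumped (DDR) rate -> 2 transfers per clock
print(bandwidth_gb_s(7000, 352, 2))   # 616.0 GB/s

# 1080Ti: 11008 MHz is a single-pumped (SDR) rate -> 1 transfer per clock
print(bandwidth_gb_s(11008, 352, 1))  # ~484.4 GB/s
```

Both results match the published bandwidth figures (616 GB/s and 484 GB/s).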