I ran samples/1_Utilities/bandwidthTest/bandwidthTest on a computer with a single 32 GB Quadro GV100 installed, and the result is as below:
[CUDA Bandwidth Test] - Starting…
Running on…
Device 0: Quadro GV100
Quick Mode
Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 12.3
Device to Host Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 13.2
Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 539.0
Result = PASS
I don’t know why the device-to-device memory bandwidth (539.0 GB/s) is so much lower than what the spec sheet says (around 900 GB/s).
I see similarly low memory bandwidth in other programs too, so I guess the problem is not specific to the sample benchmark.
I also fixed the ‘Graphics’ and ‘Memory’ clocks to 1132 MHz and 850 MHz respectively, which is confirmed under ‘Clocks’ in nvidia-smi -q.
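As a rough cross-check of the spec-sheet figure, the theoretical peak at the 850 MHz memory clock can be estimated as in the sketch below. The 4096-bit HBM2 bus width and the double-data-rate factor are assumptions on my part; they do not come from the nvidia-smi output.

```python
# Rough theoretical peak DRAM bandwidth at the locked memory clock.
# ASSUMPTIONS (not taken from the nvidia-smi output): a 4096-bit HBM2
# interface and two transfers per clock (double data rate).
MEM_CLOCK_HZ = 850e6        # memory clock reported under 'Clocks'
BUS_WIDTH_BITS = 4096       # assumed memory bus width
TRANSFERS_PER_CLOCK = 2     # assumed double data rate


def peak_bandwidth_gbs(clock_hz, bus_bits, transfers_per_clock):
    """Return the peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return clock_hz * transfers_per_clock * (bus_bits / 8) / 1e9


peak = peak_bandwidth_gbs(MEM_CLOCK_HZ, BUS_WIDTH_BITS, TRANSFERS_PER_CLOCK)
measured = 539.0            # device-to-device result from bandwidthTest
print(f"theoretical peak: {peak:.1f} GB/s")
print(f"measured: {measured:.1f} GB/s ({100 * measured / peak:.0f}% of peak)")
```

Under these assumptions the peak is about 870 GB/s, so the measured 539.0 GB/s is roughly 62% of it.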
nvidia-smi -q also reports a link width of 16x and PCIe generation 3.
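For the host transfers, the ceiling implied by that link can be estimated in the same way. This is a minimal sketch assuming the standard PCIe 3.0 figures of 8 GT/s per lane and 128b/130b encoding, and ignoring packet and protocol overhead, so real transfers will land somewhat below it.

```python
# Rough theoretical PCIe bandwidth per direction for the reported link.
# ASSUMPTIONS (standard PCIe 3.0 figures, not from the output above):
# 8 GT/s per lane and 128b/130b line encoding; protocol overhead ignored.
TRANSFERS_PER_S_PER_LANE = 8e9
ENCODING_EFFICIENCY = 128 / 130
LANES = 16                  # link width reported by nvidia-smi -q


def pcie_peak_gbs(transfers_per_s, lanes, efficiency):
    """Return the per-direction link bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return transfers_per_s * lanes * efficiency / 8 / 1e9


pcie_peak = pcie_peak_gbs(TRANSFERS_PER_S_PER_LANE, LANES, ENCODING_EFFICIENCY)
print(f"PCIe 3.0 x16 peak: {pcie_peak:.2f} GB/s per direction")

# Pinned-memory results from bandwidthTest.
for name, measured_gbs in (("Host to Device", 12.3), ("Device to Host", 13.2)):
    print(f"{name}: {measured_gbs:.1f} GB/s "
          f"({100 * measured_gbs / pcie_peak:.0f}% of link peak)")
```

By this estimate the link tops out a little under 16 GB/s per direction, so the 12.3 and 13.2 GB/s results are not far from what the link can carry.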
The driver version is 455.23.05, and I tested under Ubuntu 18.04.5 LTS with Linux kernel 5.4.0-48.
How can I deal with this problem? Is this normal behavior?