Bandwidth test result for A800 does not meet expectations

peterzhang · August 22, 2023, 6:19am

We have a physical compute node with 8 A800 GPUs. Its output of “nvidia-smi topo -m” is shown below:

However, when we created a virtual machine with 4 A800 GPUs on that node and ran a benchmark test on it, the results were lower than expected. As shown in the figure above, the bidirectional bandwidth was close to 100 GB/s and the unidirectional bandwidth was close to 50 GB/s.

This confuses me, because in the 4-GPU virtual machine each GPU has 4 NVLinks. Each NVLink supports 50 GB/s of bidirectional bandwidth, so the bidirectional (read and write) bandwidth of a single GPU should be close to 50 GB/s * 4 = 200 GB/s, and the unidirectional bandwidth should be close to 25 GB/s * 4 = 100 GB/s.
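To make the arithmetic explicit, here is a minimal Python sketch of the calculation above. It only restates the numbers from this post (4 links per GPU, 50 GB/s bidirectional per link, roughly 100 GB/s measured). The function names are my own, chosen for illustration, and nothing here queries the hardware.

```python
# Figures taken from the post; they are assumptions, not values read from the GPU.
LINK_BIDIR_GBPS = 50.0      # GB/s per NVLink, both directions combined
LINKS_PER_GPU = 4           # NVLinks per GPU in the 4-GPU virtual machine
MEASURED_BIDIR_GBPS = 100.0 # approximate bidirectional result from the benchmark


def expected_bandwidth(links, link_bidir_gbps):
    """Return (unidirectional, bidirectional) aggregate bandwidth in GB/s."""
    bidir = links * link_bidir_gbps
    return bidir / 2, bidir


def equivalent_links(measured_bidir_gbps, link_bidir_gbps):
    """Return how many links' worth of bandwidth a measured figure corresponds to."""
    return measured_bidir_gbps / link_bidir_gbps


if __name__ == "__main__":
    uni, bidir = expected_bandwidth(LINKS_PER_GPU, LINK_BIDIR_GBPS)
    print(f"expected: {uni:.0f} GB/s unidirectional, {bidir:.0f} GB/s bidirectional")
    links = equivalent_links(MEASURED_BIDIR_GBPS, LINK_BIDIR_GBPS)
    print(f"measured result corresponds to about {links:.0f} links")
```

By this calculation the measured result is about half of what I expected, which is the gap I am asking about.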

Is there something wrong with my interpretation of aggregate bandwidth on the A800, or with my understanding of NVLink?
