I have two DGX A100 machines, each with eight 200G single-port IB network cards. When I use the following command to test the bandwidth between the two machines on several cards at the same time, I do not get the result I expected. I expected all eight network cards to reach 200 Gb/s simultaneously, but the actual rate is much lower.
Both machines are connected to the same QM8790 switch (using 200G AOC cables).
Client: run_perftest_multi_devices -d mlx5_0,mlx5_1,mlx5_2 -C 0,1,2 -cmd "ib_write_bw --report_gbits" --remote 126.96.36.199
I suspect the problem is in how I am using the command, so I would appreciate some guidance on the correct way to test the bandwidth of all 16 IB network cards on the two DGXs at the same time.
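For reference, what I am trying to achieve is roughly one ib_write_bw client per device, all running in parallel. Below is a minimal dry-run sketch of that idea. It only prints the commands and does not run them. The helper name build_cmds, the base TCP port 18515, and the one-core-per-device pinning are assumptions for illustration, not something taken from the perftest script. A matching ib_write_bw server instance per device and port would also be needed on the other DGX.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print one ib_write_bw client command per HCA.
# Assumptions (hypothetical): helper name, base port, and core numbering.
build_cmds() {
  local remote="$1" base_port="$2"
  shift 2
  local i=0 dev
  for dev in "$@"; do
    # Give each instance its own core and its own TCP port so the
    # parallel client/server pairs do not collide with each other.
    echo "taskset -c ${i} ib_write_bw -d ${dev} -p $((base_port + i)) --report_gbits ${remote}"
    i=$((i + 1))
  done
}

# Same three devices and remote address as in the command above.
build_cmds 126.96.36.199 18515 mlx5_0 mlx5_1 mlx5_2
```

Extending the device list to mlx5_0 through mlx5_7 would cover all eight cards, assuming that is how the devices are named on the machine.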