Hello, we have a problem.
There are five servers, each connected to the same 40-port switch through a single 200Gbps IB card (HDR x4).
Let's name the servers A, B, C, D, and E.
Test topology:
C <---- A -----> B
Run the following commands:
- node B: start service on B
- ib_write_bw -S 0 -p 10000 -q 4
- node C: start service on C
- ib_write_bw -S 1 -p 10001 -q 4
- node A: send data to B via SL0 and send data to C via SL1
- ib_write_bw -S 0 -q 4 -p 10000 -D 1 -F --run_infinitely --report_gbits ip_of_B
- ib_write_bw -S 1 -q 4 -p 10001 -D 1 -F --run_infinitely --report_gbits ip_of_C
The test results are as follows:
- A->B SL0 100Gbps
- A->C SL1 100Gbps
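A quick sanity check on the first result (a sketch, assuming A's single HDR x4 link is the only bottleneck and both flows share it evenly):

```python
# Uncongested test: node A's single 200 Gbps link carries both flows,
# A->B on SL0 and A->C on SL1.
LINK_GBPS = 200

a_to_b = 100  # measured, SL0
a_to_c = 100  # measured, SL1

# Together the two flows saturate A's link, split evenly between the SLs.
assert a_to_b + a_to_c == LINK_GBPS
print(f"A's link utilization: {a_to_b + a_to_c} / {LINK_GBPS} Gbps")
```

So in the uncongested case the numbers are fully explained by A's own link capacity.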
Then we add more traffic toward node C, making C a congested node. Test topology:
D
↘
C <---- A -----> B
↗
E
Run the following commands:
- node C: start services to receive data from D and E
- ib_write_bw -S 1 -p 10002 -q 4
- ib_write_bw -S 1 -p 10003 -q 4
- node D: send data to C via SL1
- ib_write_bw -S 1 -q 4 -p 10002 -D 1 -F --run_infinitely --report_gbits ip_of_C
- node E: send data to C via SL1
- ib_write_bw -S 1 -q 4 -p 10003 -D 1 -F --run_infinitely --report_gbits ip_of_C
The test results are as follows:
- A->B SL0 80Gbps
- A->C SL1 66Gbps
- D->C SL1 66Gbps
- E->C SL1 66Gbps
My question is: why is A->B only 80Gbps? Shouldn't it be about 130Gbps (A's 200Gbps link minus the ~66Gbps now taken by the A->C flow)?
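The 130Gbps expectation can be spelled out as a quick calculation (a sketch, assuming C's 200Gbps link is shared fairly among the three SL1 senders A, D, and E, and that A->B can then use whatever is left of A's own 200Gbps link):

```python
# Naive expectation for the congested test.
LINK_GBPS = 200

# C's link split fairly three ways among A, D, E on SL1.
a_to_c_expected = LINK_GBPS / 3            # ~66.7 Gbps, matches the measured 66
# Whatever A->C does not use on A's link should be available to A->B.
a_to_b_expected = LINK_GBPS - a_to_c_expected  # ~133.3 Gbps, i.e. "about 130"

a_to_b_measured = 80
print(f"expected A->B: {a_to_b_expected:.1f} Gbps, "
      f"measured: {a_to_b_measured} Gbps")
```

The measured 80Gbps falls well short of this per-link accounting, which is the gap the question is about.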