The bandwidth measured by my test code does not match the bandwidth measured by perftest

To learn RDMA, I found an example on the Internet that is similar to the one provided by MELLANOX, but when I ran it on two machines I hit the following problems:

1. There is a large gap between the bandwidth measured by the example code and the bandwidth measured by perftest.

2. In addition, using GID index 0 or 2 on either of the two machines significantly reduces the bandwidth (see the sketch below for where the GID index is applied).
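For context, the GID index is applied when moving the RC QP to RTR; a simplified sketch of that step is shown below. Names such as remote_qpn, remote_psn, remote_gid and gid_index are placeholders for values exchanged out of band, not identifiers from the original example.

/* Minimal sketch of where the GID index matters: the RTR transition.
 * remote_qpn, remote_psn and remote_gid are assumed to have been
 * exchanged out of band; names are illustrative only. */
#include <stdint.h>
#include <string.h>
#include <infiniband/verbs.h>

static int qp_to_rtr(struct ibv_qp *qp, uint8_t port, int gid_index,
                     uint32_t remote_qpn, uint32_t remote_psn,
                     union ibv_gid remote_gid, enum ibv_mtu mtu)
{
    struct ibv_qp_attr attr;
    memset(&attr, 0, sizeof(attr));

    attr.qp_state           = IBV_QPS_RTR;
    attr.path_mtu           = mtu;            /* should not exceed active_mtu */
    attr.dest_qp_num        = remote_qpn;
    attr.rq_psn             = remote_psn;
    attr.max_dest_rd_atomic = 16;             /* allow several outstanding reads */
    attr.min_rnr_timer      = 12;

    /* RoCE: the LID stays 0 and routing uses the GRH, so the local GID index
     * (and therefore which entry of the GID table above is used) is set here. */
    attr.ah_attr.is_global      = 1;
    attr.ah_attr.port_num       = port;
    attr.ah_attr.grh.dgid       = remote_gid;
    attr.ah_attr.grh.sgid_index = (uint8_t)gid_index;  /* 0..3 in the tables below */
    attr.ah_attr.grh.hop_limit  = 1;

    return ibv_modify_qp(qp, &attr,
                         IBV_QP_STATE | IBV_QP_AV | IBV_QP_PATH_MTU |
                         IBV_QP_DEST_QPN | IBV_QP_RQ_PSN |
                         IBV_QP_MAX_DEST_RD_ATOMIC | IBV_QP_MIN_RNR_TIMER);
}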

Machine A:

Configuration:

hca_id: mlx5_bond_0
        transport:                      InfiniBand (0)
        fw_ver:                         20.39.3004
        node_guid:                      1070:fd03:00e5:f118
        sys_image_guid:                 1070:fd03:00e5:f118
        vendor_id:                      0x02c9
        vendor_part_id:                 4123
        hw_ver:                         0x0
        board_id:                       MT_0000000224
        phys_port_cnt:                  1
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                4096 (5)
                        active_mtu:             1024 (3)
                        sm_lid:                 0
                        port_lid:               0
                        port_lmc:               0x00
                        link_layer:             Ethernet

DEV     PORT    INDEX   GID                                     IPv4            VER     DEV
---     ----    -----   ---                                     ------------    ---     ---
mlx5_bond_0     1       0       fe80:0000:0000:0000:b0fc:4eff:feb3:1112                 v1      bond0
mlx5_bond_0     1       1       fe80:0000:0000:0000:b0fc:4eff:feb3:1112                 v2      bond0
mlx5_bond_0     1       2       0000:0000:0000:0000:0000:ffff:0a77:2e3d 10.119.46.61    v1      bond0
mlx5_bond_0     1       3       0000:0000:0000:0000:0000:ffff:0a77:2e3d 10.119.46.61    v2      bond0

Test with perftest on GID index 1:

---------------------------------------------------------------------------------------
                    RDMA_Read BW Test
RX depth:               1
post_list:              1
inline_size:            0
 Dual-port       : OFF          Device         : mlx5_bond_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 PCIe relax order: ON
 ibv_wr* API     : ON
 CQ Moderation   : 1
 Mtu             : 1024[B]
 Link type       : Ethernet
 GID index       : 1
 Outstand reads  : 16
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0000 QPN 0x1659 PSN 0xd4858a OUT 0x10 RKey 0x203e00 VAddr 0x007f38d0d07000
 GID: 254:128:00:00:00:00:00:00:176:252:78:255:254:179:17:18
 remote address: LID 0000 QPN 0x1c86 PSN 0xc2e51a OUT 0x10 RKey 0x013f00 VAddr 0x007f123fc62000
 GID: 254:128:00:00:00:00:00:00:100:155:154:255:254:172:09:41
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[MiB/sec]    BW average[MiB/sec]   MsgRate[Mpps]
 65536      1000             10829.53            10829.17       0.173267
---------------------------------------------------------------------------------------

Machine B:

hca_id: mlx5_bond_0
        transport:                      InfiniBand (0)
        fw_ver:                         20.39.3004
        node_guid:                      e8eb:d303:0032:b212
        sys_image_guid:                 e8eb:d303:0032:b212
        vendor_id:                      0x02c9
        vendor_part_id:                 4123
        hw_ver:                         0x0
        board_id:                       MT_0000000224
        phys_port_cnt:                  1
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                4096 (5)
                        active_mtu:             1024 (3)
                        sm_lid:                 0
                        port_lid:               0
                        port_lmc:               0x00
                        link_layer:             Ethernet

DEV     PORT    INDEX   GID                                     IPv4              VER     DEV
---     ----    -----   ---                                     ------------      ---     ---
mlx5_bond_0     1       0       fe80:0000:0000:0000:649b:9aff:feac:0929                   v1      bond0
mlx5_bond_0     1       1       fe80:0000:0000:0000:649b:9aff:feac:0929                   v2      bond0
mlx5_bond_0     1       2       0000:0000:0000:0000:0000:ffff:0a77:2e3e   10.119.46.62    v1      bond0
mlx5_bond_0     1       3       0000:0000:0000:0000:0000:ffff:0a77:2e3e   10.119.46.62    v2      bond0
n_gids_found=4

Test with perftest on GID index 0:

                    RDMA_Read BW Test
RX depth:               1
post_list:              1
inline_size:            0
 Dual-port       : OFF          Device         : mlx5_bond_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 PCIe relax order: ON
 ibv_wr* API     : ON
 CQ Moderation   : 1
 Mtu             : 1024[B]
 Link type       : Ethernet
 GID index       : 1
 Outstand reads  : 16
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0000 QPN 0x1659 PSN 0xd4858a OUT 0x10 RKey 0x203e00 VAddr 0x007f38d0d07000
 GID: 254:128:00:00:00:00:00:00:176:252:78:255:254:179:17:18
 remote address: LID 0000 QPN 0x1c86 PSN 0xc2e51a OUT 0x10 RKey 0x013f00 VAddr 0x007f123fc62000
 GID: 254:128:00:00:00:00:00:00:100:155:154:255:254:172:09:41
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[MiB/sec]    BW average[MiB/sec]   MsgRate[Mpps]
 65536      1000             10829.53            10829.17       0.173267
---------------------------------------------------------------------------------------

When I run the example code, the bandwidth is about 0.0124 GB/s when one machine uses GID 0 and the other uses GID 0 or GID 1, and about 6 GB/s when both machines use GID 1. I’d like to know what optimizations the perftest code makes, or what deficiencies in the example code cause such a large difference in the measured bandwidth.
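For comparison, my understanding is that perftest keeps many work requests outstanding on the send queue and reaps completions in batches, while a simple example usually posts one request and blocks on its completion before posting the next. Below is a rough sketch of that pipelined pattern; the function and variable names are mine, not taken from perftest or the example, and error handling is minimal.

/* Sketch of a pipelined RDMA Read loop: keep up to `depth` reads in flight
 * and poll the CQ in batches, instead of one post + one blocking poll at a
 * time. qp, cq, local_mr, remote_addr, rkey and msg_size are assumed to be
 * set up elsewhere. */
#include <stddef.h>
#include <stdint.h>
#include <infiniband/verbs.h>

static int run_pipelined_reads(struct ibv_qp *qp, struct ibv_cq *cq,
                               struct ibv_mr *local_mr, uint64_t remote_addr,
                               uint32_t rkey, size_t msg_size,
                               long depth, long iterations)
{
    long posted = 0, completed = 0;
    struct ibv_wc wc[16];

    while (completed < iterations) {
        /* Keep the send queue full: post until `depth` WRs are outstanding. */
        while (posted < iterations && posted - completed < depth) {
            struct ibv_sge sge = {
                .addr   = (uintptr_t)local_mr->addr,
                .length = (uint32_t)msg_size,
                .lkey   = local_mr->lkey,
            };
            struct ibv_send_wr wr = {
                .wr_id      = (uint64_t)posted,
                .sg_list    = &sge,
                .num_sge    = 1,
                .opcode     = IBV_WR_RDMA_READ,
                .send_flags = IBV_SEND_SIGNALED,
                .wr.rdma.remote_addr = remote_addr,
                .wr.rdma.rkey        = rkey,
            };
            struct ibv_send_wr *bad;
            if (ibv_post_send(qp, &wr, &bad))
                return -1;
            posted++;
        }

        /* Reap completions in batches rather than one at a time. */
        int n = ibv_poll_cq(cq, 16, wc);
        if (n < 0)
            return -1;
        for (int i = 0; i < n; i++) {
            if (wc[i].status != IBV_WC_SUCCESS)
                return -1;
            completed++;
        }
    }
    return 0;
}

For RDMA Read in particular, the useful depth on the wire is also bounded by the negotiated number of outstanding reads (shown as "Outstand reads: 16" in the perftest output above), so I would not expect a depth much beyond that to help for this opcode.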