Migration using RDMA with Mellanox ConnectX-3 Pro EN

Hello everyone,

I have run into a problem with migration over RDMA: the total migration time is slightly longer than over TCP, even though RDMA shows higher raw bandwidth.

Here are the relevant details:

  • Network configuration: Mellanox ConnectX-3 Pro EN 40G NIC
  • Software versions: QEMU 4.2.0, kernel 4.4.0

Steps taken:

  1. Installed MLNX_OFED from NVIDIA as instructed.
  2. Compiled QEMU with the "--enable-rdma" configure flag, which requires the librdmacm and libibverbs libraries to be present.
  3. Verified bandwidth using iperf for both RDMA and TCP. RDMA achieved 26 Gbits/sec, while TCP reached 20 Gbits/sec, demonstrating the expected superiority of RDMA.
  4. Migrated a 30GB VM with the maximum migration speed set to 40G and observed the following total migration times:
    • RDMA: 13531ms
    • TCP: 13483ms
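
As a rough sanity check, the times above can be converted into an average throughput. The sketch below assumes the full 30 GB (decimal gigabytes) of guest RAM was sent over the wire exactly once, which may not hold if zero pages were skipped or dirty pages were resent; the function name is only illustrative.

```python
def effective_gbps(size_gb: float, time_ms: float) -> float:
    """Average throughput in Gbit/s, assuming size_gb decimal gigabytes were sent."""
    bits = size_gb * 1e9 * 8
    seconds = time_ms / 1000.0
    return bits / seconds / 1e9


VM_SIZE_GB = 30  # assumption: all guest RAM is transferred exactly once

for name, ms in (("RDMA", 13531), ("TCP", 13483)):
    print(f"{name}: {effective_gbps(VM_SIZE_GB, ms):.1f} Gbit/s")
```

Under that assumption both runs average a little under 18 Gbit/s, which is below the bandwidth measured for either transport in step 3.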

I’m not sure why RDMA shows no improvement over TCP here despite its higher measured bandwidth. Any insights or suggestions would be greatly appreciated. Thank you.