RDMA performance improvement with VMA offload

I hope this post finds you well.

I can see the performance improvement in sockperf when linking the VMA userspace library. However, when I try VMA offload with an RDMA application such as “ibv_uc_pingpong” (with IBV_FORK_SAFE set to 1), I see no performance improvement at all. In fact, for the ibv_uc_pingpong application, average latency and throughput with VMA are 3x worse than without VMA.
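For reference, here is a sketch of how the two runs are typically invoked with the VMA preload library. The library path, IP address, port, and device name below are assumptions for illustration, not taken from my actual setup; adjust them for your system.

```shell
# Sketch of the two test invocations (library path, address, port, and
# device name are placeholder assumptions).
VMA_LIB=/usr/lib64/libvma.so

if [ -e "$VMA_LIB" ]; then
    # sockperf uses the standard socket API, which the preloaded VMA
    # library can intercept and offload:
    LD_PRELOAD="$VMA_LIB" sockperf ping-pong -i 192.168.1.10 -p 11111

    # ibv_uc_pingpong issues native verbs calls directly rather than
    # socket calls; run here with fork safety enabled as in my tests:
    IBV_FORK_SAFE=1 LD_PRELOAD="$VMA_LIB" ibv_uc_pingpong -d mlx5_0 192.168.1.10
else
    echo "libvma.so not found at $VMA_LIB; install MLNX_OFED/VMA first"
fi
```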

My client and server both have a 100G ConnectX-6 Dx EN adapter card (MT4125 - MCX623106AN-CDAT).

MLNX_OFED Version: 4.9-

VMA Version: 9.0.2-1

The VMA User Manual mentions that code implemented with the native RDMA verbs API can be run with the VMA library, which presents the standard socket API to the application. However, I see no performance improvement with the ibv_uc_pingpong application.

I’d really appreciate any clarifications, tips, or advice in this regard.

Thank you, in advance!


Hi Hamed,

Thank you for contacting NVIDIA Technologies Technical Support.

For VMA performance testing, we suggest you follow the guidance below:


Best Regards