4k IOPS regression when using both ports of CX6/7 NICs?

I'm running performance benchmarks against an NVMe-RoCE array and have noticed that, while throughput increases linearly with the number of NIC ports for large I/O sizes (128KB), 4KB performance decreases when I add the second port of a dual-port CX6 or CX7 device. In the same host, if I instead use a single port on a second NIC, I get nearly double the throughput.

The Linux host has two dual-port CX6 NICs (I have also seen this on CX7), connected at 100G to an Arista switch, which is in turn connected to 16 x 100G ports on the array.

For large I/O, throughput increases linearly as ports are added, regardless of which NIC the ports are on (as expected):

IO size | Ports | NICs | Throughput (MiB/s)
128KB   | 1     | 1    | 11,669
128KB   | 2     | 1    | 23,335
128KB   | 2     | 2    | 23,335
128KB   | 4     | 2    | 46,674

However, for 4KB I/O, using both ports of the same NIC is detrimental:

IO size | Ports | NICs | Throughput (MiB/s)
4KB     | 1     | 1    | 11,226
4KB     | 2     | 1    | 10,996
4KB     | 2     | 2    | 22,036
4KB     | 4     | 2    | 20,170
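
Since the title talks about IOPS while the table reports MiB/s, here is a rough conversion of the 4KB numbers, along with how well each configuration scales relative to the single-port result. This is only a sketch: it assumes "4KB" means 4,096 bytes and that 1 MiB is 1,048,576 bytes, and the function names are just illustrative.

```python
MIB = 1024 * 1024      # bytes per MiB (assumed unit of the table)
IO_SIZE = 4 * 1024     # assumes "4KB" means 4,096 bytes

# (ports, NICs, throughput in MiB/s), copied from the 4KB table above
results_4k = [(1, 1, 11226), (2, 1, 10996), (2, 2, 22036), (4, 2, 20170)]


def iops(mib_per_s, io_size=IO_SIZE):
    """Convert a throughput in MiB/s to I/O operations per second."""
    return mib_per_s * MIB // io_size


def scaling_efficiency(mib_per_s, ports, baseline):
    """Fraction of ideal linear scaling versus the single-port baseline."""
    return mib_per_s / (ports * baseline)


baseline = results_4k[0][2]  # 1 port on 1 NIC
for ports, nics, mib_per_s in results_4k:
    eff = scaling_efficiency(mib_per_s, ports, baseline)
    print(f"{ports} port(s), {nics} NIC(s): "
          f"{iops(mib_per_s):,} IOPS, {eff:.0%} of linear scaling")
```

Under those assumptions, one port delivers about 2.87M IOPS, and both ports of one NIC together deliver about 2.81M, which is roughly 49% of linear scaling. One port on each of two NICs reaches about 98%.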

Given this, if the workload is mainly small I/O, the usefulness of the second port on each NIC is diminished.

Has anyone else experienced this, and if so, is there any tuning that can overcome it?

Hi sarich,

Welcome, and thanks for posting your inquiry to the NVIDIA Developer Forums!

We highly recommend reviewing the adapter tuning guide:
https://nvcrm.my.site.com/ESPCommunity/s/article/performance-tuning-for-mellanox-adapters

And reviewing the RoCE network deployment recommendations guide here:
https://nvcrm.my.site.com/ESPCommunity/s/article/recommended-network-configuration-examples-for-roce-deployment

We would also recommend checking raw point-to-point RDMA performance (e.g., via ib_write_bw) to help isolate the issue.
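
As a rough illustration of that isolation step, the Python sketch below only builds and prints ib_write_bw command lines for a 4KB test on each port; it does not run anything. The device names (mlx5_0, mlx5_1), the TCP ports, the queue-pair count, and the hostname "target-host" are placeholders, not values taken from this thread, so substitute whatever your system reports.

```python
import shlex


def ib_write_bw_cmd(device, tcp_port, size=4096, duration=10, qps=4, server=None):
    """Build an ib_write_bw argument list for one RDMA device.

    With server=None this is the listening (server) side; passing a
    hostname produces the matching client side.
    """
    args = [
        "ib_write_bw",
        "-d", device,          # RDMA device to test (placeholder name)
        "-p", str(tcp_port),   # TCP port used for the out-of-band handshake
        "-s", str(size),       # message size in bytes (4096 to mimic 4KB I/O)
        "-D", str(duration),   # run for a fixed number of seconds
        "-q", str(qps),        # number of queue pairs (arbitrary choice)
        "--report_gbits",      # report bandwidth in Gbit/s
    ]
    if server is not None:
        args.append(server)
    return args


# One instance per port, each on its own TCP port so both can run at once.
for device, tcp_port in [("mlx5_0", 18515), ("mlx5_1", 18516)]:
    print("server:", shlex.join(ib_write_bw_cmd(device, tcp_port)))
    print("client:", shlex.join(ib_write_bw_cmd(device, tcp_port, server="target-host")))
```

Running one port alone and then both ports at once, and comparing the reported message rates, should show whether the drop also appears without NVMe in the path.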

If you have Enterprise Support entitlement, we would also recommend opening a support case so we can assist you further.

Best regards,
NVIDIA Enterprise Experience.
