HDR Infiniband and ConnectX-6 VPI interfaces

Using infiniband-diags (35), we see roughly twice the latency the equipment is rated for:

[root@ne09 ~]# ib_send_lat ne09-ib

#bytes  #iterations  t_min[usec]  t_max[usec]  t_typical[usec]  t_avg[usec]  t_stdev[usec]  99% percentile[usec]  99.9% percentile[usec]
2       1000         1.42         3.63         1.46             1.47         0.05           1.55                  3.63
2       1000         1.23         2.97         1.29             1.31         0.06           1.42                  2.97
2       1000         1.23         3.41         1.28             1.30         0.08           1.40                  3.41
2       1000         1.29         3.93         1.41             1.41         0.08           1.50                  3.93
2       1000         1.19         3.12         1.22             1.23         0.06           1.31                  3.12

We are not seeing the expected scaling performance in our application, which led us to investigate latency.

The interfaces are rated at 0.6 usec, but we are seeing 1.2-1.4 usec (from above).

This is with both the OFED mlx5.5 drivers from Mellanox and the mlx5.0 drivers from CentOS 8.2.

Which tool is recommended to measure latency? Am I interpreting the output of ib_send_lat correctly?

Thanks in advance,

Anne Hammond

Hi Hammond,

Your understanding is right, but you need to include more components in the latency calculation.

For example, if your lab topology is like below:

CX6-------switch------switch------CX6

then the one-way latency is roughly: CX6 (600 ns) + switch port-to-port latency (90 ns) + switch port-to-port latency (90 ns) + CX6 (600 ns) + cable latency (for fiber, about 5 ns per meter) + 2-byte serialization/de-serialization.

I think 1.2-1.4 usec is reasonable.
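As a quick sanity check, the per-hop figures above can simply be summed. The 600 ns adapter and 90 ns switch numbers come from the topology sketch; the cable length is an assumption for illustration (the thread does not give one):

```python
# One-way latency budget for CX6 --- switch --- switch --- CX6,
# using the per-hop figures from the reply (all values in nanoseconds).
nic_ns = 600          # ConnectX-6 adapter latency, one per end
switch_ns = 90        # switch port-to-port latency, one per hop
cable_m = 10          # assumed total fiber length (not stated in the thread)
cable_ns_per_m = 5    # ~5 ns per meter of fiber

total_ns = 2 * nic_ns + 2 * switch_ns + cable_m * cable_ns_per_m
print(total_ns / 1000.0, "usec")  # ~1.43 usec, in line with the observed 1.2-1.4
```

This ignores the 2-byte serialization/de-serialization term, which is small at HDR rates, so the measured 1.2-1.4 usec is consistent with the topology rather than a fault in the adapters.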

But if you want to measure it more accurately, please refer to the document below, "Performance Tuning for Mellanox Adapters":

Performance Tuning for Mellanox Adapters | Salesforce