Mellanox switch and Mellanox NIC performance issues

Hi all,

We are working with a setup in which 3 Mellanox cards (device name mlx5_0) are connected through a Mellanox switch (MSN2100-CB2F: Model MSN2100, Spectrum™-based 100GbE, 1U Open Ethernet Switch with MLNX-OS, 16 QSFP28 ports, 2 AC PSUs, x86 2-core).

When we send traffic from two Mellanox hosts (in client mode) to one Mellanox host (in server mode) using perftest (ib_write_bw), we observe that the performance of one Mellanox client drops to ~3 to 4 Gbps, while the other Mellanox client gives ~85 Gbps.

We also observed that on one of the Mellanox clients the pkt_Seq_err counter changes from iteration to iteration.
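In case it helps anyone reproduce this, here is a minimal Python sketch of how the counter change could be measured around a single ib_write_bw run. It assumes the mlx5 driver exposes one text file per counter under /sys/class/infiniband/mlx5_0/ports/1/hw_counters and that the counter file is named packet_seq_err; the device, port and counter names are assumptions and should be adjusted to the host. The helper names read_counters and counter_deltas are only illustrative.

```python
from pathlib import Path

# Assumption: per-port hardware counters live here as plain text files.
# Change the device (mlx5_0) and port (1) to match the host under test.
COUNTER_DIR = Path("/sys/class/infiniband/mlx5_0/ports/1/hw_counters")

# Assumption: counter file names; packet_seq_err is the one discussed above.
COUNTER_NAMES = ["packet_seq_err", "out_of_sequence"]


def read_counters(counter_dir, names):
    """Return {name: value} for every counter file that exists and parses."""
    values = {}
    for name in names:
        path = Path(counter_dir) / name
        try:
            values[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            # Missing or unreadable counter: skip it instead of failing.
            continue
    return values


def counter_deltas(before, after):
    """Return how much each counter present in both snapshots increased."""
    return {name: after[name] - before[name] for name in before if name in after}


# Usage sketch: take a snapshot, run one ib_write_bw iteration, take another.
#   before = read_counters(COUNTER_DIR, COUNTER_NAMES)
#   ... run the test ...
#   after = read_counters(COUNTER_DIR, COUNTER_NAMES)
#   print(counter_deltas(before, after))
```

Comparing the per-iteration deltas on both clients should show whether the sequence errors only grow on the slow client.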

We carried out this experiment with PFC enabled on all the Mellanox hosts.

Can anyone suggest why this is happening?

Hi Pkashire,

Have you enabled any sort of congestion control mechanism on the switch side?
I am not sure whether you are using Onyx or Cumulus, but please review:

https://docs.nvidia.com/networking-ethernet-software/cumulus-linux-50/Layer-1-and-Switch-Ports/Quality-of-Service/RDMA-over-Converged-Ethernet-RoCE/

Or, for Onyx:

https://docs.nvidia.com/networking/pages/viewpage.action?pageId=71023238