FEC Mode on Mellanox MQM8700

Hi,
I have a Mellanox MQM8700 InfiniBand switch.
When I run ibdiagnet and look at the output of

cat /var/tmp/ibdiagnet2/ibdiagnet2.net_dump

I can see that all ConnectX-6 nodes negotiate the FEC mode

MLNX_RS_544_514_PLR, which results in Retran = PLR.

The nodes with a ConnectX-5 card negotiate STD-LL-RS, which results in Retran = NO-RTR.
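
In case it helps, this is roughly how I filter those columns out of the dump; the awk field numbers simply follow the column layout shown below and may need adjusting with other ibdiagnet versions:

# Print port, FEC mode, Retran and neighbor description for links that are up
awk -F' : ' '/LINK UP/ {print $1, $8, $9, $13}' /var/tmp/ibdiagnet2/ibdiagnet2.net_dump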

# : IB# : Sta : PhysSta : MTU : LWA : LSA : FEC mode : Retran : Neighbor Guid : N# : NLID : Neighbor Description

1 : 1 : DOWN : POLL : N/A : N/A : N/A : N/A : N/A : : : :
2 : 2 : ACT : LINK UP : 5 : 4x : 25 : STD-LL-RS : NO-RTR : 0xb83fd20300be963a : B129D0F0/0/0 : 11 : "bee-meta HCA-1"

6 : 6 : ACT : LINK UP : 5 : 4x : 25 : STD-LL-RS : NO-RTR : 0xb83fd20300be95fa : B42D0F0/0/0 : 10 : "gpu0 HCA-1"
7 : 7 : ACT : LINK UP : 5 : 4x : 25 : STD-LL-RS : NO-RTR : 0xb83fd20300be963b : B129D0F1/1/1 : 4 : "master mlx5_0"
8 : 8 : ACT : LINK UP : 5 : 4x : 25 : STD-LL-RS : NO-RTR : 0xb83fd20300be95d2 : B129D0F0/0/0 : 12 : "login mlx5_0"

17 : 17 : ACT : LINK UP : 5 : 4x : 50 : MLNX_RS_544_514_PLR : PLR : 0xb8cef60300f8bf42 : B99D0F0/0/0 : 13 : "gpu2 HCA-1"
18 : 18 : ARM : LINK UP : 5 : 4x : 50 : MLNX_RS_544_514_PLR : PLR : 0xb83fd20300721c36 : B99D0F0/0/0 : 8 : "node3 mlx5_0"
19 : 19 : ACT : LINK UP : 5 : 4x : 50 : MLNX_RS_544_514_PLR : PLR : 0xb83fd20300a6c50c : B99D0F0/0/0 : 7 : "gpu1 HCA-1"
20 : 20 : ACT : LINK UP : 5 : 4x : 50 : MLNX_RS_544_514_PLR : PLR : 0xb83fd20300a6c81c : B99D0F0/0/0 : 3 : "node4 mlx5_0"
21 : 21 : ACT : LINK UP : 5 : 4x : 25 : STD-LL-RS : NO-RTR : 0xb83fd20300be8bc2 : B129D0F0/0/0 : 6 : "beegfs-storage1 HCA-1"
22 : 22 : ACT : LINK UP : 5 : 4x : 50 : MLNX_RS_544_514_PLR : PLR : 0xb8cef603003133be : 1 : 9 : "hpc-nfs mlx5_0"

The problem is that when I run dd write tests against the BeeGFS storage, I get 6 Gb/s from the ConnectX-5 nodes, which negotiate the STD-LL-RS (Reed-Solomon) FEC mode. From the ConnectX-6 nodes I only get around 3 Gb/s.
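
For reference, the write test is along these lines; the mount point /mnt/beegfs and the file size are just examples from my setup:

# Sequential 10 GiB write to the BeeGFS mount, bypassing the page cache
dd if=/dev/zero of=/mnt/beegfs/ddtest.bin bs=1M count=10240 oflag=direct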

Normally the ConnectX-6 cards should also negotiate STD-LL-RS, since I am using copper cables that are no longer than 2 meters.
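
On the HCA side I can cross-check what FEC the port reports with mlxlink from the MFT package; mlx5_0 is just an example device name and the available options may differ between MFT versions:

# Show the FEC modes the port supports and the FEC currently active (assumes MFT is installed)
mlxlink -d mlx5_0 --show_fec
# Some MFT versions also accept a --fec override to force a mode; check mlxlink --help before using it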

Switch version:

Module   Device   Version
MGMT     QTM      27.2000.2626

[standalone: master] (config) # show version
Product name: MLNX-OS
Product release: 3.8.2102
Build ID: 1-dev
Build date: 2019-11-26 21:48:40
Target arch: x86_64
Target hw: x86_64
Built by: jenkins@c776fa44be2b
Version summary: X86_64 3.8.2102 2019-11-26 21:48:40 x86_64

Product model: x86onie
Host ID: B8CEF61C83A6
System serial num: MT2102J18745
System UUID: fac8642e-55d0-11eb-8000-b8cef61db19a

Thanks in Advance

Kind Regards

Hello mlnxqm8700,

Thank you for posting your inquiry in the Infrastructure and Networking section of the NVIDIA Developer Forum.

FEC mode in InfiniBand on our switches and adapters is negotiated automatically based on cable length and type. This process is handled by the PLR layer.

The most important step to ensure the link negotiates properly is to align the firmware versions between the switch and the HCAs. In this case you are running outdated switch code; recent releases include PLR improvements. Since the QM8700 and the ConnectX-6 are HDR components, their FEC negotiation differs from the ConnectX-5, which is EDR. Different adapters and architectures behave differently.
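
As a rough sketch of what aligning the versions involves, you can compare the firmware reported on each node with the switch output you posted; ibstat ships with the driver and mlxfwmanager is part of MFT:

# Firmware currently running on the local HCAs
ibstat | grep -E "CA '|Firmware version"

# Query the installed adapters and, optionally, pull matching firmware online
mlxfwmanager --query
# mlxfwmanager --online -u    (performs the update; requires internet access and a maintenance window)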

The performance issue you are experiencing looks more related to node tuning than to link issues, since you mentioned you only use short cables.

As a first step, align all MLNX-OS, firmware and driver versions, and then apply the proper node tuning for the CPUs used in the nodes. Intel and AMD nodes require different tuning settings.
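
As an illustration only, since the right profile depends on the CPU vendor and workload, this kind of node tuning typically looks like the following; mlnx_tune ships with MLNX_OFED and the profile name here is just an example:

# Report the current tuning status, then apply a throughput-oriented profile
mlnx_tune -r
mlnx_tune -p HIGH_THROUGHPUT

# Keep the CPUs in the performance governor so frequency scaling does not limit the test
cpupower frequency-set -g performance

# Confirm the HCA negotiated the expected PCIe speed and width
lspci -vv -s $(lspci | awk '/Mellanox/ {print $1; exit}') | grep -i LnkSta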

Thank you and regards,
~NVIDIA Networking Technical Support

Hello,

thanks for the response. Do you have a link to documentation on the node tuning?

Kind Regards
