Speed problem on new MT28908 cards

We have just racked 16 new Dell compute nodes with InfiniBand cards that identify as MT28908:

e2:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]

These are connected to an unmanaged FDR-capable switch identifying as a ‘SwitchX - Mellanox Technologies’.

All the new nodes only achieve SDR speeds:

[root@node098 ~]# ibstatus
Infiniband device 'mlx5_0' port 1 status:
    default gid:  fe80:0000:0000:0000:1c34:da03:0050:6600
    base lid:     0x2c4
    sm lid:       0xa8
    state:        4: ACTIVE
    phys state:   5: LinkUp
    rate:         10 Gb/sec (4X SDR)
    link_layer:   InfiniBand

Old nodes attached to the same switch get FDR.
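To compare many nodes at once, the rate string can be parsed and checked against the expected speed. This is a minimal sketch in Python, assuming the rate always has the form reported by ibstatus (for example `10 Gb/sec (4X SDR)`). The names `parse_rate`, `is_degraded`, and `LANE_GBPS` are hypothetical helpers and not part of any Mellanox tool. The table holds the nominal per-lane signalling rates for speeds up to FDR only.

```python
import re

# Nominal per-lane signalling rates in Gb/s. Only the speeds relevant to an
# FDR fabric are listed; any other speed name raises KeyError in is_degraded.
LANE_GBPS = {
    "SDR": 2.5,
    "DDR": 5.0,
    "QDR": 10.0,
    "FDR10": 10.3125,
    "FDR": 14.0625,
}

# Matches strings such as "10 Gb/sec (4X SDR)".
RATE_RE = re.compile(r"(\d+(?:\.\d+)?) Gb/sec \((\d+)X (\w+)\)")


def parse_rate(text):
    """Return (gbps, lanes, speed) from a rate string or a full 'rate:' line."""
    match = RATE_RE.search(text)
    if match is None:
        raise ValueError(f"unrecognised rate string: {text!r}")
    return float(match.group(1)), int(match.group(2)), match.group(3)


def is_degraded(text, expected="FDR"):
    """True if the negotiated per-lane speed is below the expected speed."""
    _, _, speed = parse_rate(text)
    return LANE_GBPS[speed] < LANE_GBPS[expected]


if __name__ == "__main__":
    line = "rate: 10 Gb/sec (4X SDR)"
    gbps, lanes, speed = parse_rate(line)
    print(f"{lanes} lanes of {speed}: {gbps} Gb/s, degraded={is_degraded(line)}")
```

The same string is also exposed per port under `/sys/class/infiniband/<device>/ports/<port>/rate`, so the helper could be fed from there instead of from ibstatus.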

Upgrading the card firmware made no difference. mlxlink reports:

[root@node098 ~]# mlxlink -d e2:00.0

Operational Info
State                  : Active
Physical state         : LinkUp
Speed                  : IB-SDR
Width                  : 4x
FEC                    : Firecode FEC
Loopback Mode          : No Loopback
Auto Negotiation       : ON

Supported Info
Enabled Link Speed     : 0x00000011 (FDR,SDR)
Supported Cable Speed  : 0x0000001f (FDR,FDR10,QDR,DDR,SDR)

Troubleshooting Info
Status Opcode          : 35
Group Opcode           : PHY FW
Recommendation         : The active speed was degraded from maximal possible speed due to peer signal integrity issue.
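The two hexadecimal masks in the "Supported Info" section can be decoded to see which speeds the port and the cable have in common. This is a rough sketch in Python. The bit order is inferred only from the two masks printed above (`0x11` shown as FDR,SDR and `0x1f` shown as FDR,FDR10,QDR,DDR,SDR), assuming mlxlink lists names from the highest bit to the lowest. It is not taken from Mellanox documentation, and `SPEED_BITS` and `decode_speeds` are hypothetical names.

```python
# Bit 0 is the first entry. This mapping is an assumption inferred from the
# mlxlink output in this post, not from a published register layout.
SPEED_BITS = ["SDR", "DDR", "QDR", "FDR10", "FDR"]


def decode_speeds(mask):
    """Return the speed names set in mask, fastest first."""
    names = [name for bit, name in enumerate(SPEED_BITS) if mask & (1 << bit)]
    return names[::-1]


if __name__ == "__main__":
    enabled = decode_speeds(0x00000011)  # Enabled Link Speed
    cable = decode_speeds(0x0000001F)    # Supported Cable Speed
    common = [speed for speed in enabled if speed in cable]
    print("enabled on port:", ",".join(enabled))
    print("supported by cable:", ",".join(cable))
    print("common to both:", ",".join(common))
```

Under that assumed mapping, FDR is both enabled on the port and supported by the cable, so the fall-back to SDR does not look like a speed-mask restriction on the card side. That would be consistent with the signal integrity recommendation mlxlink prints.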

Any suggestions for how to troubleshoot this?

Hello Martin,

Thank you for posting your inquiry on the NVIDIA Networking Community.

Based on the information provided, please make sure that all ConnectX-6 VPI HCAs and the FDR switches are running the latest firmware available. You can download the latest firmware through the following link → https://www.mellanox.com/support/firmware/firmware-downloads

Furthermore, when connecting HDR components to FDR components, please make sure you are following the recommended connectivity matrix → https://docs.mellanox.com/display/ConnectX6Firmwarev20281002/Firmware+Compatible+Products#FirmwareCompatibleProducts-SwitchandHCAsInfiniBandCableConnectivityMatrix

This matrix also lists the supported cables you need to use to connect the ConnectX-6 VPI HCA to an FDR switch (SwitchX) → https://docs.mellanox.com/display/ConnectX6Firmwarev20281002/Firmware+Compatible+Products#FirmwareCompatibleProducts-ValidatedandSupportedFDRCables

Please also be aware that when using a ConnectX-6 dual-port VPI card (one port IB, one port Ethernet), FDR speed/connectivity is not supported and the link will come up as QDR/SDR, as mentioned in the section “VPI Protocol Support”.

Thank you and regards,

~NVIDIA Networking Technical Support