Colin
April 26, 2021, 6:42pm
1
I have two ConnectX-5 cards connected directly to each other, but the physical state shows Polling. The mlx-related drivers appear to be loaded correctly. I'm new to IB. What did I miss? Thanks.
$ ibstat
CA 'mlx5_1'
CA type: MT4119
Number of ports: 1
Firmware version: 16.27.1016
Hardware version: 0
Node GUID: 0x98039b0300862f75
System image GUID: 0x98039b0300862f74
Port 1:
State: Down
Physical state: Polling
Rate: 10
Base lid: 65535
LMC: 0
SM lid: 0
Capability mask: 0x2651e848
Port GUID: 0x98039b0300862f75
Link layer: InfiniBand
$ lsmod | grep mlx
mlx5_ib 410355 0
mlx5_core 1073216 1 mlx5_ib
mlxfw 19917 1 mlx5_core
ib_uverbs 141173 10 mlx5_ib,rdma_ucm
ib_core 377570 8 ib_cm,rdma_cm,ib_umad,ib_uverbs,ib_ipoib,iw_cm,mlx5_ib,rdma_ucm
mlx_compat 45899 10 ib_cm,rdma_cm,ib_umad,ib_core,ib_uverbs,ib_ipoib,mlx5_core,iw_cm,mlx5_ib,rdma_ucm
Colin
April 26, 2021, 11:49pm
2
I was able to solve it by first upgrading the firmware and then starting opensm. The link state is now "Active" and the physical state is "LinkUp". My question: the card and cable should support 100Gbps, so why does "Rate" show 10?
CA 'mlx5_1'
CA type: MT4119
Number of ports: 1
Firmware version: 16.30.1004
Hardware version: 0
Node GUID: 0x98039b03008f93bd
System image GUID: 0x98039b03008f93bc
Port 1:
State: Active
Physical state: LinkUp
Rate: 10
Base lid: 1
LMC: 0
SM lid: 2
Capability mask: 0x2651e84a
Port GUID: 0x98039b03008f93bd
Link layer: InfiniBand
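Editor's sketch: the three fields that matter in the output above are State, Physical state, and Rate. A minimal Python sketch of checking them automatically follows. It assumes the flat `key: value` layout pasted in this thread; the function names and the embedded sample are illustrative and not part of any Mellanox tool.

```python
import re

# Shortened copy of the ibstat output quoted in this thread.
SAMPLE = """\
CA 'mlx5_1'
CA type: MT4119
Number of ports: 1
Port 1:
State: Active
Physical state: LinkUp
Rate: 10
Port GUID: 0x98039b03008f93bd
Link layer: InfiniBand
"""


def parse_ibstat(text):
    """Return {ca_name: {port_number: {field: value}}} from ibstat-style text."""
    cas = {}
    ca = port = None
    for raw in text.splitlines():
        line = raw.strip()
        # Accept straight or curly quotes, since forum software may curl them.
        m = re.match(r"CA ['‘’](.+?)['‘’]$", line)
        if m:
            ca = cas.setdefault(m.group(1), {})
            port = None
            continue
        m = re.match(r"Port (\d+):$", line)
        if m and ca is not None:
            port = ca.setdefault(int(m.group(1)), {})
            continue
        # Only fields that follow a "Port N:" header are recorded.
        if port is not None and ":" in line:
            key, _, value = line.partition(":")
            port[key.strip()] = value.strip()
    return cas


def link_problems(port, expected_rate):
    """List human-readable problems for one parsed port."""
    problems = []
    if port.get("State") != "Active":
        problems.append(f"state is {port.get('State')}, not Active")
    if port.get("Physical state") != "LinkUp":
        problems.append(f"physical state is {port.get('Physical state')}, not LinkUp")
    rate = port.get("Rate")
    if rate is not None and float(rate) < expected_rate:
        problems.append(f"rate {rate} is below expected {expected_rate}")
    return problems


port1 = parse_ibstat(SAMPLE)["mlx5_1"][1]
for problem in link_problems(port1, expected_rate=100):
    print(problem)
```

In practice the text would come from running `ibstat` rather than from an embedded string.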
Colin
April 27, 2021, 2:52am
3
For some reason I had to run mlxconfig to explicitly set the link type to Ethernet before I could get a 100Gbps data rate. If I set the link type back to InfiniBand, it drops to 10Gbps again. Is this how the Mellanox card/software behaves by design?
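Editor's sketch: the post does not show the exact mlxconfig invocation, so the following is only an illustration of how the link type is commonly switched. It assumes the `LINK_TYPE_P1` parameter with 1 for InfiniBand and 2 for Ethernet, and the device path is a placeholder that will differ per system. The helper only builds the command and does not run it.

```python
import shlex

# mlxconfig encodes the port link type as a number: 1 = InfiniBand, 2 = Ethernet.
LINK_TYPES = {"ib": 1, "eth": 2}


def link_type_command(device, link_type, port=1):
    """Build (but do not execute) an mlxconfig command that sets a port's link type."""
    value = LINK_TYPES[link_type]
    return ["mlxconfig", "-d", device, "set", f"LINK_TYPE_P{port}={value}"]


# Placeholder device path (assumption); use the device name reported on your own host.
DEVICE = "/dev/mst/mt4119_pciconf0"

print(shlex.join(link_type_command(DEVICE, "eth")))
print(shlex.join(link_type_command(DEVICE, "ib")))
```

A link type change made this way normally takes effect only after a reboot or firmware reset.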
MvB
April 27, 2021, 11:03pm
4
Hello Colin,
Thank you for posting your inquiry on the NVIDIA Networking Community.
Based on your updates, you were able to start the Subnet Manager (SM), which is needed in every IB fabric.
When two nodes are connected back-to-back, you need to run the SM on one node only. Since you mentioned you were able to get a rate of 100GbE, our suspicion is that you are using an Ethernet cable. To achieve 100Gb/s IB, you need to use the correct cable. You can find the list of supported IB cables in the release notes (RN) of the firmware on the adapter → https://docs.mellanox.com/display/ConnectX5Firmwarev16301004/Firmware+Compatible+Products#FirmwareCompatibleProducts-ValidatedandSupportedEDR/100Gb/sCables
Thank you and regards,
~NVIDIA Networking Technical Support
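Editor's sketch: for context on the "Rate: 10" reading, an InfiniBand rate is the link width (number of lanes) multiplied by the per-lane speed. The small Python sketch below lists which width and speed combinations would produce a given rate, using nominal per-lane figures. The thread does not say which combination the link actually negotiated, so the candidates are possibilities only, and the table and function name are illustrative.

```python
# Nominal per-lane speeds in Gb/s for common InfiniBand generations.
LANE_GBPS = {"SDR": 2.5, "DDR": 5, "QDR": 10, "FDR": 14, "EDR": 25, "HDR": 50}

# Link widths considered in this sketch.
WIDTHS = (1, 4, 8, 12)


def candidates(rate):
    """Return (width, speed_name) pairs whose product equals the reported rate."""
    return [
        (width, name)
        for width in WIDTHS
        for name, gbps in LANE_GBPS.items()
        if width * gbps == rate
    ]


for rate in (10, 100):
    options = ", ".join(f"{w}x {name}" for w, name in candidates(rate))
    print(f"Rate {rate}: {options}")
```

With these tables, a rate of 100 matches only a 4x EDR link, while a rate of 10 matches a much slower combination such as 4x SDR, which is consistent with the link falling back to a low speed over an unsupported cable.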
Colin
April 27, 2021, 11:57pm
5
Thank you Martijn. I will get one of the IB cables and try again.
Colin
April 30, 2021, 8:11pm
6
Yes, you're right, it was the cable. Once I changed to an IB cable, the rate became 100. Thanks!