Hi,
I have one server (CentOS 7.9) with two InfiniBand ConnectX-7 single-port adapters (MCX75310AAS-NEAT) and one QM9790 IB switch. The port of each CX7 adapter has a 400G OSFP112 DR4 optical module. One port of the switch has an 800G OSFP DR8 optical module. Two single-mode fibers connect the 800G module to the two 400G modules. OFED driver version: MLNX_OFED_LINUX-5.9-0.5.6.0-rhel7.9-x86_64.
After starting OpenSM on the server, the ibstat command shows the port on each of the two adapters in the Down state, and the mlxlink command reports "Datapath failed to reach DPactivated after apply".
[root@localhost admin]# ibstat
CA 'mlx5_0'
CA type:MT4129
Number of ports: 1
Firmware version: 28.36.1010
Hardware version:0
Node GUID:0xa088c2030004a0e8
System image GUID:0xa088c2030004a0e8
Port 1:
State:Down
Physical state: Polling
Rate:10
Base lid:65535
LMC:0
SM lid:0
Capability mask: 0xa651e84a
Port GUID: 0xa088c2030004a0e8
Link layer:InfiniBand
CA 'mlx5_1'
CA type:MT4129
Number of ports: 1
Firmware version: 28.36.1010
Hardware version: 0
Node GUID:0xa088c2030004a050
System image GUID:0xa088c2030004a050
Port 1:
State: Down
Physical state: Polling
Rate:10
Base lid:65535
LMC:0
SM lid:0
Capability mask: 0xa651e848
Port GUID: 0xa088c2030004a050
Link layer: InfiniBand
[root@localhost admin]#
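For anyone who wants to check the same thing on several adapters without reading the output by eye, here is a minimal sketch that parses ibstat text like the output above and lists the CAs whose port is not Active. The function names (`parse_ibstat`, `ports_not_active`) are my own, and it assumes one port per CA and the `Key: value` layout shown above; it is not part of any Mellanox tool.

```python
import re


def parse_ibstat(text):
    """Return {ca_name: {field: value}} from ibstat-style text.

    Assumption: each CA has a single port, so the port fields can be
    stored alongside the CA fields without clashing.
    """
    cas = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.match(r"CA '([^']+)'", line)
        if header:
            current = cas.setdefault(header.group(1), {})
            continue
        if current is None or ":" not in line:
            continue  # shell prompts, blank lines, anything before the first CA
        key, _, value = line.partition(":")
        current[key.strip().lower()] = value.strip()
    return cas


def ports_not_active(text):
    """Names of CAs whose port state is anything other than 'Active'."""
    return sorted(
        name
        for name, fields in parse_ibstat(text).items()
        if fields.get("state") != "Active"
    )


if __name__ == "__main__":
    import subprocess

    # Runs the real ibstat binary; requires it to be installed and on PATH.
    output = subprocess.run(
        ["ibstat"], capture_output=True, text=True, check=True
    ).stdout
    for name in ports_not_active(output):
        fields = parse_ibstat(output)[name]
        print(name, fields.get("state"), fields.get("physical state"))
```

On this server it would be expected to list both mlx5_0 and mlx5_1, since both ports report State: Down.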
[root@localhost admin]# mlxlink -d mlx5_0 -m
Operational Info
----------------
State : Polling
Physical state : ETH_AN_FSM_ENABLE
Speed : N/A
width : N/A
FEC : N/A
Loopback Mode : No Loopback
Auto Negotiation : ON
Supported Info
--------------
Enabled Link Speed :0x00000080 (NDR)
Supported Cable Speed :0x00000080(NDR)
Troubleshooting Info
--------------------
Status Opcode : 1048
Group Opcode : MNG FN
Recommendation : Datapath failed to reach DPactivated after apply
Tool Information
----------------
Firmware Version : 28.36.1010
amBER Version : 2.09
MFT Version : mft 4.23.0-104
Module Info
-----------
Identifier : OSFP
Compliance : ,400GBASE-DR4
Cable Technology : 1310 nm DFB
Cable Type : Optical Module (separated)
OUI : Other
Vendor Name : HYPERSILICON
Vendor Part Number : ZRSL-OSFP-400DR4
Vendor Serial Number : N/A
Rev : A
For each 400G module, the TX and RX optical power (TXP and RXP) are between -2 dBm and 0 dBm, and the Data Path state of each lane is DPactivated.
What could be wrong? How can I locate and fix the problem?
Thanks,
Lang