CX5 25G cards not linking up with Cisco ACI randomly

CX5 25G cards with firmware 16.33.1048 randomly fail to link up with Cisco ACI.

The issue is that ports get into the [Close port] state and are not released, even after a reboot or after shut / no shut on the switch ports. Other commands such as mlxlink -d <device> -a DN / mlxlink -d <device> -a UP and OS-level ifup / ifdown also fail to bring the interface down or up. It stays stuck in the Close port state.

What does [State: Close port] mean, and how can we get out of this state?

mlxlink -d 0000:xxxxx

Operational Info

State : Close port

Physical state : ETH_AN_FSM_ENABLE
Speed : N/A
Width : N/A
FEC : N/A
Loopback Mode : No Loopback
Auto Negotiation : ON

Supported Info

Enabled Link Speed : 0x38007013 (25G,10G,1G)
Supported Cable Speed : 0x38007013 (25G,10G,1G)
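
For anyone who needs to check several hosts for this condition, here is a minimal sketch of how the output above could be parsed. It assumes mlxlink prints plain "Key : Value" lines exactly as pasted here (no colour codes), and the function names parse_mlxlink and is_stuck_closed are made up for illustration; they are not part of any NVIDIA tool.

```python
def parse_mlxlink(text):
    """Parse 'Key : Value' lines from mlxlink text output into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values containing colons survive.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def is_stuck_closed(fields):
    """True when the port reports the 'Close port' state described in this post."""
    return fields.get("State", "").lower() == "close port"


# Sample taken from the output pasted above (hypothetical capture file contents).
sample = """\
Operational Info

State : Close port
Physical state : ETH_AN_FSM_ENABLE
Speed : N/A
Auto Negotiation : ON
"""

fields = parse_mlxlink(sample)
print(is_stuck_closed(fields))
```

Section headings such as "Operational Info" contain no colon, so they are skipped rather than treated as fields.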

Thanks

Hello joe.maina4u,

Thank you for posting your query on our community. The port can be down due to various reasons. Please try the below troubleshooting steps and see if it resolves the issue:

Please let us know if it helps resolve the issue.

Thanks,
-Nvidia Network Support

I have had the exact same problem (not sure about the port state) with different CX 25G adapters (OEM HPE) for over a year now, mostly or only on Linux. Only removing power completely from the server solved it in some cases. After months of trial and error we got feedback from Cisco, who pointed us to the following Nvidia document.

https://enterprise-support.nvidia.com/s/article/3rd-Party-Switches-Link-Is-Down-Due-to-Auto-Negotiation

The network team has now changed the DFE delay, and we have to see whether the problem reappears. But as our experience with different types of Mellanox adapters has been really bad over the last 18 months, we will not use any of them in new servers.

Thanks