Hardware Details
The model number of the card is MCX653106A-ECAT-SP.
Here’s the output from lspci | grep -i mellanox:
ca:00.0 Ethernet controller: Mellanox Technologies MT28908 Family [ConnectX-6]
ca:00.1 Ethernet controller: Mellanox Technologies MT28908 Family [ConnectX-6]
Firmware Details
The firmware details, obtained using mstflint:
[root@localhost ~]# mstflint -d ca:00.0 q
Image type: FS4
FW Version: 20.43.1014
FW Release Date: 7.11.2024
Product Version: 20.43.1014
Rom Info: type=UEFI version=14.36.16 cpu=AMD64,AARCH64
type=PXE version=3.7.500 cpu=AMD64
Description: UID GuidsNumber
Base GUID: a088c20300aae43d 8
Base MAC: a088c2aae43d 8
Image VSD: N/A
Device VSD: N/A
PSID: MT_0000000224
Security Attributes: N/A
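In case it helps anyone compare firmware across several hosts, here is a minimal sketch that pulls the "Key: value" fields out of query output like the one above. The function name parse_mstflint_query is my own (hypothetical), and it assumes the output keeps the plain "Key: value" layout shown here; continuation lines such as the second Rom Info entry are simply skipped.

```python
import re

# Matches lines such as "FW Version: 20.43.1014".
# Shell prompt lines (starting with "[") and continuation lines without
# a "Key:" prefix do not match and are ignored.
FIELD_RE = re.compile(r"^([A-Za-z][A-Za-z ]*):\s*(.*)$")


def parse_mstflint_query(text):
    """Return a dict of the 'Key: value' fields found in the query output."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line.strip())
        if match is None:
            continue
        key = match.group(1).strip()
        # Keep the first occurrence of each key.
        fields.setdefault(key, match.group(2).strip())
    return fields


if __name__ == "__main__":
    sample = """[root@localhost ~]# mstflint -d ca:00.0 q
Image type: FS4
FW Version: 20.43.1014
PSID: MT_0000000224
"""
    info = parse_mstflint_query(sample)
    print(info["FW Version"], info["PSID"])
```

The FW Version and PSID fields are usually the two worth quoting when asking about a firmware issue.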
**Error Message**
Line 7140: [Thu May 15 16:08:59.475 2025] [ 6.160172] ERST: Error Record Serialization Table (ERST) support is initialized.
Line 7881: [Thu May 15 16:09:04.297 2025] [ 36.132204] mlx5_core 0000:ca:00.0: print_health_info:431:(pid 0): Health issue observed, firmware internal error, severity(3) ERROR:
Line 7894: [Thu May 15 16:09:04.554 2025] [ 36.132337] mlx5_core 0000:ca:00.0: print_health_info:445:(pid 0): severity 3 (ERROR)
Line 7896: [Thu May 15 16:09:04.554 2025] [ 36.132359] mlx5_core 0000:ca:00.0: print_health_info:447:(pid 0): synd 0x1: firmware internal error
Line 7900: [Thu May 15 16:09:04.571 2025] [ 36.773200] mlx5_core 0000:ca:00.1: print_health_info:431:(pid 0): Health issue observed, firmware internal error, severity(3) ERROR:
Line 7913: [Thu May 15 16:09:04.586 2025] [ 36.773302] mlx5_core 0000:ca:00.1: print_health_info:445:(pid 0): severity 3 (ERROR)
Line 7915: [Thu May 15 16:09:04.586 2025] [ 36.773320] mlx5_core 0000:ca:00.1: print_health_info:447:(pid 0): synd 0x1: firmware internal error
Line 10810: [Thu May 15 16:11:12.147 2025] [ 6.290428] ERST: Error Record Serialization Table (ERST) support is initialized.
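For reference, a rough sketch of how the health syndrome lines can be pulled out of a saved console or dmesg log. It assumes the line layout seen in the excerpt above (mlx5_core, PCI address, print_health_info, then "synd 0x..: description"); the function name find_health_syndromes is hypothetical, and other kernel versions may format these messages differently.

```python
import re

# Matches e.g.
# "mlx5_core 0000:ca:00.0: print_health_info:447:(pid 0): synd 0x1: firmware internal error"
SYND_RE = re.compile(
    r"mlx5_core (\S+): print_health_info:\d+:\(pid \d+\): "
    r"synd (0x[0-9a-fA-F]+): (.*)"
)


def find_health_syndromes(log_text):
    """Return a list of (pci_address, syndrome, description) tuples."""
    events = []
    for line in log_text.splitlines():
        match = SYND_RE.search(line)
        if match is None:
            continue
        pci_address = match.group(1)
        syndrome = int(match.group(2), 16)
        description = match.group(3).strip()
        events.append((pci_address, syndrome, description))
    return events


if __name__ == "__main__":
    sample = (
        "[ 36.132359] mlx5_core 0000:ca:00.0: print_health_info:447:(pid 0): "
        "synd 0x1: firmware internal error\n"
    )
    for event in find_health_syndromes(sample):
        print(event)
```

Run against the log above, this would report one syndrome entry per port (ca:00.0 and ca:00.1), both with synd 0x1.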