mlx4_core 0000:02:00.0: command 0x24 failed: fw status = 0x30

Hi

I'm getting “mlx4_core 0000:02:00.0: command 0x24 failed: fw status = 0x30” in dmesg on the IB nodes.

Does anyone know what “command 0x24” refers to?

Thank you

Eric

The timeout for “command 0x24 (go bit not cleared)” is for a MAD operation, and it is issued by firmware.

This “go bit” error indicates that the firmware is trying to execute a MAD command while its queue is full or hung. Upgrading the NIC's firmware to the latest compatible version usually resolves the issue.
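
If it helps to see how often this shows up across nodes, here is a minimal Python sketch for picking these lines out of dmesg output. The function name parse_mlx4_failure and the one-entry opcode table are my own illustration, not part of any Mellanox tool; the only mapping included is 0x24, which is MLX4_CMD_MAD_IFC in the kernel header include/linux/mlx4/cmd.h as far as I know. The fw status value is extracted but not interpreted.

```python
import re

# Partial, illustrative opcode table (an assumption for this sketch).
# 0x24 corresponds to MLX4_CMD_MAD_IFC in include/linux/mlx4/cmd.h.
MLX4_OPCODES = {0x24: "MAD_IFC"}

# Matches lines like:
#   mlx4_core 0000:02:00.0: command 0x24 failed: fw status = 0x30
LINE_RE = re.compile(
    r"mlx4_core (?P<pci>[0-9a-fA-F:.]+): "
    r"command (?P<op>0x[0-9a-fA-F]+) failed: "
    r"fw status = (?P<status>0x[0-9a-fA-F]+)"
)


def parse_mlx4_failure(line):
    """Return a dict describing an mlx4 command failure, or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    opcode = int(m.group("op"), 16)
    return {
        "pci": m.group("pci"),
        "opcode": opcode,
        "opcode_name": MLX4_OPCODES.get(opcode, "unknown"),
        "fw_status": int(m.group("status"), 16),
    }


if __name__ == "__main__":
    sample = "mlx4_core 0000:02:00.0: command 0x24 failed: fw status = 0x30"
    print(parse_mlx4_failure(sample))
```

Feeding it the saved output of dmesg line by line gives the PCI address and opcode per failure, which makes it easier to tell whether one adapter or all of them are affected before and after a firmware upgrade.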