I own two HP 649281-B21 / Mellanox MCX354A-FCBT cards and run them on two MSI Z490 mainboards.
Both boards provide PCIe 3.0 x4 slots via the Intel PCH chipset.
One system runs Server 2019 Essentials and the other runs Windows 10 Pro.
Both machines work so far, but I get sporadic problems at system start, where the cards show error code 43 in Device Manager.
I would say about 10 starts are good and 1 fails. After a reboot the problem is gone and the cards work very well again. The problem occurs on both systems in the same way.
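With a failure that shows up in roughly 1 of 11 starts, a handful of clean boots after a change proves very little. The following is a minimal sketch of the arithmetic, assuming each boot fails independently with the same probability (an assumption, not something measured here); the function name `clean_boots_needed` is made up for illustration.

```python
import math

def clean_boots_needed(p_fail, alpha=0.05):
    """Smallest number of consecutive clean boots whose chance of occurring
    by luck alone (with the fault still present) is at most alpha.

    Assumes independent boots with a constant failure probability p_fail.
    """
    return math.ceil(math.log(alpha) / math.log(1.0 - p_fail))

# "10 starts good and 1 fail" is taken here as a failure rate of 1/11.
print(clean_boots_needed(1 / 11))  # 32
```

So under these assumptions it takes about 32 clean starts in a row before a change can reasonably be credited with fixing the problem.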
I am using the latest mainboard BIOS, HPE firmware 2.42 and the latest Mellanox drivers (Server 5.50.54000 / Client 5.50.53000). I have set both cards to HCA port type “ETH”.
Windows shows these errors in the event log when the problem occurs:
Native_3_0_0: Execution of FW command failed. op 0xfff, status 0x1, errno -5, token 0xffff, in_modifier 0x100, op_modifier 0, in_param 229ff000.
Native_3_0_0: MAP_FA command failed with error -5. The adapter card is non-functional. Most likely a FW problem. Please burn the last FW and restart the mlx4_bus driver.
Native_3_0_0: Driver startup failed because the hca could not be initialized.
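To compare failed boots across both machines, it can help to pull the numeric fields out of the first event line. This is a minimal Python sketch, not a Mellanox tool: the helper names are made up, the bare `in_param` value is assumed to be hexadecimal, and the mapping of errno -5 to EIO assumes the driver uses POSIX-style errno numbers.

```python
import errno
import re

# Field names as they appear in the event text; longer names come first so
# "op_modifier" is not cut short by the bare "op" alternative.
FIELD_RE = re.compile(
    r"\b(op_modifier|in_modifier|in_param|op|status|errno|token)\s+"
    r"(-?(?:0x)?[0-9a-fA-F]+)"
)

def parse_fw_failure(line):
    """Return the numeric fields of a 'FW command failed' event as a dict."""
    fields = {}
    for name, raw in FIELD_RE.findall(line):
        if raw.lower().startswith("0x") or name == "in_param":
            # Assumption: in_param is printed as bare hex (it contains 'ff').
            fields[name] = int(raw, 16)
        else:
            fields[name] = int(raw, 10)
    return fields

def errno_name(code):
    """Map a negative errno to its POSIX name (assumes POSIX-style numbering)."""
    return errno.errorcode.get(abs(code), "unknown")

line = ("Native_3_0_0: Execution of FW command failed. op 0xfff, status 0x1, "
        "errno -5, token 0xffff, in_modifier 0x100, op_modifier 0, "
        "in_param 229ff000.")
info = parse_fw_failure(line)
print(info["op"], info["status"], errno_name(info["errno"]))
```

If the same op, status and errno appear on every failed boot on both machines, that points to one common cause rather than two unrelated faults.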
Since the problem occurs only at boot-up, I suspect an issue with the BIOS PCIe assignment or with driver/chipset communication at the OS level.
I am not very experienced with firmware tweaking on Mellanox cards. Maybe there are options to adapt the firmware before flashing it to the card.
Might it help to disable features that the Z490 boards do not support (for example “SR-IOV”)?
I was already using the latest HPE firmware 2.42 (the link you posted) when the problem occurred.
I switched to “fw-ConnectX3-rel-2_42_5000-MCX354A-FCB_A2-A5-FlexBoot-3.4.752.bin” to rule out a problem with the OEM firmware.
With the Mellanox firmware the behavior is 100% the same.
It runs well for a couple of reboots, then the card fails to initialize again with error code 43 in Device Manager. Another reboot solves the issue again. This happens on server and client in the same way. I am using the latest drivers on both machines.
It seems SR-IOV is not the cause.
After burning firmware I always reset the mlxconfig settings.
The two firmware images differ only slightly in their defaults (yes, I set the link type to ETH because no IB is used in this mesh).
You can disable SR-IOV from a command-line window with the following syntax:
c:\Program Files\Mellanox\WinMFT> mst.exe status
c:\Program Files\Mellanox\WinMFT> mlxconfig.exe -d <device> set SRIOV_EN=0
After this you need to reboot the node.
If this does not resolve the issue, please reach out to HPE Support as the adapters you have are HPE Mellanox OEM adapters. HPE maintains the f/w for those adapters.
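When the same change has to be applied on both machines, the commands above can be assembled in a small script. This is only a sketch: it builds the command lines and prints them without running anything. The device name `mt4099_pciconf0` is a placeholder assumption (use whatever `mst.exe status` reports), and the helper names are made up. It adds an `mlxconfig` `query` before the `set` so the current values are on record first.

```python
import subprocess

# Install path as shown in the prompts above.
MLXCONFIG = r"c:\Program Files\Mellanox\WinMFT\mlxconfig.exe"

def mlxconfig_cmd(device, *args):
    """Build an argument list of the form: mlxconfig.exe -d <device> <args...>."""
    return [MLXCONFIG, "-d", device, *args]

def disable_sriov_cmds(device):
    """Commands to record the current settings, then turn SR-IOV off."""
    return [
        mlxconfig_cmd(device, "query"),
        mlxconfig_cmd(device, "set", "SRIOV_EN=0"),
    ]

# Placeholder device name; replace it with the one listed by mst.exe status.
for cmd in disable_sriov_cmds("mt4099_pciconf0"):
    # list2cmdline only quotes the arguments for display; nothing is executed.
    print(subprocess.list2cmdline(cmd))
```

The printed lines can be pasted into an elevated command prompt; a reboot is still required afterwards.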
Hello Martijn, I am just looking for a way to tweak the card firmware for better compatibility with my Z490 hardware. SR-IOV is just an example.
In the event log I also found these recurring messages:
Mellanox ConnectX-3 VPI (MT04099) Network Adapter (PCI bus 3, device 0, function 0): SR-IOV cannot be enabled because FW does not support SR-IOV. In order to resolve this issue please re-burn FW, having added parameters related to SR-IOV support.
Native_3_0_0: EXT_QP_MAX_RETRY_LIMIT/EXT_QP_MAX_RETRY_PERIOD registry keys were requested by user but FW does not support this feature. Please upgrade your firmware to support it. For more details, please refer to WinOF User Manual.
I am not sure whether this could be the reason that the cards sporadically fail to initialize…
Unfortunately, we cannot test every system board that comes on the market. Based on all the tests you did, and since you are on the latest driver and f/w, we have to assume that this is more likely a compatibility problem with the system board. Any chance you are able to test with a system board from a different manufacturer? Also, I would recommend flashing the adapter back to the original OEM PSID, as running non-OEM firmware voids warranty and support.