Windows failed to initialise Mellanox MCX354A-FCBT

Hello all,

I own two HP 649281-B21 / Mellanox MCX354A-FCBT cards and run them on two MSI Z490 mainboards.

Both boards provide PCIe 3.0 x4 slots via the Intel PCH chipset.

One system runs Server 2019 Essentials and the other runs Windows 10 Pro.

Both machines work so far, but I get sporadic problems at system start, when the cards show error code 43 in Device Manager.

I would say roughly 10 starts are good and 1 fails. After a reboot the problem is gone and the cards work very well again. The problem occurs on both systems in the same way.
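With a failure this rare, a handful of clean boots after a change proves little. A rough sketch in Python of how many consecutive clean boots would be needed before believing a fix, assuming each boot fails independently with the same probability (an assumption, not something measured here) and taking the "10 good, 1 fail" estimate as a failure rate of 1 in 11:

```python
import math

def clean_boots_needed(failure_rate, confidence=0.95):
    """Smallest n such that n clean boots in a row would happen by chance
    with probability at most (1 - confidence) if the fault were still present.
    Assumes independent boots with a constant failure rate (hypothetical model)."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - failure_rate))

# "10 starts good and 1 fail" read as roughly 1 failure in 11 boots.
rate = 1 / 11
print(clean_boots_needed(rate))  # number of clean boots for 95% confidence
```

Under those assumptions this comes out to about 32 clean boots, which is why "it ran well for a couple of reboots" cannot confirm or rule out a change.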

I am using the latest mainboard BIOS, HPE firmware 2.42 and the latest Mellanox drivers (Server 5.50.54000 / Client 5.50.53000). I have set the HCA port type of both cards to “ETH”.

Windows shows me these errors in the event log when the problem occurs:

  • Native_3_0_0: Execution of FW command failed. op 0xfff, status 0x1, errno -5, token 0xffff, in_modifier 0x100, op_modifier 0, in_param 229ff000.

  • Native_3_0_0: MAP_FA command failed with error -5. The adapter card is non-functional. Most likely a FW problem. Please burn the last FW and restart the mlx4_bus driver.

  • Native_3_0_0: Driver startup failed because the hca could not be initialized.
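The first message packs several fields into one line. A small Python sketch for pulling them apart when comparing events across boots; it assumes the driver reports Linux-style negative errno values (an assumption based on the mlx4 naming, not confirmed in this thread), under which -5 would correspond to EIO, a generic I/O error:

```python
import errno
import re

# Copied verbatim from the event log entry quoted above.
LOG_LINE = ("Native_3_0_0: Execution of FW command failed. op 0xfff, status 0x1, "
            "errno -5, token 0xffff, in_modifier 0x100, op_modifier 0, "
            "in_param 229ff000.")

def parse_fw_failure(line):
    """Return the 'name value' pairs that follow 'failed.' as a dict of strings."""
    _, _, tail = line.partition("failed.")
    pairs = re.findall(r"(\w+)\s+(-?(?:0x)?[0-9a-fA-F]+)", tail)
    return dict(pairs)

def errno_name(value):
    """Map a negative errno string to its symbolic name (assumed Linux convention)."""
    return errno.errorcode.get(abs(int(value)), "unknown")

fields = parse_fw_failure(LOG_LINE)
print(fields["op"], fields["status"], fields["errno"], errno_name(fields["errno"]))
```

The field names are taken from the log text itself; nothing here decodes what op 0xfff or status 0x1 mean inside the firmware.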

Since the problem only occurs at boot, I suspect an issue with the BIOS PCIe assignment or with driver/chipset communication at OS level.

I am not very experienced with firmware tweaking on Mellanox cards. Maybe there are options that can be adapted in the firmware before flashing it to the card.

Maybe it would help to disable some features that are not supported on the Z490 boards (for example “SR-IOV”)?

@Martijn van Breugel​

I was already using the latest HPE firmware 2.42 (the link you posted) when the problem occurred.

I swapped to “fw-ConnectX3-rel-2_42_5000-MCX354A-FCB_A2-A5-FlexBoot-3.4.752.bin” to make sure the OEM firmware is not the problem.

Now, with the Mellanox firmware, the behavior is 100% the same.

It runs well for a couple of reboots, then the card fails to initialise again with error code 43 in Device Manager. Another reboot solves the issue again. This happens on server and client in the same way. I am using the latest drivers on both machines.

It seems SR-IOV is not the reason for that.

After burning firmware I always reset the mlxconfig settings.

The two firmware images differ only slightly in their defaults (yes, I set the link type to ETH because no IB is used in this mesh):

fw-ConnectX3-rel-2_42_5000-649281-B21_Bx-CLP-8025-FlexBoot-3.4.752_windows

Device type:    ConnectX3
Device:         mt4099_pciconf0
Configurations: Next Boot
  SRIOV_EN      True(1)
  NUM_OF_VFS    16
  LINK_TYPE_P1  ETH(2)
  LINK_TYPE_P2  ETH(2)
  LOG_BAR_SIZE  5

fw-ConnectX3-rel-2_42_5000-MCX354A-FCB_A2-A5-FlexBoot-3.4.752.bin

Device type:    ConnectX3
Device:         mt4099_pciconf0
Configurations: Next Boot
  SRIOV_EN      False(0)
  NUM_OF_VFS    8
  LINK_TYPE_P1  ETH(2)
  LINK_TYPE_P2  ETH(2)
  LOG_BAR_SIZE  3
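To make the difference between the two dumps explicit, here is a minimal Python sketch that compares them. The values are copied from the two listings above; the helper names `parse_config` and `diff_configs` are made up for illustration and are not part of any Mellanox tool:

```python
HPE_DEFAULTS = """\
SRIOV_EN True(1)
NUM_OF_VFS 16
LINK_TYPE_P1 ETH(2)
LINK_TYPE_P2 ETH(2)
LOG_BAR_SIZE 5
"""

MLNX_DEFAULTS = """\
SRIOV_EN False(0)
NUM_OF_VFS 8
LINK_TYPE_P1 ETH(2)
LINK_TYPE_P2 ETH(2)
LOG_BAR_SIZE 3
"""

def parse_config(text):
    """Turn 'NAME VALUE' lines into a dict, skipping anything else."""
    config = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            config[parts[0]] = parts[1]
    return config

def diff_configs(a, b):
    """Return {name: (value_in_a, value_in_b)} for parameters that differ."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}

changes = diff_configs(parse_config(HPE_DEFAULTS), parse_config(MLNX_DEFAULTS))
for name, (hpe, mlnx) in changes.items():
    print(f"{name}: HPE={hpe} Mellanox={mlnx}")
```

Only SRIOV_EN, NUM_OF_VFS and LOG_BAR_SIZE differ; both link types are ETH in either image.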

Windows and the Intel system drivers are all up to date. The Mellanox driver settings are at their defaults… I have no more ideas for getting this solved 😥

Maybe it points to an incompatibility with the Z490 hardware.

I wonder why the problem only appears sporadically…

Hello Max,

Thank you for posting your inquiry on the NVIDIA Networking Community.

Based on the information provided, you want to disable SR-IOV on the adapter as it is not supported on the system board.

For disabling SR-IOV on the adapter, make sure you install the latest MFT Tools for Windows → https://www.mellanox.com/downloads/MFT/WinMFT_x64_4_16_1_9.exe

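From a command-line window you can disable SR-IOV with the following syntax, using the device name reported by mst.exe status (mt4099_pciconf0 in the output posted above):

c:\Program Files\Mellanox\WinMFT> mst.exe status

c:\Program Files\Mellanox\WinMFT> mlxconfig.exe -d mt4099_pciconf0 set SRIOV_EN=0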

After this you need to reboot the node.
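When the same change has to be applied on both machines, the command line can be assembled programmatically. The following Python sketch only builds the argument list and does not run anything; the helper name `mlxconfig_cmd` is hypothetical, and the device name mt4099_pciconf0 is taken from the mlxconfig output posted earlier in this thread and may differ on another system:

```python
def mlxconfig_cmd(device, action, settings=None, exe="mlxconfig.exe"):
    """Build an argument list for mlxconfig (hypothetical helper).

    action is one of 'query', 'set' or 'reset'; settings is a dict of
    parameter names to values and is only used with 'set'.
    """
    if action not in ("query", "set", "reset"):
        raise ValueError(f"unsupported action: {action}")
    argv = [exe, "-d", device, action]
    if action == "set":
        if not settings:
            raise ValueError("'set' needs at least one PARAM=VALUE pair")
        argv += [f"{name}={value}" for name, value in settings.items()]
    return argv

# Check the current values first, then disable SR-IOV.
print(" ".join(mlxconfig_cmd("mt4099_pciconf0", "query")))
print(" ".join(mlxconfig_cmd("mt4099_pciconf0", "set", {"SRIOV_EN": 0})))
```

The resulting list could be passed to `subprocess.run` from an elevated prompt in the WinMFT directory. mlxconfig asks for confirmation before applying a change, and a reboot is still required afterwards.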

If this does not resolve the issue, please reach out to HPE Support as the adapters you have are HPE Mellanox OEM adapters. HPE maintains the f/w for those adapters.

Thank you and regards,

~NVIDIA Networking Technical Support

Hello Martijn, I am just looking for a way to tweak the card firmware for better compatibility with my Z490 hardware. SR-IOV is just an example.

In the event log I found these recurring messages:

  • Mellanox ConnectX-3 VPI (MT04099) Network Adapter (PCI bus 3, device 0, function 0): SR-IOV cannot be enabled because FW does not support SR-IOV. In order to resolve this issue please re-burn FW, having added parameters related to SR-IOV support.

  • Native_3_0_0: EXT_QP_MAX_RETRY_LIMIT/EXT_QP_MAX_RETRY_PERIOD registry keys were requested by user but FW does not support this feature. Please upgrade your firmware to support it. For more details, please refer to WinOF User Manual.

I am not sure whether this could be the reason that the cards sporadically fail to initialise…

@Max Mayer​

Changing the PSID on the adapter will void any support and warranty. Why can you not burn the adapter with the HP f/w? → https://downloads.hpe.com/pub/softlib2/software1/pubsw-linux/p1465926780/v147811/fw-ConnectX3-rel-2_42_5000-649281-B21_Bx-CLP-8025-FlexBoot-3.4.752.tgz The f/w should have SR-IOV enabled by default. Otherwise you have the option to go into the FlexBoot menu and enable SR-IOV (change Virtualization Mode to on).

But I would not change the PSID of the adapter. It will not benefit you.

Thank you,

~Martijn

@Max Mayer​

Unfortunately, we cannot test every system board that comes on the market. Based on all the tests you did, and given that you are on the latest driver and f/w, we have to assume that this is more of a compatibility problem with the system board. Any chance you are able to test with a system board from a different manufacturer? Also, I would recommend flashing the adapter back to the original OEM PSID, as changing it voids warranty and support.

Thank you,

~Martijn

Hi Martijn, understood. Yes, it seems to be a compatibility issue, and I hope that MSI will release a new UEFI BIOS and Intel chipset driver soon…

In my case I have 2 MSI Z490 boards with the same problem.

The initialisation issue (error code 43 in Device Manager) is not unique to me.

I also found users reporting the same problem with MCX311A-XCAT cards on Windows.

Let’s see what the mainboard manufacturer updates in the future.

But thanks for the support, Martijn…