MT27710 Family ConnectX-4 Lx usually fails to start (Error Code 10, WinOF-2)

Windows 10 22H2: the MT27710 fails to start with Error Code 10. The failure is intermittent, and the workaround is to reboot until it succeeds. The WinOF-2 driver installation includes plenty of tools, but I am not familiar with how to debug this condition. Is there a driver startup log?
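
On the startup log question, one place to look is the Windows System event log, filtered by the driver's event provider. The sketch below only builds and runs a `wevtutil` query. The provider name `mlx5` is an assumption about what WinOF-2 registers and should be confirmed in Event Viewer, and the helper name `build_event_query` is hypothetical.

```python
import subprocess
import sys


def build_event_query(provider="mlx5", count=50):
    """Build a wevtutil command listing the newest System-log events for one provider.

    'mlx5' is an assumed provider name; confirm it in Event Viewer.
    """
    xpath = f"*[System[Provider[@Name='{provider}']]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",   # XPath filter on the event provider
        f"/c:{count}",   # maximum number of events to return
        "/rd:true",      # newest events first
        "/f:text",       # human-readable output
    ]


if __name__ == "__main__" and sys.platform == "win32":
    # Only meaningful on Windows, ideally right after a failed (Code 10) boot.
    result = subprocess.run(build_event_query(), capture_output=True, text=True)
    print(result.stdout or result.stderr)
```

Running this right after a failed start and again after a good one, then comparing the two outputs, may show where initialization stops.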

Booting Ubuntu 24.10 LTS from flash, the mlx5_core driver starts consistently without issue on the same hardware, with no BIOS changes of any kind. The board is an ASRock Z690 PG Riptide with an Intel i5-12600K.

Driver Version : 3.10.25798.0
Firmware Version : 14.32.1010
Port Number : 1
Bus Type : PCI-E 8.0 GT/s x4
Link Speed : 25.0 Gbps/Full Duplex
Part Number : 0PP10R
Device Id : 4117
Revision Id : 0
Current MAC Address : 24-8A-07-XX-XX-XX
Permanent MAC Address : 24-8A-07-XX-XX-XX
Network Status : Connected
Adapter Friendly Name : Ethernet
Port Type : ETH
IPv4 Address #1 : 192.168.1.10
IPv6 Address #1 : fe80::XXXX:XXXX:XXXX:XXXX%7

Looks like we’re in hard-to-diagnose mode. Are the Mellanox BAR features known to conflict with the motherboard's Clever Access Memory (Resizable BAR) setting? I'm grasping at straws, but hopefully this will trigger a thought from someone.

It seems clear that the Ubuntu driver is handling this situation better, for whatever reason, than the Windows 10 22H2 driver.

"mlxconfig -d mt4117_pciconf0 query" shows a few BAR configuration parameters:

  • NON_PREFETCHABLE_PF_BAR
  • PF_LOG_BAR_SIZE
  • VF_LOG_BAR_SIZE
  • MEMIC_BAR_SIZE
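
To compare these settings between a failing and a working boot, the query output can be filtered down to just the BAR parameters. This is a minimal sketch that assumes each setting is printed as a name followed by its value on one line. The helper name `extract_bar_params` is hypothetical, and the sample text uses made-up placeholder values, not values read from this card.

```python
BAR_PARAMS = {
    "NON_PREFETCHABLE_PF_BAR",
    "PF_LOG_BAR_SIZE",
    "VF_LOG_BAR_SIZE",
    "MEMIC_BAR_SIZE",
}


def extract_bar_params(query_output):
    """Return {name: value} for the BAR parameters found in mlxconfig query text.

    Assumes each setting is printed as 'NAME<whitespace>VALUE' on its own line.
    """
    found = {}
    for line in query_output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] in BAR_PARAMS:
            found[parts[0]] = parts[1]
    return found


# Placeholder text in the assumed layout; the values are made up, not read from this card.
SAMPLE = """
Configurations:                      Next Boot
         MEMIC_BAR_SIZE              0
         NON_PREFETCHABLE_PF_BAR     False(0)
         PF_LOG_BAR_SIZE             5
         VF_LOG_BAR_SIZE             1
         SRIOV_EN                    False(0)
"""

print(extract_bar_params(SAMPLE))
```

In practice the text passed in would be the saved output of the mlxconfig query above rather than the placeholder sample.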

Driver Version : 3.10.25798.0
Firmware Version : 14.32.1010

I suggest first installing the latest NVIDIA WinOF-2 driver and ConnectX-4 Lx firmware versions.

WinOF-2 v25.1.50020 - WinOF-2 / WinOF Drivers
Firmware v14.32.1900 - Firmware for ConnectX®-4 Lx EN

Will monitor effects of this change. Thank you for the suggestion.

C:\Program Files\Mellanox\WinMFT\windows_x64>mlxup.exe

Querying Mellanox devices firmware ...
Device Type:      ConnectX4LX
Part Number:      MCX4111A-ACA_Ax
Description:      ConnectX-4 Lx EN network interface card; 25GbE single-port SFP28; PCIe3.0 x8; ROHS R6
PSID:             MT_2410110034
PCI Device Name:  mt4117_pciconf0

Versions:         Current        Available
FW                14.32.1900     14.32.1010
PXE               3.6.0502       3.6.0502
UEFI              14.25.0017     14.25.0017

Status:           Up to date

Continuing to monitor and gather statistics on the failure rate. The current, limited sample indicates about 20% success, which translates to “boot one to five times” before the card initializes and is productive. Once it is up, the automated nightly backup to central storage functions as expected and then shuts the computer down. The process repeats.
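
As a rough check on what a 20% success rate means in reboots, the arithmetic can be sketched as below. It assumes every boot is an independent attempt with the same success probability, which may not hold if the failure depends on warm versus cold starts. The function names are hypothetical.

```python
def expected_boots(p_success):
    """Mean number of boots until the first successful start (geometric distribution)."""
    return 1.0 / p_success


def prob_up_within(n_boots, p_success):
    """Probability that at least one of n independent boots succeeds."""
    return 1.0 - (1.0 - p_success) ** n_boots


p = 0.20  # success rate reported above, from a limited sample
print(f"expected boots until success: {expected_boots(p):.1f}")
for n in (1, 3, 5, 10):
    print(f"P(card up within {n} boots) = {prob_up_within(n, p):.2f}")
```

Under that independence assumption, five boots is the average rather than the upper bound: after five attempts there is still about a one-in-three chance (0.8^5 ≈ 0.33) that the card has not started.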