Mlx4eth63.sys crashes with BSOD on AlderLake-CPU system

I got my hands on two Mellanox ConnectX-3 MCX354A cards. One works fine in an EPYC 7302 system running Windows Server 2019, but the second one BSODs a PC with an i5-13600K running Windows 10 when I try to change Port Protocols in Device Manager. Is there anything I can do to get rid of the BSODs?

Here’s the minidump and info on the Mellanox card:
031324-36156-01.zip (299.9 KB)
image

Try disabling Hyper-Threading in the BIOS menu.
The same problem on my PC was solved by disabling HT.

Well, disabling the E-cores completely works too, but that’s not a good solution. Something is definitely wrong with the Windows version of that driver. Some forum threads (on ChipHell) suggest that it’s some weird NUMA awareness issue, since 8-core CPUs like the 13700K/12900K worked OK with that same driver. The NIC works OK in InfiniBand mode, at least.

Hello DROZD02,

Thank you for posting your inquiry on the NVIDIA Developer Forum - Infrastructure and Networking - Section.

Based on the screenshot, the firmware on the adapter is not aligned with the driver version.

We recommend downloading the latest WinOF driver and the latest available firmware version for the ConnectX-3 adapter you are using → Firmware for ConnectX®-3 IB/VPI

Be aware that this adapter has been EOL and EOS for a while, and we recommend upgrading to a more recent model, for example ConnectX-6 or ConnectX-7.

Thank you and regards,
~NVIDIA Networking Technical Support
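
As a side note on the firmware/driver alignment point above: dotted firmware versions are easy to misjudge when compared as plain strings. The following is a minimal Python sketch, not an NVIDIA tool, that compares two version strings numerically. The function names and the sample values are assumptions for illustration only; the real installed version comes from the adapter's information tab, and the required version from the driver release notes.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.42.5000' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def firmware_is_current(installed, required):
    """Return True when the installed firmware is at least the required version."""
    # Tuples compare element by element, so 2.9.x sorts below 2.42.x,
    # which a plain string comparison would get wrong.
    return parse_version(installed) >= parse_version(required)


if __name__ == "__main__":
    # Hypothetical sample values, not taken from this thread.
    installed = "2.40.7000"
    required = "2.42.5000"
    print(firmware_is_current(installed, required))
```

Note that matching firmware is only a precondition here; it does not by itself rule out a driver bug.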

Hello, MvB!
Since the original post I have already updated the firmware on both cards. Nothing changed.
Currently the card in the PC looks like this:
image

Updating firmware did not resolve the issue.

@DROZD02 Make sure your operating system is fully updated with the latest patches, bug fixes, and security updates. Sometimes BSOD issues can be resolved by installing the latest updates from Microsoft.

Windows was and still is updated to the latest available version:
image
No new updates pending.
As I said before, the issue occurs only if I try to interact with anything related to Ethernet on the Mellanox card, which in turn accesses the faulty mlx4eth63.sys driver, the one that is (afaik) supplied with the WinOF package.
The Linux driver package for that same card on that same processor currently works. Only the Windows one doesn’t.

Just ran into the same problem. Thanks for your solution!

Well, no actual solution came up here in the end. Disabling Ethernet is merely a workaround. Basically, just get a newer Mellanox card, at the very least a ConnectX-4.