Hi
Here are the steps to resolve this issue:
1. Check if the Device is Recognized by the System
First, verify the card is detected at the PCI level:
lspci | grep -i mellanox
# or
lspci | grep -i nvidia
2. Check Driver Status
Ensure the Mellanox drivers are loaded:
lsmod | grep mlx
If no drivers are loaded, try loading them:
modprobe mlx5_core
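As a rough sketch, steps 1 and 2 can be combined into one check. The function names below (has_mellanox_pci, has_mlx_module, check_card) are made up for illustration, and the script only matches the string "mellanox", so a card that lspci lists only under an NVIDIA name would need the pattern adjusted:

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any NVIDIA tool.

# Reads `lspci` output on stdin; succeeds if any line mentions Mellanox.
has_mellanox_pci() {
    grep -qi 'mellanox'
}

# Reads `lsmod` output on stdin; succeeds if a module name starts with "mlx".
has_mlx_module() {
    grep -q '^mlx'
}

# Runs both checks against the live system and prints a short summary.
check_card() {
    if lspci | has_mellanox_pci; then
        echo "PCI: device detected"
    else
        echo "PCI: no Mellanox device found"
        return 1
    fi
    if lsmod | has_mlx_module; then
        echo "driver: mlx module loaded"
    else
        echo "driver: no mlx module loaded, try: modprobe mlx5_core"
        return 2
    fi
}
```

Calling check_card after sourcing the file prints one line per check. A return code of 1 points at the PCI level (step 1) and 2 at the driver (step 2).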
3. Try Using mst (Mellanox Software Tools)
Start the MST service and query the device:
mst start
mst status
Then query the firmware, using the device path that mst status reports:
flint -d /dev/mst/mt<device_id>_pciconf0 query
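If you are not sure which device name to use, a loop like the one below can query every pciconf0 node that mst start created. This is only a sketch: the function name query_mst_devices is made up, and it assumes the nodes live under /dev/mst and that the script runs as root after mst start.

```shell
#!/bin/sh
# Sketch only: query_mst_devices is a hypothetical helper.
# Assumes `mst start` has already populated the device directory.

query_mst_devices() {
    dir="${1:-/dev/mst}"
    for dev in "$dir"/*_pciconf0; do
        # Skip the literal pattern when the glob matches nothing.
        [ -e "$dev" ] || continue
        echo "== $dev =="
        flint -d "$dev" query
    done
}
```

Running query_mst_devices with no argument checks /dev/mst. If it prints nothing, no MST device node was created, which is worth mentioning to support.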
4. Check for Physical Issues
- Ensure the card is properly seated in the PCIe slot
- Try a different PCIe slot (preferably x16)
- Check for any visible damage on the card
- Verify adequate power supply to the card
If none of the above works, the card may have a hardware failure or severely corrupted firmware requiring RMA. You should contact NVIDIA support with:
- Output of lspci -vvv -s 01:00.0
- Output of dmesg | grep -i mlx
- The card’s serial number
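To save some back and forth with support, the two command outputs can be gathered into one file. The following is a minimal sketch: collect_diag and the output file name are made up for illustration, and 01:00.0 should be replaced with the address your card shows in lspci. The serial number still has to be added by hand.

```shell
#!/bin/sh
# Sketch only: collect_diag is a hypothetical helper.
# Usage: collect_diag <pci-address> <output-file>

collect_diag() {
    pci_addr="$1"
    out="$2"
    {
        echo "=== lspci -vvv -s $pci_addr ==="
        lspci -vvv -s "$pci_addr" 2>&1
        echo "=== dmesg | grep -i mlx ==="
        # Note the case where the kernel log has no mlx lines at all.
        dmesg 2>&1 | grep -i mlx || echo "(no mlx lines in dmesg)"
    } > "$out"
    echo "wrote $out"
}
```

For example, collect_diag 01:00.0 mlx-diag.txt run as root writes both outputs to mlx-diag.txt, which can then be attached to the support request.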
