Installation and firmware update problems

Dear All,

I am new to InfiniBand (IB). While installing the InfiniBand drivers on a blade server, I ran into a severe problem with an MHGH28-XTC card (PSID MT_04A0110002).

  1. Install Scientific Linux X86_64 with default packages.

  2. $ lspci | grep InfiniBand

81:00.0 InfiniBand: Mellanox Technologies MT25418 [ConnectX IB DDR] (rev a0)

  3. Follow the OFED installation instructions.

[root@localhost MLNX_OFED_LINUX-1.5.3-3.1.0-rhel6.3-x86_64]# ./mlnxofedinstall --all

Device (81:00.0):

81:00.0 InfiniBand: Mellanox Technologies MT25418 [ConnectX VPI PCIe 2.0 2.5GT/s - IB DDR / 10GigE] (rev a0)

Link Width: 8x

PCI Link Speed: 2.5Gb/s

Installation finished successfully.

-E- Can not open /dev/mst/mt25418_pci_cr0: MFE_CR_ERROR

-E- Can not open /dev/mst/mt25418_pci_cr0: MFE_CR_ERROR

-E- Can not open /dev/mst/mt25418_pci_cr0: MFE_CR_ERROR

There is no firmware found for /dev/mst/mt25418_pci_cr0.

Configuring /etc/security/limits.conf.

Please reboot your system for the changes to take effect.

  4. Reboot the blade.


No IB devices found

There were also some other errors: “corrupted device ID 0xffffffff” and “HW/PCI access problem”.

  5. Download the firmware and update it manually.

mstflint -d 81:00.0 -i fw-25408-2_9_1000-MHGH28-XTC_A2-A3.bin burn

This reported success. (Running mlxburn -dev /dev/mst/mt25418_pci_cr0 instead fails with the error: device ID 0xffffffff.)

  6. Reboot the blade.

lspci | grep InfiniBand

Nothing is returned! I cannot find the hardware any more!
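One thing that may be worth trying before concluding that the card has vanished from the bus is to filter on the PCI vendor ID rather than on the word "InfiniBand". Mellanox devices use vendor ID 15b3, and a card whose firmware is damaged may enumerate under a different class or name, in which case grep InfiniBand would miss it. A minimal sketch follows; the helper name mellanox_lines is made up for illustration, and whether this particular card still enumerates at all is not something this thread establishes.

```shell
#!/bin/sh
# Mellanox's PCI vendor ID. Matching on it does not depend on the
# class string ("InfiniBand") or device name that lspci prints.
MLNX_VENDOR_ID="15b3"

# Hypothetical helper: reads `lspci -nn` style output on stdin and keeps
# only lines whose [vendor:device] pair starts with the Mellanox vendor ID.
mellanox_lines() {
    grep -i -F "[${MLNX_VENDOR_ID}:"
}

# Usage on the blade (requires pciutils; not run by this snippet):
#   lspci -nn | mellanox_lines
#   lspci -d 15b3:          # same idea, using lspci's own vendor filter
```

If either command prints a Mellanox line that does not say InfiniBand, the card is still on the bus and that line is useful detail to include in a support ticket.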

Please give me some advice on this problem. Many thanks for your time and help.

Best Regards,


Hi Sang,

I am not sure what happened to your card, but it sounds like the process misidentified it and bricked the card as a result.

But don’t lose hope. I know that the folks at Mellanox support can bring some of these cards back to life.

Please open a support ticket with Mellanox support (by email or on the web) and somebody will give you a hand.

Good luck!

Dear yairi,

I successfully upgraded the firmware (at least the program told me so), and the identification did exist. I do not understand why it reported "corrupted device ID 0xffffffff". I thought it was because the firmware was too old to the

However, it is a brick now.

The website asked for the serial number of the product, and I am afraid I cannot provide it now since I am not the buyer of the blade server. I have sent the email and hope there will be a response.

Thank you very much for your help.



2013/3/10 yairi:

Mellanox Interconnect Community, reply from yairi in InfiniBand/VPI Adapter Cards: Installation and firmware update problems

Hi Sang,

Did you get the cards working?

  • Justin

Dear Justinclift,

I am afraid not yet. Since one IB HCA on the blade server is now a brick, I think I should somehow reflash the default firmware to it. However, I cannot find a simple way to do this, such as a jumper on the HCA (an OEM device made by DAWNING) that would load the default firmware stored on the HCA’s flash. There are no IB devices in the PCI list now; lspci | grep InfiniBand returns nothing.

I tried installing other operating systems (Red Hat 6 and 5) on another node. However, they failed to load the IB driver.

The firmware version is 2.5.8, and I do not know whether I can use these IB HCAs with OFED 1.5 or Red Hat 6/5.

I have lost faith in the upgrade program. Losing one HCA is enough.

Best Regards



All the cards are still physically in the server, but the operating system does not agree. I have no idea how to fix the bricked node or how to install OFED on the other surviving nodes.

:(


2013/3/22 justinclift:

Mellanox Interconnect Community, reply from justinclift in InfiniBand/VPI Adapter Cards: Re: Installation and firmware update problems

As an idea, do you have a normal PC or non-blade server around, with a free PCIe slot (PCIe x16 or PCIe x8)?

If you do, then it might be a better idea to update the firmware in your other cards with that instead of in the blade server.

I can give you the exact instructions for updating the firmware in a non-blade server (using either RHEL or CentOS versions 6.3 or 6.4). I have very similar cards here. Also MHGH28-XTC, but a different hardware revision (mine are MT_04A0120002, so different firmware needed).

After the firmware is updated, you could then see if the cards work properly in the blade server with Scientific Linux.

Btw, with the PSID for your cards, are you reading it from the sticker on the back of the card or did you get it from somewhere else? Just wondering if you might have gotten it from the wrong place, and therefore downloaded the wrong firmware. (unsure)
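On the wrong-firmware question: the PSID can also be read from the card itself and from the image file, and the two compared before burning, rather than relying on a sticker. A minimal sketch follows, assuming mstflint's q (query) and ri (read image) commands and that the query output contains a line beginning with PSID:. The helper names psid_from_query and psid_match and the file name backup.bin are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: extract the PSID value from `mstflint ... q` output
# supplied on stdin. Assumes a line of the form "PSID:   <value>".
psid_from_query() {
    awk '$1 == "PSID:" { print $2; exit }'
}

# Hypothetical helper: succeed only when both PSIDs are non-empty and equal.
psid_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Usage on a machine where the card is visible (needs root; not run here):
#   dev_psid=$(mstflint -d 81:00.0 q | psid_from_query)
#   img_psid=$(mstflint -i fw-25408-2_9_1000-MHGH28-XTC_A2-A3.bin q | psid_from_query)
#   mstflint -d 81:00.0 ri backup.bin     # keep a copy of the current flash image
#   psid_match "$dev_psid" "$img_psid" && \
#       mstflint -d 81:00.0 -i fw-25408-2_9_1000-MHGH28-XTC_A2-A3.bin burn
```

If the device query fails or returns no PSID, psid_match fails and the burn line is skipped, which is the safer default when the card cannot be identified.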

Hopefully this helps.