Mellanox Voltaire 4036 (VLT-3011): how to reflash the OS / reset the password

Hi,

I bought two Mellanox Voltaire 4036 VLT-3011 switches on eBay.

Both seem to have issues, and I need help diagnosing and fixing them if possible.

I have a serial console cable and it works with both units. I can reach the login prompts, see the U-Boot options, and access the U-Boot boot menu, but I can’t get past the login prompt on either unit.

The OS versions are below:

  1. 4036 Version 3.9.1 BUILD ID 985

  2. 4036 Version 3.6.2 BUILD ID 872

1- The first unit (1) does not let me log on as admin/123456, root/br6000 or guest/voltaire.

It accepts the admin and guest logins, but immediately after the login prompt it logs the user out, so I never get to the OS level.

I found during the boot process the following warnings:

Empty flash at 0x02eb8908 ends at 0x02eb9000:

Empty flash at 0x0613c8e8 ends at 0x0613d000
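To get a feel for how large the reported regions are, the addresses in these notices can be subtracted. The snippet below is only a minimal sketch: the regular expression is an assumption based on the two lines quoted above, and `empty_flash_regions` is a hypothetical helper name, not part of any Voltaire or Mellanox tool.

```python
import re

# Pattern assumed from the two boot-time notices quoted above.
EMPTY_FLASH = re.compile(
    r"Empty flash at (0x[0-9a-fA-F]+) ends at (0x[0-9a-fA-F]+)"
)


def empty_flash_regions(log_text):
    """Return (start, end, size_in_bytes) for every 'Empty flash' notice."""
    regions = []
    for match in EMPTY_FLASH.finditer(log_text):
        start = int(match.group(1), 16)
        end = int(match.group(2), 16)
        regions.append((start, end, end - start))
    return regions


# The two notices from the first unit's boot log.
sample = """\
Empty flash at 0x02eb8908 ends at 0x02eb9000:
Empty flash at 0x0613c8e8 ends at 0x0613d000
"""

for start, end, size in empty_flash_regions(sample):
    print(f"{start:#010x}..{end:#010x}: {size} bytes")
```

For the two notices above this gives 1784 and 1816 bytes, so each message covers less than 2 KB. That is only an observation about the numbers, not a diagnosis of what the flash contains.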

Immediately after logging in I get the following warning:

ibwarn: [896] mad_rpc_open_port: can’t open UMAD port ((null):0)

and then the user is logged out and the getty terminal is reset.

Does this mean there is no OS on the flash because it is empty, and that is why I cannot log in?

Can this switch even be used as an IB switch?

This unit lights up all the switch port LEDs and they stay on all the time!

2- The second unit (2) does not accept any admin/guest/root password and complains about an invalid login/password.

I have tried the reset command from the U-Boot menu and also pressed the reset switch on the Voltaire 4036 for 30 seconds. The switch reboots, but the password on this unit is not reset.

This unit does not light up the switch port LEDs, but they blink once during the reboot.

My questions:

  • Can I somehow reset the password on unit 2, which I cannot log in to?

  • Can I reload the OS on unit 1, which complains about empty flash and does not let me log in as admin?

  • Can these switches be used as IB switches, but without the management features?

  • Can the SM tools (spark etc.) be used to reflash the firmware remotely?

@yairi and justinclift have been helping others with upgrade/flashing/reset procedures for the 4036 and ISR 9024D-M.

@yairi and justinclift, can you help me analyse the situation with these two 4036 switches

and what I can do to gain access and start using them, or are they dead bricks?

inbusiness might also be interested in helping.

Since there is U-Boot, I’ve seen a TFTP option and flash_nfs:

  • tftpboot - boot image via network using TFTP protocol

  • Type run flash_nfs to mount root filesystem over NFS

Can I use the above to load an image via TFTP from U-Boot, or flash_nfs for the root filesystem?

I haven’t found any instructions or anyone succeeding in doing this. Can we try to work this out?

I’ve read these:

Upgrading 4036, media full Infrastructure & Networking - NVIDIA Developer Forums

voltaire 4036 cant connect to CLI Infrastructure & Networking - NVIDIA Developer Forums

Mellanox (old Voltaire) ISR9024D-M recover flash area Infrastructure & Networking - NVIDIA Developer Forums

Here somebody has tried TFTP / NFS but fails:

Voltaire/Mellanox 4036 firmware update | ServeTheHome Forums

I’ve attached some logfiles from the terminal sessions from both units as well as the uboot prompt menu commands.

Thanks for any suggestions

1boot.log.zip (4.77 KB)

2-boot.log.zip (3.84 KB)

uboot.log.zip (1.29 KB)

Sorry Yair, I unsubscribed from anything to do with Mellanox after the last one, so don’t have a more recent example.

No worries. I am still working with my team on that.

Thanks!

I am not sure what this red LED is, and I don’t think I can tell. Things could be blinking inside, but it’s been a while since I last opened such a box.

As for the PSU, as far as I remember it has a few LEDs. What does the label next to the LED say?

Working on it with my social media people.

Can you send me an example for a more recent one?

Personally, I no longer help out on these forums nor advise people to buy Mellanox gear, due to their phishing-like emails (which they’ve been warned about).

Good luck with your problem though.

:-( We need you Justin!

@yairi

I tried the password reset function from the U-Boot menu and it worked, but the first unit logs me out of the console after the password change, and even if I try to log in again it kills the login process.

So is something wrong with the OS?

The second unit now lets me log on to the console and it seems to be working properly.

So thanks for the help so far.

Is there any way for me to reinstall the OS on the first unit?

Here’s a log of what is happening on the first unit during password change and logon:

4036 Version 3.9.1 BUILD ID 985

Sun Sep 23 18:01:19 IST 2012

4036-28C4 login: admin

Password:

Welcome to Voltaire Switch 4036-28C4

Initial configuration: Please change the default root password

Changing password for root

Enter the new password (minimum of 5, maximum of 8 characters)

Please use a combination of upper and lower case letters and numbers.

Enter new password:

Re-enter new password:

Password changed.

ibwarn: [848] mad_rpc_open_port: can’t open UMAD port ((null):0)

process '-/sbind

4036 Version 3.9.1 BUILD ID 985

Sun Sep 23 18:01:19 IST 2012

4036-28C4 login:

Also during kernel load I saw these messages on first unit:

Loading module Voltaire

Empty flash at 0x02eb8908 ends at 0x02eb9000

Empty flash at 0x0613c8e8 ends at 0x0613d000

Empty flash at 0x08d1d504 ends at 0x08d1d800

These are not present in the working (second) unit.
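To make the comparison between the two units repeatable, the captured console logs can be diffed for known warning signatures. The following is a minimal sketch under stated assumptions: the signature patterns come only from the messages quoted in this thread, `warnings_in` and `only_in_first` are hypothetical helper names, and the second unit's sample log is a made-up placeholder, not real output.

```python
import re

# Signatures taken from the messages quoted in this thread; extend as needed.
SIGNATURES = [
    re.compile(r"Empty flash at 0x[0-9a-fA-F]+ ends at 0x[0-9a-fA-F]+"),
    re.compile(r"ibwarn: \[\d+\] (.*)"),
]


def warnings_in(log_text):
    """Collect normalized warning lines found in one boot log."""
    found = set()
    for line in log_text.splitlines():
        for signature in SIGNATURES:
            match = signature.search(line)
            if match:
                # For ibwarn lines keep only the message, dropping the
                # bracketed number, which differs between runs (896 vs 848).
                found.add(match.group(1) if match.groups() else match.group(0))
                break
    return found


def only_in_first(log_a, log_b):
    """Warnings that appear in log_a but not in log_b, sorted."""
    return sorted(warnings_in(log_a) - warnings_in(log_b))


unit1_log = """\
Loading module Voltaire
Empty flash at 0x02eb8908 ends at 0x02eb9000
ibwarn: [848] mad_rpc_open_port: can't open UMAD port ((null):0)
"""

# Placeholder for the working unit: assumed to contain no such warnings.
unit2_log = """\
Loading module Voltaire
"""

for warning in only_in_first(unit1_log, unit2_log):
    print(warning)
```

With real captures, the two sample strings would be replaced by the contents of the saved terminal sessions for each unit.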

Let me know if/when the phishing-like emails have stopped. Until then, no.

Thanks for your help.

I returned the defective item.

They are offering another one in exchange or full refund.

However, there is a red LED inside the case of the other unit that is blinking/turned on.

Is that alarming? Do you know whether it indicates an error condition?

Or is it just a power/status LED, or is it showing that there is no network connectivity?

Should I take it? They can only do power-on testing, not any other kind of test.

Otherwise, only the PSU LED is on in this other unit.

Hi Joe,

I wouldn’t get my hopes high on the first unit.

I might be wrong, but this is what I am thinking: the fact that all LEDs are on is probably because the switch ASIC is stuck in reset.

The UMAD port 0 error means that the OS can’t send management packets to the ASIC.

The empty flash errors may be related to the ASIC flash area. This is where the switch ASIC stores the FW. Maybe this part is bad.

All together, I think you have a bad switch here. I am not sure you will be able to recover it beyond what we tried.

Sorry I can’t help any further. Hope you enjoy your other switch unit :-)

justinclift, I hope you will change your mind this time. Please see your inbox; I have offered to pay for your help.

Hi,

Generally speaking, these switches can do the native IB switching operation even if the SW is in a bad state, unless there is something wrong with the switch ASIC or the software/HW (PCI) keeps resetting the ASIC.

About your “first switch”, the one with all lights turned on: to me it looks like it is stuck in some sort of reset sequence. What we want to try is the following:

  • Open the box and check that everything is connected: no loose components, wires, etc. If this is something you haven’t done before, you might want to call somebody to do it with you.
  • Set it back to factory defaults: since you have no access to the CLI or OS, there is a way to do it from the U-Boot menu.
    • Connect a console cable, power up the switch and, immediately after it starts, stop the boot sequence (^C or any key)

    • the uboot prompt should look like this “=>”

    • run the following commands:

      • imm 53 90.1

      • 01

      • enter

      • Ctrl+C

      • boot

    • Wait until the boot sequence is complete and you get the login prompt back. Log in with the default user/password (those you mentioned are fine).

Let’s hope that the HW is in good shape and that this helps. Other than that, I don’t think there is anything else that can be done.

About your “second switch”: the LED sequence is correct, so that is a good sign. Somebody probably set it up with a password. Go through the same factory reset procedure as above and start with a brand new configuration. Later on I suggest upgrading to the latest version, 3.9.1.

Good luck.

First unit - I checked inside the unit and there are no loose wires or anything.

All the port LEDs are lit on the system and I get the Empty flash messages during the reboot process,

and when I try to log in it logs me out, as it can’t run the shell, and I get a warning:

ibwarn: [848] mad_rpc_open_port: can’t open UMAD port ((null):0)

process '-/sbind