OpenSM Service NOT Installing - Windows Server 2016 DC - HELP!

I’m a Mellanox, networking and Infiniband newb, so please bear with me. I spent the last two days scouring this site and the Interweb for answers, but could not find them.

This is an unusual case, as this is not the standard installation. We’re running a Dell PE R730xd, 20 cores, 256GB, H730p for internal 24x SSD and an Areca 1883x for a 44-drive 352TB JBOD, all currently on Windows Server 2016 Datacenter (though we’re not using any DC features).

I’ve run both the Beta and current WinOF2 5.25 packages multiple times, being sure OpenSM is selected, and the install seems to go well – drivers are there, cards are detected and displaying in NW properties, but OpenSM is nowhere to be found in Services – it’s just not there… anywhere.

We’re running TWO Mellanox MCX353A-FCBT single-port 40/56 IPoIB cards in the server, WITHOUT a switch, i.e. DIRECT-connecting to two Windows 7 Pro desktop workstations. Each workstation has its own MCX353A-FCBT card installed and running, direct-connected to one of the MCX353A cards in the server.

Why, you may ask?

Each PC is a scan controller for two very high-end motion picture film scanners ($500K per) that push data at INCREDIBLE rates – ~30Gb/s PER scanner (5K resolution TIFF at 30 FPS, each frame is ~80MB, times TWO concurrently), so we’ve built our own storage that can ingest 3-4 of those streams concurrently and decided on these Mellanox boards for some unknown reason.
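As a rough sanity check of the figures above, here is a back-of-the-envelope sketch in Python. It assumes decimal megabytes (1 MB = 10^6 bytes) and counts payload only, with no protocol or file-system overhead. The function name `stream_gbps` is made up for illustration.

```python
def stream_gbps(frame_mb: float, fps: float) -> float:
    """Payload bandwidth of one image stream in gigabits per second.

    Assumes decimal units (1 MB = 10**6 bytes, 1 Gb = 10**9 bits)
    and ignores any network or storage overhead.
    """
    bits_per_frame = frame_mb * 10**6 * 8
    return bits_per_frame * fps / 10**9


# Figures quoted in the post: ~80 MB per frame at 30 FPS.
one_stream = stream_gbps(80, 30)
two_streams = 2 * one_stream

print(f"one stream:  {one_stream:.1f} Gb/s")   # 19.2 Gb/s
print(f"two streams: {two_streams:.1f} Gb/s")  # 38.4 Gb/s
```

Under those assumptions a single stream is about 19.2 Gb/s and two concurrent streams about 38.4 Gb/s, which brackets the ~30Gb/s quoted above. The real figure depends on the actual frame size and on overhead.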

The cabling seems to be working fine; we have green lights on all interconnected cards. The cables are 5 m QSFP+ AOCs.

Everything LOOKS like it should/will work fine, but obviously not without OpenSM. Is there any way to install the OpenSM service independently of the WinOF package? Is there a setting somewhere we’re missing? Is it 2016 Datacenter that is causing the problem?

Ultimately we intend to run multiple iSCSI volumes for each scanner to push to independently / concurrently. Each scanner will have its own port on the server. We need this operational ASAP, because we’re shipping this half-rack to InterBEE in Tokyo this Friday. (Yaaaaay!)

Thanks!

Hi Paul,

Not sure if you tried this already but:

http://www.mellanox.com/related-docs/prod_software/MLNX_VPI_WinOF_User_Manual_v5.25.pdf → section 3.2.2

Also, I want to draw your attention to the following in the Release Notes (http://www.mellanox.com/related-docs/prod_software/MLNX_VPI_WinOF_Release_Notes_Rev_5.25.pdf):

• Utilities:
  • OpenSM: InfiniBand Subnet Manager is provided as a sample code. The sample code is intended to allow users to test or bring-up the InfiniBand fabric without a management console / switch (to get started). For cluster production environments, Mellanox’s recommendation is to use a Managed Switch or the UFM-SDN Appliance.

Also, I was under the impression that the MCX353A-FCBT will run in ETHERNET mode, bypassing any requirement for Infiniband management – is that true? If so, I cannot see any way in the driver to select WHICH mode (IB/EN) to run in. Am I missing something, or am I misinformed?

I presume that would be the easiest solution.

Hi Eddie!

Thanks! I suppose it was too much for me to RTFM ;-)

At least it’s not as obtuse as the Mellanox Windows firmware update procedure. Who runs their software department? Rube Goldberg’s evil twin?!

That said, while we were able to get the Service registered, we are still NOT able to get it started (manually or automatically). We’re getting an “Error 1053” timeout.

How do we procure one of the recommended (in the manual) versions of OpenSM? (FabricIT EFM, UFM, or MLNX-OS)

Thanks for the help, it is GREATLY appreciated.