opensmd fails to initialize

Hello,

My setup is a ConnectX-4 Lx EN card with MOFED 4.7 on CentOS 7.7 (kernel-3.10.0-1062.1.2.el7.x86_64). The firmware update and the openibd restart both succeed, but starting the subnet manager fails (a screenshot is also attached):

sudo /etc/init.d/opensmd start

Starting opensmd (via systemctl): Job for opensmd.service failed because the control process exited with error code. See "systemctl status opensmd.service" and "journalctl -xe" for details.

[FAILED]

[root@ MLNX_OFED_LINUX-4.7-1.0.0.1-rhel7.7-x86_64]# systemctl status opensmd.service

● opensmd.service - LSB: Activates/Deactivates InfiniBand Subnet Manager

Loaded: loaded (/etc/rc.d/init.d/opensmd; bad; vendor preset: disabled)

Active: failed (Result: exit-code) since Mon 2019-10-07 17:10:03 UTC; 32s ago

Docs: man:systemd-sysv-generator(8)

Process: 200714 ExecStart=/etc/rc.d/init.d/opensmd start (code=exited, status=1/FAILURE)

Oct 07 17:09:56 systemd[1]: Starting LSB: Activates/Deactivates InfiniBand Subnet Manager…

Oct 07 17:09:56 OpenSM[200722]: /var/log/opensm.log log file opened

Oct 07 17:09:56 OpenSM[200722]: OpenSM 5.5.0.MLNX20190923.1c78385

Oct 07 17:10:03 opensmd[200714]: Starting IB Subnet Manager…[FAILED]

Oct 07 17:10:03 systemd[1]: opensmd.service: control process exited, code=exited status=1

Oct 07 17:10:03 systemd[1]: Failed to start LSB: Activates/Deactivates InfiniBand Subnet Manager.

Oct 07 17:10:03 systemd[1]: Unit opensmd.service entered failed state.

Oct 07 17:10:03 systemd[1]: opensmd.service failed.

I don’t understand why it is failing or whether something is missing. Would this affect the performance of CUDA/MPI jobs, which is the ultimate goal here?

Thanks,

Arturo

Hello Arturo,

Many thanks for posting your question on the Mellanox Community.

You are using a ConnectX-4 Lx EN Ethernet adapter. This adapter supports Ethernet fabrics only, not InfiniBand fabrics.

You are trying to start a Subnet Manager, which applies only to InfiniBand fabrics, so the process terminates because it cannot find a suitable InfiniBand adapter.

In this case, the driver and adapter are functioning as expected.
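
If you want to confirm which link layer each port reports, a minimal sketch along these lines may help. It assumes the usual Linux sysfs layout under /sys/class/infiniband, where each port has a link_layer file containing either "InfiniBand" or "Ethernet". The function names and the optional directory argument are illustrative only and are not part of MLNX_OFED.

```shell
#!/bin/sh
# Print the link layer of every RDMA port the kernel exposes.
# The optional first argument (a sysfs root) is a hypothetical override
# for testing; by default the real path /sys/class/infiniband is used.
list_link_layers() {
    root="${1:-/sys/class/infiniband}"
    for f in "$root"/*/ports/*/link_layer; do
        [ -r "$f" ] || continue   # glob did not match: no RDMA devices found
        port_dir=$(dirname "$f")                              # .../<dev>/ports/<n>
        port=$(basename "$port_dir")                          # <n>
        dev=$(basename "$(dirname "$(dirname "$port_dir")")") # <dev>
        printf '%s port %s: %s\n' "$dev" "$port" "$(cat "$f")"
    done
}

# Succeeds only if at least one port reports an InfiniBand link layer.
has_infiniband_port() {
    list_link_layers "$1" | grep -q ': InfiniBand$'
}

list_link_layers
if has_infiniband_port; then
    echo "InfiniBand port present: a subnet manager is relevant."
else
    echo "No InfiniBand port found: a subnet manager has nothing to manage."
fi
```

On an Ethernet-only adapter such as yours, every port listed should report Ethernet, which matches the opensmd failure you are seeing. The ibv_devinfo utility also prints a link_layer field per port if you prefer an existing tool.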

Many thanks,

~Mellanox Technical Support.

Hi Martijn,

That makes sense. Thanks.