Opensm was dead

  1. What happens to the InfiniBand network when opensm dies?
  2. I cannot find the server that opensm was, or is, running on.
    Since the uptime has been reset, it seems that all systems were rebooted, but I can’t find any history of opensm running in the syslog or messages log.
    If I run opensm on a different server than the one it was running on, will there be any problems,
    such as a momentary disconnection?

Hello jyh,

Thank you for posting your inquiry on the NVIDIA Developer Forum - Infrastructure and Networking - Section.

To answer your questions:

  • When no OpenSM is running in the fabric, the HCAs are not assigned a LID, and therefore the logical link stays down. No network traffic is possible in this state. OpenSM is the fabric's subnet manager: it assigns a unique LID to every device connected to the fabric (switches and HCAs) and, based on its configuration, creates the end-to-end routes between the LIDs.
  • You can run the ‘sminfo’ command on a node in the fabric to determine which node or switch hosts the master OpenSM.
  • You can run OpenSM on a different node; which instance acts as master is decided by the configured priority. You just need to make sure that opensm.conf and all the needed options are present on the other node.
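
If you want to script the ‘sminfo’ check, a minimal Python sketch could look like the one below. It assumes that ‘sminfo’ prints a single line of the form "sminfo: sm lid <n> sm guid <hex>, activity count <n> priority <n> state <n> <NAME>"; please verify this against the output of your installed infiniband-diags version. The SAMPLE line uses made-up values, and the helper names parse_sminfo and query_master_sm are hypothetical, not part of any NVIDIA tool.

```python
import re
import subprocess

# Illustrative line only (made-up values); the exact format is an assumption.
SAMPLE = ("sminfo: sm lid 1 sm guid 0x2c9030009e1a1, "
          "activity count 4242 priority 0 state 3 SMINFO_MASTER")

_PATTERN = re.compile(
    r"sm lid (?P<lid>\d+) sm guid (?P<guid>0x[0-9a-fA-F]+), "
    r"activity count (?P<activity>\d+) priority (?P<priority>\d+) "
    r"state (?P<state>\d+) (?P<state_name>\S+)"
)


def parse_sminfo(line):
    """Parse one sminfo output line into a dict, or return None if it does not match."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    info = match.groupdict()
    # Convert the numeric fields; guid and state_name stay as strings.
    for key in ("lid", "activity", "priority", "state"):
        info[key] = int(info[key])
    return info


def query_master_sm():
    """Run sminfo on this node; return None if it is missing, times out, or no SM answers."""
    try:
        result = subprocess.run(["sminfo"], capture_output=True, text=True, timeout=10)
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return None
    return parse_sminfo(result.stdout)


if __name__ == "__main__":
    info = parse_sminfo(SAMPLE)
    print("master SM LID:", info["lid"], "priority:", info["priority"])
```

On a fabric node you would call query_master_sm() instead of parsing SAMPLE. A result of None would suggest that no subnet manager answered, which matches the "no LID, logical link down" situation described above. The reported LID can then be mapped to a node name with your usual fabric tools.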

For more information, please review the OpenSM section of the MLNX_OFED driver User Manual → https://docs.nvidia.com/networking/display/MLNXOFEDv561033/OpenSM

Thank you and regards,
~NVIDIA Networking Technical Support
