MLNX OpenSM force sweeping every minute

waleed.khalid · August 19, 2024, 8:26am

Hi all,
We have a network of unmanaged switches ranging from NDR to EDR, with MLNX OpenSM as the subnet manager, running on an EDR node connected to an EDR switch. Currently we see the following in the opensm logs:

******************************************************************
*********************** HEAVY SWEEP START ************************
******************************************************************


Aug 19 10:24:50 910089 [2A2A76C0] 0x02 -> do_sweep: Entering heavy sweep with flags: force_heavy_sweep 1, coming out of standby 0, subnet initialization error 0, sm port change 0
Aug 19 10:24:50 940459 [2A2A76C0] 0x02 -> updn_lid_matrices: disabling UPDN algorithm, no root nodes were found
Aug 19 10:24:50 940501 [2A2A76C0] 0x01 -> ucast_mgr_route: ar_updn: cannot build lid matrices.
Aug 19 10:24:50 951336 [2A2A76C0] 0x02 -> osm_ucast_mgr_process: minhop tables configured on all switches
Aug 19 10:24:51 019910 [2A2A76C0] 0x02 -> SUBNET UP
Aug 19 10:24:51 389593 [709326C0] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:101 TID:0x0000388b00000080
Aug 19 10:24:52 493189 [639186C0] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:101 TID:0x0000388c00000080
Aug 19 10:24:53 596876 [7693E6C0] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:101 TID:0x0000388d00000080
Aug 19 10:24:54 700521 [699246C0] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:101 TID:0x0000388e00000080
Aug 19 10:24:57 297814 [6B9286C0] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:101 TID:0x0000388f00000080
Aug 19 10:24:58 401113 [6F9306C0] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:101 TID:0x0000389000000080
Aug 19 10:24:59 569785 [689226C0] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:101 TID:0x0000389100000080
Aug 19 10:25:01 025736 [2A2A76C0] 0x02 -> do_sweep:

This goes on constantly, and as a result, if the node goes down, our InfiniBand network crashes without the backup opensm coming up. LID 101 is one of our NDR switches, the details of which are:

mlxlink -d lid-101

Operational Info
----------------
State                              : Active 
Physical state                     : LinkUp 
Speed                              : IB-NDR 
Width                              : 4x 
FEC                                : Ethernet_Consortium_LL_50G_RS_FEC_PLR -(272,257+1) 
Loopback Mode                      : No Loopback 
Auto Negotiation                   : ON 

Supported Info
--------------
Enabled Link Speed                 : 0x000000f1 (NDR,HDR,EDR,FDR,SDR) 
Supported Cable Speed              : 0x000000f1 (NDR,HDR,EDR,FDR,SDR) 

Troubleshooting Info
--------------------
Status Opcode                      : 0 
Group Opcode                       : N/A 
Recommendation                     : No issue was observed 

Tool Information
----------------
Firmware Version                   : 31.2012.4036 
amBER Version                      : 3.2 
MFT Version                        : mft 4.28.0-92

Can someone help me troubleshoot this? It is causing quite a few issues within our cluster.

marlon1 · August 21, 2024, 9:02am

Hi,

Based on the output of the opensm logs, it looks like there is a flapping port.
You can run the command below a few times to see which switch port has an incrementing LinkDown counter.

for t in {1..<range_#>}; do echo "LID $t"; perfquery $t; done | grep -E '^LID|LinkDown'
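As a complementary check, here is a minimal sketch that tallies the trap lines already present in the opensm log, since each one names the LID that reported the link state change. The function name count_link_traps_by_lid is made up for illustration, it assumes the log_trap_info line format shown in the excerpt at the top of this thread, and the log path in the usage comment is an assumption (adjust it to wherever your opensm writes its log).

```shell
#!/usr/bin/env bash
# Tally "Link state change" (trap 128) notices per reporting LID.
# Reads OpenSM log lines on stdin; prints "<count> <LID>", busiest LID first.
# Hypothetical helper: assumes the log_trap_info format seen in this thread.
count_link_traps_by_lid() {
    grep -F 'num:128 (Link state change)' \
        | grep -oE 'from LID:[0-9]+' \
        | cut -d: -f2 \
        | sort -n \
        | uniq -c \
        | sort -rn \
        | awk '{ print $1, $2 }'
}

# Example (the log path is an assumption):
#   count_link_traps_by_lid < /var/log/opensm.log
```

A LID that dominates this list is a good first candidate to query with perfquery or mlxlink.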

waleed.khalid · August 21, 2024, 9:33am

Thank you, the issue was indeed a faulty connection, and we managed to figure out which port it was coming from. However, a secondary question remains: in this case, does opensm have to do a heavy sweep every time? Can this be avoided, perhaps with a static opensm config? Or will this always happen, meaning we need to keep a constant eye on the logs?

marlon1 · August 21, 2024, 9:43am

A heavy sweep happens every time there is a topology-related change in the fabric.
This means that every link flap triggers a heavy sweep.

Although it is not recommended, the option below in the opensm.conf file disables it.

# SWEEP OPTIONS
#
# The number of seconds between subnet sweeps (0 disables it)
sweep_interval 10

You can also change the option below to “FALSE”.

# If TRUE, every trap 128 and 144 will cause a heavy sweep.
# NOTE: successive identical traps (>10) are suppressed
sweep_on_trap TRUE
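To check what a given config file actually sets for these two options, a small sketch along these lines may help. The function name show_sweep_options is hypothetical, and it assumes the "key value" line layout shown above, with comment lines starting with #.

```shell
#!/usr/bin/env bash
# Print the active (uncommented) sweep settings from an OpenSM config file.
# Hypothetical helper: only looks at the two options quoted in this thread.
show_sweep_options() {
    local conf=$1
    awk '$1 == "sweep_interval" || $1 == "sweep_on_trap" { print $1, $2 }' "$conf"
}

# Example (the path is an assumption):
#   show_sweep_options /etc/opensm/opensm.conf
```

Commented-out lines are ignored because their first field is # rather than the option name.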

waleed.khalid · August 21, 2024, 10:05am

Dear Marlon,
Thank you for the reply. Where is the opensm.conf file generated? Is it generated automatically, or do we have to generate it ourselves?

ETA: Ah, figured it out, we have to create one. I believe having the sweep interval set to 10 should be enough.
So I am assuming that without an opensm.conf the sweep interval is set to infinity? Lastly, I assume we will have to restart the service to launch it with the new configuration, correct? I ask because when this flap happened, we definitely experienced more than 10 sweeps (we had not generated a config file and were running opensm out of the box as a daemon). It got to the point that it crashed the node it was running on, taking the network with it. We had to restart the node to fix the issue.

zhangsuo · August 29, 2024, 4:38am

You can generate the opensm configuration file with opensm -c /etc/opensm/opensm.conf
and load it with opensm -F /etc/opensm/opensm.conf (uppercase -F selects the config file; lowercase -f sets the log file).
After changing the configuration file, you can reload it with pkill -HUP opensm
Thanks,
Suo
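The edit step in between can be scripted. Below is a minimal sketch, assuming the config file uses one "key value" pair per line as in the excerpt quoted earlier in the thread; set_opensm_option is a made-up helper name, not part of opensm.

```shell
#!/usr/bin/env bash
# Set "<key> <value>" in an OpenSM-style config file.
# Replaces the existing uncommented line for the key, or appends one if none exists.
# Hypothetical helper, not part of opensm itself.
set_opensm_option() {
    local conf=$1 key=$2 value=$3 tmp rc
    tmp=$(mktemp) || return 1
    if awk -v k="$key" -v v="$value" '
            $1 == k { print k " " v; found = 1; next }
            { print }
            END { if (!found) print k " " v }
        ' "$conf" > "$tmp"; then
        cat "$tmp" > "$conf"   # overwrite contents, keeping the file's owner and mode
        rc=$?
    else
        rc=1
    fi
    rm -f "$tmp"
    return "$rc"
}

# Example, using the commands from this thread:
#   opensm -c /etc/opensm/opensm.conf
#   set_opensm_option /etc/opensm/opensm.conf sweep_on_trap FALSE
#   pkill -HUP opensm
```

Comment lines are left untouched, so the generated file keeps its documentation next to the changed value.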

waleed.khalid · August 29, 2024, 1:09pm

Thank you. Does this mean that if we run it without a configuration file, some “default” values are not loaded correctly? Also, since we want to run opensm as a daemon, would we do it like this?

/etc/init.d/opensmd -f /etc/opensm/opensm.conf

zhangsuo · August 30, 2024, 9:46am

No. If you don’t use a configuration file, opensm uses the default configuration,
which is the same as the file generated by opensm -c opensm.conf
Thanks,
Suo

waleed.khalid · August 30, 2024, 9:57am

So it was indeed an unintended consequence or a bug that our opensm was doing at least 3-5 heavy sweeps within 10-second intervals because the cable was faulty. We will set up monitoring for it in that case. Thank you for the help.
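One possible starting point for such monitoring, as a rough sketch only: count the "Entering heavy sweep" lines per minute in the opensm log and flag the minutes that exceed a limit. The function names, the limit, and the log path are all assumptions, and the parsing relies on the timestamp layout seen in the log excerpt at the top of this thread.

```shell
#!/usr/bin/env bash
# Count heavy sweep starts per minute.
# Reads OpenSM log lines on stdin; prints "<Mon> <DD> <HH:MM> <count>" in log order.
# Hypothetical helpers: they assume the do_sweep line format seen in this thread.
heavy_sweeps_per_minute() {
    awk '/do_sweep: Entering heavy sweep/ {
        minute = $1 " " $2 " " substr($3, 1, 5)
        if (!(minute in count)) order[++n] = minute
        count[minute]++
    }
    END { for (i = 1; i <= n; i++) print order[i], count[order[i]] }'
}

# Print only the minutes that saw more than <limit> heavy sweeps.
busy_minutes() {
    local limit=$1
    heavy_sweeps_per_minute | awk -v limit="$limit" '$NF > limit'
}

# Example (the log path and limit are assumptions):
#   busy_minutes 3 < /var/log/opensm.log
```

Any output from busy_minutes could then be forwarded to whatever alerting the cluster already uses.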
