MLNX OpenSM force sweeping every minute

Hi all,
We have a network of unmanaged switches ranging from NDR to EDR, with MLNX OpenSM as the subnet manager, running on an EDR node connected to an EDR switch. Currently, in the opensm logs we see:

******************************************************************
*********************** HEAVY SWEEP START ************************
******************************************************************


Aug 19 10:24:50 910089 [2A2A76C0] 0x02 -> do_sweep: Entering heavy sweep with flags: force_heavy_sweep 1, coming out of standby 0, subnet initialization error 0, sm port change 0
Aug 19 10:24:50 940459 [2A2A76C0] 0x02 -> updn_lid_matrices: disabling UPDN algorithm, no root nodes were found
Aug 19 10:24:50 940501 [2A2A76C0] 0x01 -> ucast_mgr_route: ar_updn: cannot build lid matrices.
Aug 19 10:24:50 951336 [2A2A76C0] 0x02 -> osm_ucast_mgr_process: minhop tables configured on all switches
Aug 19 10:24:51 019910 [2A2A76C0] 0x02 -> SUBNET UP
Aug 19 10:24:51 389593 [709326C0] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:101 TID:0x0000388b00000080
Aug 19 10:24:52 493189 [639186C0] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:101 TID:0x0000388c00000080
Aug 19 10:24:53 596876 [7693E6C0] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:101 TID:0x0000388d00000080
Aug 19 10:24:54 700521 [699246C0] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:101 TID:0x0000388e00000080
Aug 19 10:24:57 297814 [6B9286C0] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:101 TID:0x0000388f00000080
Aug 19 10:24:58 401113 [6F9306C0] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:101 TID:0x0000389000000080
Aug 19 10:24:59 569785 [689226C0] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:101 TID:0x0000389100000080
Aug 19 10:25:01 025736 [2A2A76C0] 0x02 -> do_sweep:


This goes on constantly, and as a result, if the node goes down, our InfiniBand network crashes without the backup opensm coming up. LID 101 is one of our NDR switches, the details of which are:

mlxlink -d lid-101

Operational Info
----------------
State                              : Active 
Physical state                     : LinkUp 
Speed                              : IB-NDR 
Width                              : 4x 
FEC                                : Ethernet_Consortium_LL_50G_RS_FEC_PLR -(272,257+1) 
Loopback Mode                      : No Loopback 
Auto Negotiation                   : ON 

Supported Info
--------------
Enabled Link Speed                 : 0x000000f1 (NDR,HDR,EDR,FDR,SDR) 
Supported Cable Speed              : 0x000000f1 (NDR,HDR,EDR,FDR,SDR) 

Troubleshooting Info
--------------------
Status Opcode                      : 0 
Group Opcode                       : N/A 
Recommendation                     : No issue was observed 

Tool Information
----------------
Firmware Version                   : 31.2012.4036 
amBER Version                      : 3.2 
MFT Version                        : mft 4.28.0-92 

Can someone help me troubleshoot this? It is causing quite a few issues within our cluster.

Hi,

Based on the output of the opensm logs, it looks like there is a flapping port.
You can run the below a few times to see which switch port has an incrementing LinkDown counter.

for t in {1..<range_#>}; do echo $t;perfquery $t;done | grep LinkDown
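To make "run it a few times" concrete, here is a minimal sketch that samples the counter twice and compares the values. The helper names `link_downed` and `compare_samples` are hypothetical, and the parsing assumes perfquery prints the counter as `LinkDownedCounter:` followed by a run of dots and the value; check this against the output on your own fabric.

```shell
# Extract the LinkDownedCounter value from perfquery output read on stdin.
# Assumption: the line looks like "LinkDownedCounter:...............3".
link_downed() {
    awk -F'[:.]+' '/^LinkDownedCounter/ { print $2 }'
}

# Print "flapping" if the counter grew between two samples, else "stable".
compare_samples() {
    before=$1
    after=$2
    if [ "$after" -gt "$before" ]; then
        echo "flapping"
    else
        echo "stable"
    fi
}

# Example use on a live fabric (lid and port are placeholders you supply):
#   before=$(perfquery "$lid" "$port" | link_downed)
#   sleep 30
#   after=$(perfquery "$lid" "$port" | link_downed)
#   compare_samples "$before" "$after"
```

A port that reports "flapping" across several samples is the one to inspect for a faulty cable or connector.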

Thank you, the issue was indeed a faulty connection; we managed to figure out which port it was coming from. However, a secondary question remains: does opensm have to do a heavy sweep every time in this case? Can this be avoided, perhaps with a static opensm config? Or will this always happen, so that we need to keep a constant eye on the logs?

Heavy sweep happens every time there is a change related to the topology in the fabric.
Meaning if there is a link flap there will be a heavy sweep.

Although it is not recommended, there is the below option in the opensm.conf file to disable it.

# SWEEP OPTIONS
#
# The number of seconds between subnet sweeps (0 disables it)
sweep_interval 10

You can also change the below to "FALSE".

# If TRUE, every trap 128 and 144 will cause a heavy sweep.
# NOTE: successive identical traps (>10) are suppressed
sweep_on_trap TRUE
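As an illustration of changing one of these options without hand-editing the file, here is a small sketch. `set_option` is a hypothetical helper, not part of OpenSM; it assumes each option sits on its own line as `key value`, as in the excerpt above. Try it on a copy of the file first.

```shell
# set_option FILE KEY VALUE
# Rewrite the line whose first word is KEY so that it reads "KEY VALUE".
# Comment lines (starting with '#') and all other options are left untouched.
set_option() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    awk -v k="$key" -v v="$value" \
        '$1 == k { print k " " v; next } { print }' "$file" > "$tmp" || return 1
    # Write back through the original file so its ownership and mode are kept.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example (path taken from this thread):
#   cp /etc/opensm/opensm.conf /tmp/opensm.conf.test
#   set_option /tmp/opensm.conf.test sweep_on_trap FALSE
```

As noted above, turning off sweep_on_trap is not recommended, so treat this as a way to experiment rather than a suggested setting.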

Dear Marlon,
Thank you for the reply. Where is the opensm.conf file generated? Is it generated automatically, or would we have to generate it ourselves?

ETA: Ah, figured it out, we have to create one. I believe having the sweep interval set to 10 should be enough.
So I am assuming that without an opensm.conf the sweep interval is set to infinity? Lastly, I am assuming we will have to restart the service to launch it with the new configuration, correct? I am asking because when this flap happened, we definitely experienced more than 10 sweeps (we had not generated a config file and were just running it out of the box as a daemon). It got to the point that it crashed the node it was running on, taking the network with it. We had to restart the node to fix the issue.

You can generate the opensm configuration file with opensm -c /etc/opensm/opensm.conf
And load the configuration file with opensm -F /etc/opensm/opensm.conf
After changing the configuration file, you can reload it with # pkill -HUP opensm
Thanks,
Suo

Thank you, but does this mean that if we run it without a configuration file, some "default" values are not loaded correctly? Also, in our case, since we want to run opensm as a daemon, do we do this as

/etc/init.d/opensmd -f /etc/opensm/opensm.conf

?

No, if you don't use a configuration file, the default configuration is used,
which is the same as the configuration file generated by <opensm -c opensm.conf>.
Thanks,
Suo

So it was indeed an unintended consequence, or a bug, that our opensm was doing at least 3-5 heavy sweeps within each 10-second interval because the cable was faulty. We will set up monitoring for it in that case. Thank you for the help.
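For the monitoring mentioned above, one possible starting point is to count link state change traps per source LID in the opensm log. This is only a sketch: `count_link_traps` is a hypothetical helper, it assumes log lines shaped like the ones quoted at the top of this thread, and the log path in the example is assumed to be the default /var/log/opensm.log.

```shell
# Count "Link state change" notices (trap 128) per source LID.
# Reads an OpenSM log on stdin and prints "<lid> <count>", busiest LID first.
count_link_traps() {
    awk '/log_trap_info/ && /num:128/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^LID:[0-9]+$/)
                n[substr($i, 5)]++      # strip the leading "LID:"
    }
    END { for (lid in n) print lid, n[lid] }' | sort -k2,2nr
}

# Example (log path is an assumption, adjust to your setup):
#   tail -n 5000 /var/log/opensm.log | count_link_traps
```

A LID whose count keeps climbing between runs points at a flapping link, which is what LID 101 showed in the log excerpt at the start of the thread.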