WRN NUE47: user requested maximum #VLs is larger than supported #VLs

Dear all,

I have been trying to connect six switches in an octahedron network topology. I modified the opensm.conf file, started opensm with it, and checked the log for errors.

My opensm.conf differs from the default opensm.conf in the following lines:

max_op_vls 8

routing_engine nue
avoid_throttled_links TRUE

nue_max_num_vls 8

qos TRUE

# QoS default options
qos_max_vls 8
qos_high_limit 6
qos_vlarb_high 0:4,1:4,2:4,3:192,4:16,5:32,6:64,7:128
qos_vlarb_low 0:64,1:64,2:64,3:64,4:64,5:64,6:64,7:64
qos_sl2vl 0,1,2,3,4,5,6,7,0,1,2,3,4,5,6,7
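
For readers following the thread, the arbitration entries above are VL:weight pairs, and qos_sl2vl lists one VL for each of the 16 SLs. Below is a minimal Python sketch that parses these strings and prints each VL's fraction of the total weight in its table. It assumes the weights act as relative shares within one table when every VL has traffic queued; the function names are made up for illustration and are not part of OpenSM.

```python
def parse_vlarb(entry):
    """Parse 'VL:weight,VL:weight,...' into a list of (vl, weight) pairs."""
    pairs = []
    for item in entry.split(","):
        vl, weight = item.split(":")
        pairs.append((int(vl), int(weight)))
    return pairs


def weight_shares(pairs):
    """Return {vl: fraction of the table's total weight} (assumed share model)."""
    total = sum(weight for _, weight in pairs)
    shares = {}
    for vl, weight in pairs:
        # Accumulate so a VL listed more than once is counted fully.
        shares[vl] = shares.get(vl, 0.0) + weight / total
    return shares


# Values copied from the opensm.conf lines in the post.
high = parse_vlarb("0:4,1:4,2:4,3:192,4:16,5:32,6:64,7:128")
low = parse_vlarb("0:64,1:64,2:64,3:64,4:64,5:64,6:64,7:64")
sl2vl = [int(v) for v in "0,1,2,3,4,5,6,7,0,1,2,3,4,5,6,7".split(",")]

high_shares = weight_shares(high)
low_shares = weight_shares(low)

for vl in sorted(high_shares):
    print(f"VL{vl}: high {high_shares[vl]:.3f}, low {low_shares[vl]:.3f}")

# Every SL should map to a VL that exists when 8 VLs (VL0-VL7) are in use.
print("SLs mapped:", len(sl2vl), "highest VL used:", max(sl2vl))
```

With these values the low-priority table is uniform, while the high-priority table gives VL3 the largest weight (192 of 444).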

However, the log contains warnings such as get_max_num_vls: WRN NUE47: user requested maximum #VLs is larger than supported #VLs.

Am I not supposed to set max_op_vls this large?

Many thanks!!

The warning is issued when nue_max_num_vls is larger than the PortInfo OpVLs of any port across the fabric.
The ibdiagnet output (db_csv file) holds the OpVLs configured on the fabric ports; you need to make sure every port supports at least the number of VLs you requested.
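
As a rough illustration of that check, here is a minimal Python sketch that flags ports supporting fewer VLs than requested. The code-to-count table follows the PortInfo OperationalVLs encoding in the InfiniBand specification (1: VL0, 2: VL0-1, 3: VL0-3, 4: VL0-7, 5: VL0-14). The port names and values are hypothetical, and the sketch does not assume anything about the layout of the db_csv file, so check whether your dump stores the encoded value or a plain count.

```python
# PortInfo OperationalVLs stores a code, not a VL count.
OP_VLS_CODE_TO_COUNT = {1: 1, 2: 2, 3: 4, 4: 8, 5: 15}


def ports_below_requested(op_vls_codes, requested_vls):
    """Return {port: vl_count} for ports that support fewer VLs than requested."""
    too_small = {}
    for port, code in op_vls_codes.items():
        count = OP_VLS_CODE_TO_COUNT[code]
        if count < requested_vls:
            too_small[port] = count
    return too_small


# Hypothetical fabric: these names and codes are made up for illustration.
example_ports = {"sw1/p1": 4, "sw1/p2": 3, "sw2/p1": 4}

offenders = ports_below_requested(example_ports, requested_vls=8)
print(offenders)
```

If max_op_vls in opensm.conf uses the same encoding (an assumption worth confirming in the OpenSM documentation), a value of 4 would correspond to 8 VLs.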

Have you tried reducing the number of VLs to see whether the warning still reproduces?

Regardless, the Nue protocol isn't maintained by NVIDIA. For issues with the protocol, you will need to contact its developer.

Thanks for your response!!

Yes, reducing max_op_vls to 4 works. We also tried max_op_vls 8 and max_op_vls 7, and neither works.

We are running an application with the following message size distribution:

| Message Sizes summary for all ranks
| Message size(B)       Volume(MB)        Volume(%)        Transfers        Time(sec)          Time(%)
                0             0.00             0.00      24399260211       4305559.54            87.02
                1           269.42             0.03        282512124        505425.38            10.21
              785            17.79             0.00            23760         17107.65             0.35
                3           292.57             0.03        102260551         15810.23             0.32
               19           715.04             0.08         39461907         14210.46             0.29
             8640            17.81             0.00             2161          5834.82             0.12

Our main objective is to learn how to tune the QoS settings to get better performance.
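
To make the table easier to reason about, here is a small Python sketch that adds up the Time(%) column for the listed rows, split at a message size threshold. The rows are copied from the table above; the 64-byte threshold is an arbitrary choice for illustration.

```python
# (message size in bytes, time in percent), copied from the summary table.
rows = [
    (0, 87.02),
    (1, 10.21),
    (785, 0.35),
    (3, 0.32),
    (19, 0.29),
    (8640, 0.12),
]

THRESHOLD_BYTES = 64  # arbitrary cut-off between "small" and "large" messages

small_time_pct = sum(pct for size, pct in rows if size <= THRESHOLD_BYTES)
large_time_pct = sum(pct for size, pct in rows if size > THRESHOLD_BYTES)

print(f"time in messages <= {THRESHOLD_BYTES} B: {small_time_pct:.2f}%")
print(f"time in messages >  {THRESHOLD_BYTES} B: {large_time_pct:.2f}%")
```

For the rows shown, messages of 19 bytes or fewer account for about 97.84% of the reported time, and the two larger sizes for about 0.47%.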