Hi community,
I am desperately looking for a sample configuration of a Cumulus switch for use with Azure HCI,
especially one showing how to use the QoS features and priority flow control (PFC).
Has anyone here already done this on Cumulus Linux? The switch I am using is an SN2100.
I am struggling to cover these requirements:
Cluster traffic class
This traffic class ensures that there’s enough bandwidth reserved for cluster heartbeats:
Required: Yes
PFC-enabled: No
Recommended traffic priority: Priority 7
Recommended bandwidth reservation:
10 GbE or lower RDMA networks = 2 percent
25 GbE or higher RDMA networks = 1 percent
RDMA traffic class
This traffic class ensures that there’s enough bandwidth reserved for lossless RDMA communications by using SMB Direct:
Required: Yes
PFC-enabled: Yes
Recommended traffic priority: Priority 3 or 4
Recommended bandwidth reservation: 50 percent
Default traffic class
This traffic class carries all other traffic not defined in the cluster or RDMA traffic classes, including VM traffic and management traffic:
Required: By default (no configuration necessary on the host)
Flow control (PFC)-enabled: No
Recommended traffic class: By default (Priority 0)
Recommended bandwidth reservation: By default (no host configuration required)
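To make the arithmetic behind these requirements concrete, here is a minimal Python sketch of the three classes for a 25 GbE or faster RDMA network. The dictionary layout and the function name build_plan are illustrative only and are not a switch or host API. Giving the default class the remaining 49 percent is an assumption; the requirements above leave the default class at its defaults.

```python
# Traffic classes from the requirements above (25 GbE or higher RDMA network).
# The layout and names are illustrative, not a switch or host API.
TRAFFIC_CLASSES = {
    "cluster": {"priority": 7, "pfc": False, "bandwidth_percent": 1},
    "rdma": {"priority": 3, "pfc": True, "bandwidth_percent": 50},
}


def build_plan(classes):
    """Return {priority: (pfc_enabled, bandwidth_percent)}.

    Assumption: the default class (priority 0) gets whatever is left over.
    """
    reserved = sum(c["bandwidth_percent"] for c in classes.values())
    if reserved > 100:
        raise ValueError(f"reserved bandwidth is {reserved}%, which exceeds 100%")
    plan = {
        c["priority"]: (c["pfc"], c["bandwidth_percent"])
        for c in classes.values()
    }
    plan[0] = (False, 100 - reserved)  # default class: no PFC, remainder
    return plan


plan = build_plan(TRAFFIC_CLASSES)
for priority in sorted(plan):
    pfc, percent = plan[priority]
    print(f"priority {priority}: PFC {'on' if pfc else 'off'}, {percent}% bandwidth")
```

For a 10 GbE or lower network the cluster reservation would be 2 percent instead, which leaves 48 percent for the default class under the same assumption.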
As a newbie to NVIDIA switches and Cumulus, I appreciate your tips :)
This single RoCE command appears to set everything except the cluster traffic class (which is priority 6 in CL by default). From this point you could either change the host to use priority 6 for cluster traffic and be good to go, or change the switch to match the host's default of priority 7. For the second option you'll have to dig a bit deeper into the QoS configuration in CL.
See the docs here → Quality of Service | Cumulus Linux 5.6
I’ll check with the TME team to see if they have any configurations to recommend here specifically.
@epulvino @Lumos
Hi
Sorry to ask again, but is there any update on this? I agree with Lumos: everything is set except for PCP 7 or
DSCP 56 through 63 (I'm not sure which marking the switch matches on when the packet arrives).
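As a side note on why DSCP 56 through 63 line up with PCP 7: under the common class-selector convention, the top three bits of the 6-bit DSCP value give the priority. The small Python sketch below only illustrates that arithmetic. The function name is made up for this example, and whether the switch trusts PCP or DSCP on ingress depends on its configured mapping, which this thread does not settle.

```python
def dscp_to_priority(dscp):
    """Map a 6-bit DSCP value to a 3-bit priority using its top three bits.

    Assumption: class-selector style mapping; a switch can be configured
    with a different DSCP-to-priority table.
    """
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp >> 3


for dscp in range(56, 64):
    print(f"DSCP {dscp} -> priority {dscp_to_priority(dscp)}")
```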
How do I use NVUE to change the default group “0” to 49%, PCP 3 to 50%, and PCP 7 to 1%? I recall the changes have to be made in the right order so that the total never exceeds 100%.
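On the ordering concern, one generic way to stay at or below 100% is to apply the reductions before the increases. The Python sketch below shows only that idea; it is not NVUE syntax. The starting allocation of 50/50/0 is hypothetical, and whether the switch checks the total per command or only when the configuration is applied is not confirmed in this thread.

```python
def apply_in_order(current, target):
    """Apply per-class bandwidth changes, reductions first.

    Returns a list of (traffic_class, new_percent, running_total) steps and
    raises ValueError if the running total would ever exceed 100%.
    """
    state = dict(current)
    changes = [
        (tc, percent) for tc, percent in target.items()
        if percent != state.get(tc, 0)
    ]
    # Sort by delta so that the largest reductions are applied first.
    changes.sort(key=lambda change: change[1] - state.get(change[0], 0))

    steps = []
    for tc, percent in changes:
        state[tc] = percent
        total = sum(state.values())
        if total > 100:
            raise ValueError(f"total would reach {total}% after class {tc}")
        steps.append((tc, percent, total))
    return steps


# Hypothetical starting point: classes 0 and 3 at 50% each, class 7 at 0%.
current = {0: 50, 3: 50, 7: 0}
target = {0: 49, 3: 50, 7: 1}
for tc, percent, total in apply_in_order(current, target):
    print(f"set class {tc} to {percent}% (running total {total}%)")
```

With these example numbers the default class drops to 49% first, and only then does class 7 rise to 1%, so the total moves from 100% to 99% and back to 100%.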