How to generate PFC pause packets with Mlnx CX-5 and CX-6 Dx cards?

farhatullahfarhat · May 16, 2023, 1:22pm

Hi,
I am using Mlnx CX-5 and CX-6 Dx cards in my lab. I have enabled PFC with 8 TCs and configured QoS in DSCP mode. I am using DPDK pktgen to generate traffic at 100 Gbps (CX-6 Dx) per PHY port. The rx_discard counter shows that the receiving card is dropping packets.
I assumed that PFC would make UDP traffic lossless: whenever the Rx buffer fills, the receiver should generate PFC pause frames to stop or pause the sender. But I can’t see this effect.
Can someone guide me on how to set up Mlnx cards as a lossless device for UDP traffic?

Thanks

TomNVIDIA · May 24, 2023, 12:56am

Hello,

Welcome to the NVIDIA Developer forums. I am going to move your topic to the Networking category so the support team has visibility.

farhatullahfarhat · May 24, 2023, 7:28am

Hi, thanks for moving this to the appropriate place. I would be very grateful for a reply.

Regards

TomNVIDIA · May 24, 2023, 2:13pm

Someone from the Networking team will jump in here to help ASAP.

yanivserlin · May 31, 2023, 2:47pm

Can you please provide the output of the following command for the interfaces used in the test:
mlnx_qos -i ethernet_interface
Do you have a switch between the nodes, or are they connected back to back?
With what priority are you marking the outgoing traffic (DSCP value in the IP header)? You can use tcpdump to check if you are not sure.
Regards,
Yaniv

farhatullahfarhat · June 1, 2023, 10:26am

Thanks Yaniv,

Here are the details:

mlnx_qos -i

No switch in between. Two ConnectX-6 Dx cards are connected back to back.

Different DSCP values, but one at a time. For example: 0, 8, 16, 24, 32.
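
For readers following along, here is a minimal sketch of how those DSCP values would land on priorities. It assumes the dscp2prio table is still at its default, where each block of eight DSCP values maps to one priority (priority = top three bits of the DSCP). The real table is whatever `mlnx_qos -i` reports on the interface, and the `pfc_enabled` vector below is a hypothetical example, not the poster's configuration.

```python
def dscp_to_priority(dscp: int) -> int:
    """Map a 6-bit DSCP value to a priority (0-7) using its top three bits.

    Assumption: default dscp2prio mapping (0-7 -> 0, 8-15 -> 1, ...).
    """
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp >> 3


def is_lossless(dscp: int, pfc_enabled: list) -> bool:
    """Return True if the priority this DSCP maps to has PFC enabled."""
    return bool(pfc_enabled[dscp_to_priority(dscp)])


if __name__ == "__main__":
    # Hypothetical PFC vector: only priority 3 is lossless.
    pfc_enabled = [0, 0, 0, 1, 0, 0, 0, 0]
    for dscp in (0, 8, 16, 24, 32):
        print(dscp, dscp_to_priority(dscp), is_lossless(dscp, pfc_enabled))
```

With a vector like this one, only traffic marked with DSCP 24 to 31 would be covered by PFC; every other stream would be dropped on congestion rather than paused.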

Additional info:
OS: CentOS 8.5
Latest OFED version

Some more information:
After a reboot, once the QoS configuration is done, we see PFC pause packets. But if we make any change to the QoS configuration, such as the UP-TC mapping, the DSCP-to-UP mapping, or buffer sharing, we no longer receive any PFC pause packets.

Even so, when we analyze the traffic at the time of the PFC pause packets, we are unable to stop the sender completely; the sending server keeps sending data continuously. When it receives the PFC pause packets, it reduces its bandwidth from 100 Gbps to 24 Gbps but never reaches 0.
Because I have modified the Rx driver to empty the Rx queue, I assume that no data should be lost. But I can see that the Rx buffer or queue drops data, which is not lossless behaviour.

Hope this will provide you enough information to analyze what happens.

yanivserlin · June 1, 2023, 3:05pm

Hi,
In general we do not recommend touching the receive buffer on ConnectX-6 Dx devices. I can see how it might break the PFC.
I suggest you start simple with the standard mlnx_qos config (in trust DSCP mode) and enable PFC on one or more (NOT all) priorities. Do not change any buffer sizes and do not remap buffers and priorities. The (shared) buffer architecture in ConnectX-6 Dx should be able to handle that.
Your UP-TC mapping should be ok.
Regards,
Yaniv

farhatullahfarhat · June 2, 2023, 1:52pm

Thanks Yaniv,

I can see the same behaviour with default QoS settings. Here are the QoS configurations

Here are the ethtool stats on both servers

See the ethtool statistics in the image attached below.
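
Since the statistics were posted as an image, a text-based comparison may be easier to share. The following is a rough sketch that filters the pause and discard counters out of saved `ethtool -S` output. The helper name `pfc_counters` is hypothetical, and the sample text is made up purely for illustration; it is not the data from this thread.

```python
import re


def pfc_counters(ethtool_output: str) -> dict:
    """Extract counters whose names mention 'pause' or 'discard'.

    Expects text in the usual `ethtool -S <iface>` layout, one
    `name: value` pair per line.
    """
    counters = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.partition(":")
        name, value = name.strip(), value.strip()
        # Skip the "NIC statistics:" header and any non-numeric lines.
        if sep and value.isdigit() and re.search(r"pause|discard", name):
            counters[name] = int(value)
    return counters


if __name__ == "__main__":
    # Made-up sample, for illustration only.
    sample = """NIC statistics:
     rx_packets: 1000
     tx_prio3_pause: 42
     rx_prio3_pause: 0
     rx_discards_phy: 7
"""
    for name, value in pfc_counters(sample).items():
        print(f"{name}: {value}")
```

Running it on output captured before and after a test run on both servers, then comparing the values, shows whether pause frames are sent on one side and received on the other.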

Looking at the PFC packets, the card generates PFC packets with timeout 65535 and, after some time, generates the same packet with timeout 0. This sequence repeats.
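
As background on those timer values: in a PFC frame (MAC control EtherType 0x8808, opcode 0x0101) each priority carries a 16-bit timer counted in pause quanta of 512 bit times. The maximum value, 65535 (0xFFFF), asks the sender to pause, and a value of 0 releases the pause. The sketch below builds and parses such a frame and converts quanta to time; the function names and the source MAC are hypothetical, chosen only for illustration.

```python
import struct

PFC_DEST_MAC = bytes.fromhex("0180c2000001")  # MAC control multicast address
MAC_CONTROL_ETHERTYPE = 0x8808
PFC_OPCODE = 0x0101
BITS_PER_QUANTUM = 512


def build_pfc_frame(src_mac: bytes, quanta_by_priority: dict) -> bytes:
    """Build a PFC frame (without FCS) pausing the given priorities."""
    enable_vector = 0
    timers = [0] * 8
    for prio, quanta in quanta_by_priority.items():
        enable_vector |= 1 << prio
        timers[prio] = quanta
    frame = PFC_DEST_MAC + src_mac
    frame += struct.pack("!HHH", MAC_CONTROL_ETHERTYPE, PFC_OPCODE, enable_vector)
    frame += struct.pack("!8H", *timers)
    return frame.ljust(60, b"\x00")  # pad to the minimum frame size


def parse_pfc_frame(frame: bytes) -> dict:
    """Return {priority: quanta} for every priority enabled in the frame."""
    ethertype, opcode, enable_vector = struct.unpack_from("!HHH", frame, 12)
    if ethertype != MAC_CONTROL_ETHERTYPE or opcode != PFC_OPCODE:
        raise ValueError("not a PFC frame")
    timers = struct.unpack_from("!8H", frame, 18)
    return {p: timers[p] for p in range(8) if enable_vector & (1 << p)}


def pause_seconds(quanta: int, link_bps: float) -> float:
    """Convert pause quanta to seconds at the given link speed."""
    return quanta * BITS_PER_QUANTUM / link_bps


if __name__ == "__main__":
    src = bytes.fromhex("020000000001")  # hypothetical source MAC
    xoff = build_pfc_frame(src, {3: 0xFFFF})
    print(parse_pfc_frame(xoff))
    print(pause_seconds(0xFFFF, 100e9) * 1e6, "microseconds")
```

At 100 Gbps the maximum timer covers only about 335.5 microseconds, so a receiver that needs a longer pause has to keep refreshing it. A frame with timer 0 following one with the maximum timer is therefore the expected pause-then-resume pair rather than an error in itself.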

My question is: why does the Rx buffer drop data in PFC mode? Is there a fixed timeout after which the Rx buffer is emptied in PFC mode, or does the data remain in the Rx buffer until it is read by the Rx queue?

farhatullahfarhat · June 2, 2023, 1:59pm

yanivserlin · June 4, 2023, 3:53pm

In the data you provided I do not see any drops on the RX buffer.
In general, the way it works is that when we reach some threshold we issue an xoff pause, and at a different threshold we issue an xon pause.
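
The two-threshold behaviour described above can be sketched as a small hysteresis model. This is only an illustration under stated assumptions: the class name and the threshold values are hypothetical, and the actual thresholds used by the ConnectX firmware are not given in this thread.

```python
class PfcBufferModel:
    """Toy model of a receive buffer with separate xoff and xon thresholds."""

    def __init__(self, xoff_threshold: int, xon_threshold: int):
        if xon_threshold >= xoff_threshold:
            raise ValueError("xon threshold must be below xoff threshold")
        self.xoff_threshold = xoff_threshold
        self.xon_threshold = xon_threshold
        self.occupancy = 0
        self.paused = False

    def update(self, delta_bytes: int):
        """Apply a change in buffer fill; return 'XOFF', 'XON', or None."""
        self.occupancy = max(0, self.occupancy + delta_bytes)
        if not self.paused and self.occupancy >= self.xoff_threshold:
            self.paused = True
            return "XOFF"  # ask the sender to pause
        if self.paused and self.occupancy <= self.xon_threshold:
            self.paused = False
            return "XON"  # let the sender resume
        return None


if __name__ == "__main__":
    # Hypothetical thresholds, in bytes.
    buf = PfcBufferModel(xoff_threshold=100_000, xon_threshold=40_000)
    for delta in (60_000, 50_000, -30_000, -50_000):
        print(delta, buf.occupancy + delta, buf.update(delta))
```

The gap between the two thresholds keeps the receiver from toggling pause on and off for every packet. In a real device the space above the xoff threshold must also absorb whatever is still in flight when the pause frame is sent.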
I suggest that you approach technical support to review further (as more debug data is required).
Regards,
Yaniv

farhatullahfarhat · June 7, 2023, 2:03pm

Thanks Yaniv,

Can you tag the technical team here, assign them this ticket, or guide me on how to contact them, please?

Regards,
Farhat

yanivserlin · June 8, 2023, 7:10am

Please send an email to enterprisesupport@nvidia.com with the details and they will guide you through the process.
Thanks,
Yaniv
