ConnectX-6 Dx: rx_discards_phy limits XDP redirect to ~100Mpps at 64-byte line rate (148Mpps)

Environment

  • NIC: ConnectX-6 Dx Dual Port 100GbE (0F6FXM_08P2T2_Ax)
  • Firmware: 22.46.3048
  • OFED: 23.10-3.2.2
  • CPU: AMD EPYC 9754 128-Core (single socket, 1 NUMA node)
  • Kernel: 6.1.0-43-amd64 (Debian)
  • PCIe: Gen4 x16, MaxPayload 512, MaxReadReq 4096

Problem

We are running an XDP program that redirects all packets between two interfaces (separate NICs in separate PCIe slots). The traffic generator sends 64-byte packets at 100G line rate (~148 Mpps).
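For context, the ~148 Mpps figure follows directly from Ethernet framing overhead: each 64-byte frame occupies 64 + 8 (preamble/SFD) + 12 (inter-frame gap) = 84 bytes on the wire. A quick sanity check:

```python
# Minimum-size Ethernet frame on the wire:
# 64B frame + 8B preamble/SFD + 12B inter-frame gap = 84B
WIRE_BYTES = 64 + 8 + 12
LINK_BPS = 100e9  # 100GbE

pps = LINK_BPS / (WIRE_BYTES * 8)
print(f"{pps / 1e6:.1f} Mpps")  # → 148.8 Mpps theoretical line rate at 64B
```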

The NIC receives all packets on the wire (rx_packets_phy = 148M/s) but delivers only ~80-100 Mpps to the host. The rest are dropped in NIC hardware: rx_discards_phy increments at 45-68M/s.

Additional observations:

  • rx_out_of_buffer = 0
  • CPU utilization is only 7%
  • Zero TX errors on the egress NIC (tx_xdp_full = 0, tx_xdp_err = 0)

We benchmarked different combined channel counts while keeping everything else constant:

Queue Count Scaling (key finding):

  Queues   TX (Mpps)   Discards (Mpps)   Forwarded
      16        42.5              48.9       28.6%
      32        81.5              32.3       54.8%
      48       100.4              42.6       67.5%
      64        93.9              54.2       63.1%
      96        85.3              63.6       57.3%
     127        81.0              67.9       54.4%

Performance peaks at 48 queues and degrades with more.
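The "forwarded" column is TX divided by the ~148.8 Mpps offered load. Reproducing it (using the measurements from the table above) also makes the per-queue rate visible, which drops steadily as queues are added:

```python
# TX measurements (Mpps) per combined-channel count, taken from the table above.
OFFERED_MPPS = 148.8  # 64B line rate at 100G
results = {16: 42.5, 32: 81.5, 48: 100.4, 64: 93.9, 96: 85.3, 127: 81.0}

for queues, tx_mpps in results.items():
    print(f"{queues:>3} queues: {tx_mpps / OFFERED_MPPS:6.1%} of line rate, "
          f"{tx_mpps / queues:5.2f} Mpps/queue")
```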

What we’ve tried (no significant improvement)

  • Interrupt coalescing: Tested adaptive on/off, rx-usecs 3-128, rx-frames 32-512 — no change
  • NAPI tuning: napi_defer_hard_irqs up to 50, gro_flush_timeout up to 200µs — no change at 127 queues
  • CQE compression: CQE_COMPRESSION=AGGRESSIVE (firmware) + rx_cqe_compress on (driver) — marginal improvement
  • PCIe relaxed ordering: PCI_WR_ORDERING=force_relax — no change
  • Virtual lanes: NUM_OF_VL_P1=1 (reduced from 4) — no change
  • MaxReadReq: Increased to 4096 — no change
  • Driver private flags: tx_cqe_compress on, xdp_tx_mpwqe on, tx-push on — ~2% improvement

Questions

  1. Is ~100 Mpps the expected maximum XDP redirect throughput for ConnectX-6 Dx with 64-byte packets? What is the NIC’s rated small-packet forwarding capacity?
  2. What does rx_discards_phy incrementing with rx_out_of_buffer=0 indicate? Is this an internal port buffer overflow or a scheduling/arbitration limit?
  3. Are there firmware parameters or NIC configuration options we haven’t explored that could increase the packet delivery rate?
  4. Would upgrading to OFED 24.10 or newer firmware improve small-packet XDP performance?
  5. Is ConnectX-7 expected to have a higher internal pps ceiling?

Any guidance on maximizing small-packet XDP redirect throughput would be greatly appreciated.

At 64-byte packets and 100G, you are pushing ~148 Mpps at the RX port. The NIC sees all of them on the wire (rx_packets_phy), but only ~80–100 Mpps can pass through the host-based XDP redirect path; the rest are dropped in hardware and counted in rx_discards_phy.

Per DOCA Telemetry, rx_discards_phy counts packets dropped on the physical port due to lack of buffers (adapter congestion), and it is independent of rx_out_of_buffer (host RX WQE exhaustion). With rx_out_of_buffer = 0, your RX rings are sized adequately; the drop happens earlier, because the ingress and host-facing pipeline are saturated in packets-per-second terms.

1/ Expected PPS / rating:

There is no published small-packet XDP redirect PPS guarantee for ConnectX-6 Dx. The ~80–100 Mpps you see for 64-byte XDP redirect is in line with what we expect from the host-based path at 100G; achieving the full ~148 Mpps at 64B typically requires a hardware offload path (e.g. eSwitch/ASAP²), not pure XDP redirect.

2/ Meaning of rx_discards_phy with rx_out_of_buffer=0:

This indicates congestion at the adapter’s physical port (lack of buffers) rather than a shortage of RX WQEs on the host queues. In other words, the adapter is oversubscribed by the 148 Mpps stream relative to what it can move to the host/XDP pipeline.
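A back-of-envelope calculation (my numbers, not measurements: PCIe Gen4 x16 is roughly 31.5 GB/s raw per direction, less after TLP/descriptor overhead) shows why the limit is transactions per second rather than bandwidth: the payload bandwidth at line rate is modest, but the per-packet time budget is only a few nanoseconds.

```python
# Sketch: at 64B line rate, bandwidth is not the constraint; per-packet
# transaction rate is. PCIe Gen4 x16 raw bandwidth (~31.5 GB/s/direction)
# is an assumption from the spec, not a measured value.
PPS = 148.8e6
PKT_BYTES = 64

data_gbs = PPS * PKT_BYTES / 1e9   # payload DMA bandwidth required
ns_per_pkt = 1e9 / PPS             # time budget per packet

print(f"{data_gbs:.1f} GB/s payload vs ~31.5 GB/s raw PCIe Gen4 x16")  # → 9.5 GB/s
print(f"{ns_per_pkt:.1f} ns budget per packet")                        # → 6.7 ns
```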

3/ Firmware / NIC tuning:

The main NIC-side XDP optimizations are features like XDP inline transmission of small packets and multi-packet TX WQEs, which you already have enabled via xdp_tx_mpwqe, CQE compression, etc. There are no documented firmware “hidden knobs” that raise an internal PPS limit beyond that; further gains come from CPU affinity, NAPI budgeting, and queue layout and are usually incremental.

4/ OFED 24.10 / newer firmware:

Running the latest MLNX_OFED/MLNX_EN and firmware is recommended and can provide incremental XDP improvements, but based on current public documentation we do not expect it alone to bridge the entire gap from ~100 Mpps to full 148 Mpps 64-byte redirect.

5/ ConnectX-7:

CX7 generally offers higher performance and more headroom, but there is no official XDP redirect PPS specification for it either. Host-based XDP redirect remains constrained by the host I/O and processing pipeline, so CX7 may help, but it is not a guaranteed way to reach line-rate 148 Mpps for this workload.