Environment
- NIC: ConnectX-6 Dx Dual Port 100GbE (0F6FXM_08P2T2_Ax)
- Firmware: 22.46.3048
- OFED: 23.10-3.2.2
- CPU: AMD EPYC 9754 128-Core (single socket, 1 NUMA node)
- Kernel: 6.1.0-43-amd64 (Debian)
- PCIe: Gen4 x16, MaxPayload 512, MaxReadReq 4096
Problem
We are running an XDP program that redirects all packets from one interface to the other (separate NICs in separate PCIe slots). A traffic generator sends 64-byte packets at 100GbE line rate (~148.8 Mpps).
The ingress NIC receives every packet on the wire (rx_packets_phy ≈ 148 Mpps) but delivers only ~80-100 Mpps to the host. The rest are dropped in NIC hardware: rx_discards_phy increments at 45-68 Mpps.
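For reference, the redirect program is essentially the minimal devmap pattern below. This is a stripped-down sketch rather than our exact source; the map name, layout, and loader details are illustrative (the loader fills slot 0 with the egress ifindex):

```c
// SPDX-License-Identifier: GPL-2.0
/* Minimal XDP redirect sketch: every received frame is forwarded to the
 * egress NIC through a one-entry devmap. Illustrative only. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
    __uint(type, BPF_MAP_TYPE_DEVMAP);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, __u32);
} tx_port SEC(".maps");

SEC("xdp")
int xdp_fwd(struct xdp_md *ctx)
{
    /* No parsing or header rewrite: redirect every frame unconditionally. */
    return bpf_redirect_map(&tx_port, 0, 0);
}

char _license[] SEC("license") = "GPL";
```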
Some more information:
- rx_out_of_buffer = 0
- CPU utilization is only 7%
- Zero TX errors on the egress NIC (tx_xdp_full = 0, tx_xdp_err = 0)
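The rates quoted above are one-second deltas of the mlx5 ethtool -S counters. In case it helps with reproduction, below is a minimal C equivalent of that sampling loop (interface name defaults to eth0, error handling trimmed):

```c
/* Print one-second deltas of selected mlx5 ethtool -S counters. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <linux/ethtool.h>
#include <linux/sockios.h>

static int fd;
static struct ifreq ifr;

static void ethtool_ioctl(void *data)
{
    ifr.ifr_data = data;
    if (ioctl(fd, SIOCETHTOOL, &ifr) < 0) {
        perror("SIOCETHTOOL");
        exit(1);
    }
}

int main(int argc, char **argv)
{
    const char *dev = argc > 1 ? argv[1] : "eth0";
    fd = socket(AF_INET, SOCK_DGRAM, 0);
    strncpy(ifr.ifr_name, dev, IFNAMSIZ - 1);

    /* Number of stats exported by the driver. */
    struct ethtool_drvinfo drv = { .cmd = ETHTOOL_GDRVINFO };
    ethtool_ioctl(&drv);
    __u32 n = drv.n_stats;

    /* Counter names. */
    struct ethtool_gstrings *names =
        calloc(1, sizeof(*names) + (size_t)n * ETH_GSTRING_LEN);
    names->cmd = ETHTOOL_GSTRINGS;
    names->string_set = ETH_SS_STATS;
    names->len = n;
    ethtool_ioctl(names);

    /* Two snapshots, one second apart. */
    struct ethtool_stats *a = calloc(1, sizeof(*a) + (size_t)n * sizeof(__u64));
    struct ethtool_stats *b = calloc(1, sizeof(*b) + (size_t)n * sizeof(__u64));
    a->cmd = b->cmd = ETHTOOL_GSTATS;
    a->n_stats = b->n_stats = n;
    ethtool_ioctl(a);
    sleep(1);
    ethtool_ioctl(b);

    const char *watch[] = { "rx_packets_phy", "rx_discards_phy", "rx_out_of_buffer" };
    for (__u32 i = 0; i < n; i++) {
        const char *name = (const char *)names->data + (size_t)i * ETH_GSTRING_LEN;
        for (size_t w = 0; w < 3; w++)
            if (!strcmp(name, watch[w]))
                printf("%-20s %llu pps\n", name,
                       (unsigned long long)(b->data[i] - a->data[i]));
    }
    return 0;
}
```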
We benchmarked different combined channel counts while keeping everything else constant:
Queue Count Scaling (key finding):
16 queues → 42.5 Mpps TX, 48.9 Mpps discards, 28.6% forwarded
32 queues → 81.5 Mpps TX, 32.3 Mpps discards, 54.8% forwarded
48 queues → 100.4 Mpps TX, 42.6 Mpps discards, 67.5% forwarded
64 queues → 93.9 Mpps TX, 54.2 Mpps discards, 63.1% forwarded
96 queues → 85.3 Mpps TX, 63.6 Mpps discards, 57.3% forwarded
127 queues → 81.0 Mpps TX, 67.9 Mpps discards, 54.4% forwarded
Performance peaks at 48 queues and degrades with more.
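For context on the offered load behind these numbers, the line-rate and per-queue arithmetic (assuming RSS spreads the synthetic traffic roughly evenly across channels):

```c
#include <stdio.h>

int main(void)
{
    /* 64B frame + 8B preamble/SFD + 12B inter-frame gap = 84B (672 bits) on
     * the wire, so 100 Gbit/s line rate is ~148.8 Mpps. */
    const double line_pps = 100e9 / ((64 + 8 + 12) * 8.0);
    printf("line rate: %.1f Mpps\n", line_pps / 1e6);

    /* Per-queue offered load at each channel count we tested. */
    const int queues[] = { 16, 32, 48, 64, 96, 127 };
    for (unsigned i = 0; i < sizeof(queues) / sizeof(queues[0]); i++)
        printf("%3d queues -> %.2f Mpps/queue offered\n",
               queues[i], line_pps / queues[i] / 1e6);
    return 0;
}
```

At the 48-queue peak each queue only needs to absorb ~3.1 Mpps, which is consistent with the low CPU utilization noted above.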
What we’ve tried (no significant improvement)
- Interrupt coalescing: Tested adaptive on/off, rx-usecs 3-128, rx-frames 32-512 — no change
- NAPI tuning: napi_defer_hard_irqs up to 50, gro_flush_timeout up to 200µs — no change at 127 queues (see the sketch after this list)
- CQE compression: CQE_COMPRESSION=AGGRESSIVE (firmware) + rx_cqe_compress on (driver) — marginal improvement
- PCIe relaxed ordering: PCI_WR_ORDERING=force_relax — no change
- Virtual lanes: NUM_OF_VL_P1=1 (reduced from 4) — no change
- MaxReadReq: Increased to 4096 — no change
- Driver private flags: tx_cqe_compress on, xdp_tx_mpwqe on, tx-push on — ~2% improvement
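For the NAPI tuning item above: on this kernel those knobs are the per-netdev sysfs attributes, and the sketch below shows the values we applied (interface name is illustrative):

```c
/* Apply the NAPI tuning listed above via /sys/class/net/<dev>/ attributes. */
#include <stdio.h>

static int write_netdev_attr(const char *dev, const char *attr, const char *val)
{
    char path[256];
    snprintf(path, sizeof(path), "/sys/class/net/%s/%s", dev, attr);
    FILE *f = fopen(path, "w");
    if (!f)
        return -1;
    int ok = fputs(val, f) >= 0;
    return (fclose(f) == 0 && ok) ? 0 : -1;
}

int main(void)
{
    /* Defer re-enabling hard IRQs for up to 50 idle polls; gro_flush_timeout
     * (nanoseconds) keeps NAPI re-polling on a 200 µs timer instead. */
    write_netdev_attr("enp65s0f0np0", "napi_defer_hard_irqs", "50");
    write_netdev_attr("enp65s0f0np0", "gro_flush_timeout", "200000");
    return 0;
}
```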
Questions
- Is ~100 Mpps the expected maximum XDP redirect throughput for ConnectX-6 Dx with 64-byte packets? What is the NIC’s rated small-packet forwarding capacity?
- What does rx_discards_phy incrementing with rx_out_of_buffer=0 indicate? Is this an internal port buffer overflow or a scheduling/arbitration limit?
- Are there firmware parameters or NIC configuration options we haven’t explored that could increase the packet delivery rate?
- Would upgrading to OFED 24.10 or newer firmware improve small-packet XDP performance?
- Is ConnectX-7 expected to have a higher internal pps ceiling?
Any guidance on maximizing small-packet XDP redirect throughput would be greatly appreciated.