I’m testing a Mellanox 100GbE ConnectX-6 with xdp-bench drop. After reaching about 110Mpps (64-byte packets), the rx_discards_phy counter starts increasing. Even with bigger packets (256 bytes), I can’t reach line rate. CPU usage is <5%. Is there a way to track down what is causing the discards?
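For reference, this is roughly how I run the test and watch the counter (a minimal sketch of my commands; interface name as used further below):

xdp-bench drop enp59s0np0
# second terminal: watch the PHY-level rx counters
watch -n1 "ethtool -S enp59s0np0 | grep -E 'rx_packets_phy|rx_discards_phy'"

Setup details: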
- NIC model: CX653105A (fw 20.42.1000)
- LnkSta: Speed 8GT/s (ok), Width x16 (ok)
- OFED driver version: 24.07
- Kernel: 6.10.8
- CPU: dual Intel Xeon Platinum 8273CL (28 cores each); queues are pinned to the CPU local to the NIC
- mlnx_tune is executed at startup
- BIOS performance profile set to Maximum Performance
- PCIe max read request size increased from 512 to 4096 bytes
- CQE_COMPRESSION=1 (this and the MRRS change were applied/verified with the commands shown after this list)
- HyperThreading is disabled
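For completeness, this is how I set and checked those two items (commands from my setup; the PCI address 3b:00.0 is what enp59s0np0 maps to on my box):

mlxconfig -d 3b:00.0 set CQE_COMPRESSION=1          # aggressive CQE compression; takes effect after a reboot/mlxfwreset
mlxconfig -d 3b:00.0 query | grep CQE_COMPRESSION   # confirm it is applied
lspci -s 3b:00.0 -vv | grep MaxReadReq              # should report MaxReadReq 4096 bytes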
I noticed that if I set hfunc to xor instead of toeplitz, I can reach line rate with 256-byte packets, but not with 64-byte packets (probably because some cores go to 100% due to traffic polarization). In fact, it seems that toeplitz hashing reduces the NIC’s performance.
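The hash function switch and the per-queue check were done like this (the per-queue counter names are what mlx5 exposes on my system):

ethtool -X enp59s0np0 hfunc xor                       # switch RSS hash from toeplitz to xor
ethtool -x enp59s0np0                                 # show indirection table and active hash function
ethtool -S enp59s0np0 | grep -E 'rx[0-9]+_packets'    # per-queue spread, to spot polarization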
I also noticed that by restricting the assigned queues/cores to 16, I can reach about 120Mpps:
ethtool -L enp59s0np0 combined 16
set_irq_affinity_cpulist.sh 0-15 enp59s0np0
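To verify that the pinning took effect, I check the completion IRQ affinities (a sketch; the mlx5_comp IRQ naming is what appears in /proc/interrupts on my setup):

for irq in $(grep mlx5_comp /proc/interrupts | awk -F: '{print $1}'); do
    echo -n "irq $irq -> cpus "; cat /proc/irq/$irq/smp_affinity_list
done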