I have a problem with a new Mellanox ConnectX-4 Lx EN 50Gbps card.
It is installed in an old server: dual Xeon X5650, 128 GB DDR3-1333, PCI-E 2.0 x8.
The settings are the RHEL 8.3 defaults plus the following:
ethtool --set-priv-flags eth2 rx_cqe_compress on
ethtool -C eth2 adaptive-rx off
ethtool -G eth2 rx 8192 tx 8192
setpci -s 06:00.0 68.w=5936
ethtool -A eth2 autoneg off rx off tx off
ifconfig eth2 txqueuelen 20000
ethtool -L eth2 combined 12
service irqbalance stop
<irq smp_affinity set to 12 cores on NUMA node #0, the same node as the card>
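For reference, here is how I read the value written by the setpci command above. This is only a sketch, and it assumes that offset 0x68 is the PCIe Device Control register on this card (not verified here; lspci -vv would confirm it). The helper name decode_devctl is made up for illustration.

```shell
# Decode a PCIe Device Control register value given as hex without "0x".
# Assumption: config-space offset 0x68 is the Device Control register on this card.
decode_devctl() {
    devctl_val=$(( 0x$1 ))
    # Bits 7:5 encode Max Payload Size as 128 << n bytes.
    devctl_mps=$(( 128 << ((devctl_val >> 5) & 7) ))
    # Bits 14:12 encode Max Read Request Size as 128 << n bytes.
    devctl_mrrs=$(( 128 << ((devctl_val >> 12) & 7) ))
    echo "MaxPayload=${devctl_mps} MaxReadReq=${devctl_mrrs}"
}

decode_devctl 5936   # prints: MaxPayload=256 MaxReadReq=4096
```

If that assumption holds, the command only raises the maximum read request size to 4096 bytes and leaves the payload size at 256 bytes.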
I test the card with an XDP program that returns XDP_DROP, and I see errors in ethtool -S and packet loss:
rx_xdp_drop: 3801290644
rx_discards_phy: 1296930300
rx_buffer_passed_thres_phy: 7049089607
rx_pci_signal_integrity: 0
tx_pci_signal_integrity: 12
outbound_pci_stalled_rd: 0
outbound_pci_stalled_wr: 0
outbound_pci_stalled_rd_events: 0
outbound_pci_stalled_wr_events: 1076
rx_discards_phy grows along with rx_xdp_drop, amounting to about 27%. outbound_pci_stalled_wr is in the range 50-70. outbound_pci_stalled_wr_events is growing.
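A rough way to put a number on the discard share from two counter values is sketched below. It assumes that every packet arriving at the port is counted either in rx_discards_phy or in rx_xdp_drop, which is only true when the XDP program drops everything it sees. The function name drop_pct is hypothetical. Cumulative counters and counters sampled over an interval can give somewhat different percentages.

```shell
# Percentage of arriving packets discarded at the PHY.
# Assumption: arrivals = rx_discards_phy + rx_xdp_drop (XDP drops every packet it receives).
# Usage: drop_pct <rx_discards_phy> <rx_xdp_drop>
drop_pct() {
    awk -v d="$1" -v x="$2" 'BEGIN {
        if (d + x == 0) { print "0.0"; exit }   # avoid division by zero on idle links
        printf "%.1f\n", 100 * d / (d + x)
    }'
}

# Values taken from the ethtool -S output above.
drop_pct 1296930300 3801290644

# On a live system the inputs could be read like this:
#   ethtool -S eth2 | awk '/rx_discards_phy:/ {print $2}'
```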
The test traffic is 6 Mpps / 3 Gbps, of which ~1.7 Mpps are dropped. What am I doing wrong? Thanks.