I’m currently trying to get a 100G Mellanox ConnectX-6 Dx working with Suricata on Debian Bookworm. When I send around 70 Gbit/s of traffic with Cisco TRex from another machine, 63 ksoftirqd processes spike and stay at 100% as soon as I bring the link up via ip link set eth5 up.
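In case it helps, this is how the per-CPU softirq load and the IRQ/queue distribution can be watched with standard tools (nothing exotic):
mpstat -P ALL 1                      # %soft / %irq per CPU (sysstat)
watch -n1 'cat /proc/softirqs'       # NET_RX counters per CPU
grep -i mlx5 /proc/interrupts        # which completion vectors land on which cores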
I also ran perf top on one of those cores and initially saw a big overhead in __nf_conntrack_alloc, so I disabled iptables and conntrack (which I don’t need for this test). Now most of the overhead (65%) is in native_queued_spin_lock_slowpath, and I’m struggling to understand why the softirq load is so high.
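For the record, conntrack can be kept out of the picture for a test like this either by removing the module or with a NOTRACK rule, and perf pointed at one of the busy cores shows who is actually calling into the spinlock; roughly:
modprobe -r nf_conntrack                          # only if nothing else still needs it (rules flushed first)
iptables -t raw -I PREROUTING -i eth5 -j NOTRACK  # or keep the module loaded but skip tracking on this interface
perf record -C 130 -g -- sleep 10 && perf report  # call-graph sample on one of the pinned cores (130 is just an example) to see the callers of native_queued_spin_lock_slowpath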
Some basics on the system for reference:
CPU (HT disabled)
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 52 bits physical, 57 bits virtual
Byte Order: Little Endian
CPU(s): 256
On-line CPU(s) list: 0-255
Vendor ID: AuthenticAMD
BIOS Vendor ID: AMD
Model name: AMD EPYC 9754 128-Core Processor
BIOS Model name: AMD EPYC 9754 128-Core Processor CPU @ 2.2GHz
BIOS CPU family: 107
CPU family: 25
Model: 160
Thread(s) per core: 1
Core(s) per socket: 128
Socket(s): 2
NUMA:
NUMA node(s): 2
NUMA node0 CPU(s): 0-127
NUMA node1 CPU(s): 128-255
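The NUMA node the NIC itself hangs off (relevant for the IRQ pinning further down) can be double-checked via sysfs:
cat /sys/class/net/eth5/device/numa_node       # should print 1 if the card sits on node1
cat /sys/class/net/eth5/device/local_cpulist   # CPUs local to the card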
NIC
e1:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
e1:00.1 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
ethtool -i eth5
driver: mlx5_core
version: 6.1.0-17-amd64
firmware-version: 22.36.1010 (DEL0000000027)
expansion-rom-version:
bus-info: 0000:e1:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
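To see whether the saturated cores come with actual packet loss, the ring sizes and drop counters can be checked alongside (exact mlx5 counter names vary between driver/firmware versions, so the grep is only a guess):
ethtool -g eth5                                          # current vs. maximum RX ring size
ethtool -S eth5 | grep -iE 'drop|discard|out_of_buffer'  # per-queue / device drop counters
ip -s link show eth5                                     # generic RX dropped/missed counters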
Kernel and boot parameters
BOOT_IMAGE=/boot/vmlinuz-6.1.0-17-amd64 root=/dev/mapper/root-root ro iommu=pt processor.max_cstate=0 numa_balancing=disable intel_pstate=disable intel_idle.max_cstate=0 quiet splash
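Whether those parameters actually took effect (the intel_* ones are Intel-only) can be checked with:
cat /proc/cmdline
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver   # if a cpufreq driver is loaded at all
cat /sys/devices/system/cpu/cpuidle/current_driver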
I tried disabling all the offloads and I also ran the affinity script:
set_irq_affinity_cpulist.sh 128-189 eth5
The cores match the 63 queues the NIC provides on that interface/port. (Side note: why the heck does it have 63 queues and not a multiple of 2 :p?)
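For reference, this is the kind of thing I mean by disabling the offloads, and the channel count can be inspected or changed with ethtool (exact feature names may differ on this driver):
ethtool -k eth5                                   # list current offload settings
ethtool -K eth5 gro off lro off tso off gso off   # turn the usual suspects off
ethtool -l eth5                                   # current vs. maximum combined channels (this is where the 63 shows up)
ethtool -L eth5 combined <n>                      # to change it, if the maximum allows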
I can see the traffic being received and all that, but I’m wondering why the ksoftirqd threads peak at 100% all the time. Is roughly 1 Gbit/s per queue/core just too much for that NIC? But how else should it handle 100G with the 63 queues?
Thanks