High ksoftirqd load on Mellanox ConnectX-4 Lx on Linux

I’m running an XDP service on a Mellanox Technologies MT27710 Family [ConnectX-4 Lx] NIC. As soon as the service is started, with no incoming requests, the CPU usage of the ksoftirqd processes suddenly rises to nearly 100% (as shown by top).

Some settings on the system for reference:

kernel: 5.10.0-136.12.0.86.ctl3.x86_64
irqbalance is running.

NIC:
driver: mlx5_core
version: 23.10-2.1.3
firmware-version: 14.32.1010 (ZTE0000000002)
expansion-rom-version:
bus-info: 0000:61:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes

NIC channels: 2 queues
ethtool -l ens3f0np0
Channel parameters for ens3f0np0:
Pre-set maximums:
RX: n/a
TX: n/a
Other: n/a
Combined: 63
Current hardware settings:
RX: n/a
TX: n/a
Other: n/a
Combined: 2

XDP is attached to the NIC.

I’m wondering why ksoftirqd peaks at 100% when the XDP service is started and there are no incoming requests. Any suggestions are appreciated.

Thanks!

Hi houguanghua,

Thank you for posting your query on the NVIDIA community!

Based on a review of the information shared, you are using an OEM adapter with PSID ZTE0000000002 (a PSID starting with “ZTE”). In such scenarios, we recommend reaching out to the OEM first, as OEM-specific firmware can include customizations that differ from the firmware for non-OEM adapters (PSID starting with “MT_”).
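
As a rough illustration of the check described above, here is a minimal Python sketch. It assumes the value in parentheses on the firmware-version line of the ethtool -i output is the PSID, as in the output posted in this thread. The helper names psid_from_firmware_version and is_oem_psid are hypothetical, and the "MT_" prefix rule is taken only from the explanation above, so treat it as a quick heuristic rather than an official tool.

```python
import re
from typing import Optional


def psid_from_firmware_version(fw_value: str) -> Optional[str]:
    """Return the text inside the first pair of parentheses, or None.

    Assumption: the firmware-version value looks like
    "14.32.1010 (ZTE0000000002)", with the PSID in parentheses.
    """
    match = re.search(r"\(([^)]+)\)", fw_value)
    return match.group(1) if match else None


def is_oem_psid(psid: str) -> bool:
    """Treat any PSID that does not start with "MT_" as an OEM PSID."""
    return not psid.startswith("MT_")


if __name__ == "__main__":
    # Value copied from the firmware-version line posted in this thread.
    fw_value = "14.32.1010 (ZTE0000000002)"
    psid = psid_from_firmware_version(fw_value)
    if psid is None:
        print("No PSID found in firmware-version value")
    elif is_oem_psid(psid):
        print(f"{psid}: OEM adapter, contact the OEM first")
    else:
        print(f"{psid}: non-OEM adapter")
```

The firmware-version value can be pasted in by hand from ethtool -i for the interface in question (ens3f0np0 in this thread).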

Thanks,
Namrata.