The performance of event APIs could be bounded by softirqs

I’m trying to use RDMA event mode to handle many connections. I found that performance degrades when there are many send requests in event mode. The issue can be reproduced with perftest.

What I observed is that as few as 10 clients can easily keep a ksoftirqd thread busy, and only one CPU core is handling interrupts, which could be the performance bottleneck.
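One way to check how interrupts are spread across cores is to sum the per-CPU counts in /proc/interrupts for the NIC's completion vectors. The following is only a sketch: the helper name irq_cpu_totals is made up for illustration, and the pattern mlx5_comp is an assumption about how the mlx5 driver names its completion IRQs, so check the names in /proc/interrupts on your own system first.

```shell
# Hypothetical helper: sum per-CPU interrupt counts for all IRQ lines
# whose text matches PATTERN. FILE defaults to /proc/interrupts.
irq_cpu_totals() {
    local pattern=$1
    local file=${2:-/proc/interrupts}
    awk -v pat="$pattern" '
        # First line is the header: CPU0 CPU1 ...
        NR == 1 { ncpu = NF; for (i = 1; i <= NF; i++) cpu[i] = $i; next }
        # On IRQ lines, field 1 is the IRQ number, fields 2..ncpu+1 are counts.
        $0 ~ pat { for (i = 1; i <= ncpu; i++) sum[i] += $(i + 1) }
        END { for (i = 1; i <= ncpu; i++) print cpu[i], sum[i] + 0 }
    ' "$file"
}
```

For example, `irq_cpu_totals mlx5_comp` (pattern assumed, see above) prints one line per CPU. If nearly all of the count sits on a single CPU, the interrupts are not being distributed.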

To reproduce it, I used perftest (GitHub: linux-rdma/perftest, InfiniBand Verbs Performance Tests):

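# Number of iterations per ib_send_lat instance
N_ITER=100000000
# Address the clients connect to (i.e. the server's IP, despite the name)
CLIENT_IP=10.3.1.1
# Stop any instances left over from a previous run
pkill ib_send_lat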

To launch server:

for i in $(seq 1 10); do
    port=$((12345+$i))
    ./ib_send_lat -e -n $N_ITER -p $port &
done

To launch client:

for i in $(seq 1 10); do
    port=$((12345+$i))
    ./ib_send_lat $CLIENT_IP -e -n $N_ITER -p $port &
done

Then use htop to monitor the server side.
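htop only shows CPU time, but the softirq counters themselves can be read per CPU from /proc/softirqs. Below is a minimal sketch. The helper name softirq_per_cpu is hypothetical, and which row matters here is my assumption: I would look at TASKLET first for completion events, with NET_RX also worth checking.

```shell
# Hypothetical helper: print "CPU<n> <count>" for one softirq type.
# ROW is a row name from /proc/softirqs (e.g. TASKLET, NET_RX).
# FILE defaults to /proc/softirqs.
softirq_per_cpu() {
    local row=$1
    local file=${2:-/proc/softirqs}
    awk -v row="$row:" '
        # First line is the header: CPU0 CPU1 ...
        NR == 1 { for (i = 1; i <= NF; i++) cpu[i] = $i; next }
        # Data rows look like "TASKLET: <count for CPU0> <count for CPU1> ..."
        $1 == row { for (i = 2; i <= NF; i++) print cpu[i - 1], $i }
    ' "$file"
}
```

Running `softirq_per_cpu TASKLET` twice, a second or so apart, while the test is running and comparing the two outputs shows which CPUs are actually servicing the softirqs.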

Linux Distribution:

LSB Version: :core-4.1-amd64:core-4.1-noarch:cxx-4.1-amd64:cxx-4.1-noarch:languages-4.1-amd64:languages-4.1-noarch:printing-4.1-amd64:printing-4.1-noarch
Distributor ID: CentOS
Description: CentOS Linux release 7.8.2003 (Core)
Release: 7.8.2003
Codename: Core

Linux Kernel and Version:

Linux gpu01.cluster 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

InfiniBand hardware and firmware version:

driver: mlx5_core[ib_ipoib]
version: 5.0-2.1.8
firmware-version: 16.21.2010 (MT_0000000010)
expansion-rom-version:
bus-info: 0000:02:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes

NIC: Mellanox Technologies MT27800 Family [ConnectX-5]

Hello and thank you for contacting us.

Based on the information you shared, this looks like it needs deeper debugging than we can provide in the community.
I would advise opening a case via the support portal so our support engineers can look at this issue.
Support email: Networking-support@nvidia.com

Thanks.
