Counter to measure PCIe limitations

Hi,

We are using a ConnectX-5 100 Gb/s
adapter with the inbox Linux kernel
driver, running an eXpress Data Path
(XDP) program on Ubuntu 20. These are
the driver details printed by
ethtool -i.

driver: mlx5_core
version: 5.19.0-38-generic
firmware-version: 16.34.1002 (MT_0000000011)
expansion-rom-version:
bus-info: 0000:ca:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes

We are running packet-processing
throughput benchmarks. The traffic
generator (t-rex) is connected through
a single cable to the CX-5 interface
on the device under test (DUT).

A simple packet-forwarding XDP program
on the DUT forwards 86 million packets
per second (Mpps) for 64-byte packets
with 14 cores. However, the true limit
on packets/sec at 100 Gbit/s is 148
Mpps: each 64-byte frame occupies 84
bytes (672 bits) on the wire once the
preamble and inter-frame gap are
counted, and 100 Gbit/s divided by 672
bits is roughly 148.8 Mpps. Our
traffic generator runs on an identical
machine configuration, and using DPDK
it can indeed hit this 148 Mpps limit.
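
For reference, a minimal XDP forwarder
along these lines (a simplified sketch,
not our exact benchmark program; the
file name and build/attach commands
are illustrative) swaps the Ethernet
MAC addresses and bounces each frame
back out of the same port with XDP_TX:

/* xdp_fwd_min.c: build with
 *   clang -O2 -g -target bpf -c xdp_fwd_min.c -o xdp_fwd_min.o
 * and attach with, e.g.,
 *   ip link set dev <iface> xdp obj xdp_fwd_min.o sec xdp
 */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("xdp")
int xdp_fwd_min(struct xdp_md *ctx)
{
    void *data     = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;
    unsigned char *eth = data;

    /* The verifier requires an explicit bounds check
     * before the 14-byte Ethernet header is touched. */
    if (eth + 14 > (unsigned char *)data_end)
        return XDP_DROP;

    /* Swap the 6-byte destination and source MAC fields,
     * then bounce the frame back out of the same port. */
    for (int i = 0; i < 6; i++) {
        unsigned char tmp = eth[i];
        eth[i] = eth[i + 6];
        eth[i + 6] = tmp;
    }
    return XDP_TX;
}

char _license[] SEC("license") = "GPL";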

We suspect that the throughput
limitation arises from the combination
of PCIe and the driver.

We understand that the rx_discards_phy
counter reported by ethtool -S
indicates packets dropped at the
physical layer due to back-pressure
from PCIe during receive operations on
the NIC. The problem is that this
counter increments both when PCIe is
the bottleneck and when we are CPU
bottlenecked (e.g., when we run our
XDP program on only a few cores).
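
For completeness, a counter such as
rx_discards_phy can also be sampled
programmatically through the same
ethtool interface (SIOCETHTOOL
ioctls); the sketch below assumes the
driver exposes the usual ETH_SS_STATS
string set, and the file name is
illustrative:

/* ethtool_stat.c: print one named ethtool counter, e.g.
 *   ./ethtool_stat <iface> rx_discards_phy
 * Build: cc -o ethtool_stat ethtool_stat.c
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <linux/ethtool.h>
#include <linux/sockios.h>

int main(int argc, char **argv)
{
    if (argc < 3) {
        fprintf(stderr, "usage: %s <iface> <counter>\n", argv[0]);
        return 1;
    }
    const char *ifname = argv[1], *wanted = argv[2];

    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    struct ifreq ifr = {0};
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);

    /* 1. How many statistics does the driver expose? */
    struct ethtool_sset_info *sset =
        calloc(1, sizeof(*sset) + sizeof(__u32));
    sset->cmd = ETHTOOL_GSSET_INFO;
    sset->sset_mask = 1ULL << ETH_SS_STATS;
    ifr.ifr_data = (void *)sset;
    if (ioctl(fd, SIOCETHTOOL, &ifr) < 0) { perror("GSSET_INFO"); return 1; }
    if (!(sset->sset_mask & (1ULL << ETH_SS_STATS))) {
        fprintf(stderr, "driver does not expose ETH_SS_STATS\n");
        return 1;
    }
    __u32 n_stats = sset->data[0];

    /* 2. Fetch the statistic names. */
    struct ethtool_gstrings *names =
        calloc(1, sizeof(*names) + (size_t)n_stats * ETH_GSTRING_LEN);
    names->cmd = ETHTOOL_GSTRINGS;
    names->string_set = ETH_SS_STATS;
    names->len = n_stats;
    ifr.ifr_data = (void *)names;
    if (ioctl(fd, SIOCETHTOOL, &ifr) < 0) { perror("GSTRINGS"); return 1; }

    /* 3. Fetch the statistic values. */
    struct ethtool_stats *vals =
        calloc(1, sizeof(*vals) + (size_t)n_stats * sizeof(__u64));
    vals->cmd = ETHTOOL_GSTATS;
    vals->n_stats = n_stats;
    ifr.ifr_data = (void *)vals;
    if (ioctl(fd, SIOCETHTOOL, &ifr) < 0) { perror("GSTATS"); return 1; }

    /* 4. Print only the counter that was asked for. */
    for (__u32 i = 0; i < n_stats; i++) {
        const char *name = (const char *)&names->data[i * ETH_GSTRING_LEN];
        if (strcmp(name, wanted) == 0)
            printf("%s: %llu\n", name, (unsigned long long)vals->data[i]);
    }
    close(fd);
    return 0;
}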

Is there any counter that directly
indicates a bottleneck at PCIe (as
opposed to the CPU or some third
component)?

Thanks for any help in advance,

Srinivas Narayana

Hello,

I can see that you are running the OS inbox driver (kernel 5.19.0-38-generic) rather than the MLNX_OFED driver (the NVIDIA driver).

I would check that the server has been properly tuned (BIOS/OS) according to our best practices.

We have various community/knowledge-base posts addressing tuning for different deployments.

You can also follow the tuning recommendations for NVIDIA Mellanox NIC performance from http://core.dpdk.org/perf-reports/.

Check the maximum PCIe generation/width the HCA supports (per the FW release notes) against the slot in which the HCA is installed.
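
One quick way to compare the negotiated link against what the device can do is to read the PCIe link attributes the kernel exposes in sysfs for the adapter (bus-info 0000:ca:00.0 in the ethtool output above). A small sketch, assuming the standard current_/max_ link speed and width attributes are present; lspci -vv shows the same information under LnkCap/LnkSta:

/* pcie_link_check.c: print negotiated vs. maximum PCIe link speed/width.
 * Build: cc -o pcie_link_check pcie_link_check.c
 * Usage: ./pcie_link_check 0000:ca:00.0
 */
#include <stdio.h>

static void print_attr(const char *bdf, const char *attr)
{
    char path[256], value[64];
    snprintf(path, sizeof(path), "/sys/bus/pci/devices/%s/%s", bdf, attr);

    FILE *f = fopen(path, "r");
    if (!f) {
        printf("%-20s (unavailable)\n", attr);
        return;
    }
    if (fgets(value, sizeof(value), f))
        printf("%-20s %s", attr, value);   /* sysfs value ends in '\n' */
    fclose(f);
}

int main(int argc, char **argv)
{
    /* Default to the bus-info reported by ethtool -i above. */
    const char *bdf = argc > 1 ? argv[1] : "0000:ca:00.0";

    /* current_* is what was negotiated at link-up; max_* is what the
     * device supports. A mismatch (e.g. x8 instead of x16, or a lower
     * generation) points at the slot rather than the NIC. */
    print_attr(bdf, "current_link_speed");
    print_attr(bdf, "max_link_speed");
    print_attr(bdf, "current_link_width");
    print_attr(bdf, "max_link_width");
    return 0;
}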

For mlx5 port and RoCE counters, refer to the Understanding mlx5 Linux Counters community post (ethtool -S, section on physical port counters) and our MLNX_OFED user manual.

For PCIe back-pressure/bottleneck analysis specifically, we have a tool called NEO-Host that provides low-level diagnostics and performance counters.

Sophie.