I'm experiencing a performance issue on ConnectX5-Ex cards (device ID 0x1019): with production Internet traffic, the NIC starts dropping packets once the packet rate reaches around 6Mpps.

I’m using a very special setup that involves a home-grown driver (if you’re interested, here it is: https://github.com/snabbco/snabb/blob/master/src/apps/mellanox/connectx.lua) and I’m not expecting anyone to help me specifically with that, of course. What I’m looking for is a hint as to what the issue could be, based on a few well-defined observations that I’m trying to describe below. I’m a bit desperate by now and any kind of help would be greatly appreciated :)

The setup is the following. I have a switch that aggregates production traffic from a set of optical taps in our network onto two 100G ports connected to a server with a ConnectX5-Ex card running my driver. The system is a Supermicro server with an AMD EPYC 7302P 16-core CPU and 16xPCIe4. I assume that PCI performance is not an issue here given the relatively low packet rate. The data rate is also well below 100Gbps. The NIC is only receiving traffic, not transmitting anything.

I’m measuring the number of bytes and packets sent by the switch and comparing them to the hardware counters on the server. The traffic has the following packet size distribution (frame counts per size bucket, as reported by the switch):

8396463 : FramesTransmittedLength_eq_64

1234232229292 : FramesTransmittedLength_65_127

683028646837 : FramesTransmittedLength_128_255

279265696125 : FramesTransmittedLength_256_511

390091671373 : FramesTransmittedLength_512_1023

2433908602171 : FramesTransmittedLength_1024_1518

942049981351 : FramesTransmittedLength_1519_2047

96848 : FramesTransmittedLength_2048_4095

9 : FramesTransmittedLength_4096_8191

0 : FramesTransmittedLength_8192_9215

0 : FramesTransmittedLength_9216
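To make the distribution easier to read, here is a minimal Python sketch that turns the counters above into per-bucket shares. The counts are copied from the list; the bucket labels and the function name `bucket_shares` are just illustrative choices.

```python
# Frame counts per size bucket (bytes), copied from the switch counters above.
FRAME_COUNTS = {
    "64": 8396463,
    "65-127": 1234232229292,
    "128-255": 683028646837,
    "256-511": 279265696125,
    "512-1023": 390091671373,
    "1024-1518": 2433908602171,
    "1519-2047": 942049981351,
    "2048-4095": 96848,
    "4096-8191": 9,
    "8192-9215": 0,
    "9216": 0,
}


def bucket_shares(counts):
    """Return the fraction of all frames that falls into each size bucket."""
    total = sum(counts.values())
    return {bucket: n / total for bucket, n in counts.items()}


if __name__ == "__main__":
    for bucket, share in bucket_shares(FRAME_COUNTS).items():
        print(f"{bucket:>10} bytes: {share:7.2%}")
```

With these numbers, the 1024-1518 bucket is the largest at roughly 41% of all frames, and the three buckets between 128 and 1023 bytes together make up a little under a quarter.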

What I observe is that the NIC starts dropping packets when the packet rate is around 6Mpps. I’m using the RFC2863 PPCNT register to measure the bytes and packets, specifically the if_in_octets and if_in_ucast_pkts counters. The interesting thing is that if_in_octets exactly matches the counter of transmitted bytes on the switch, but if_in_ucast_pkts is smaller than the corresponding packet counter on the switch. if_in_discards is always zero.

Furthermore, there are no out-of-buffer drops on any of the RX queues (as per the Q_COUNTER facility). In fact, the effect does not depend on the number of RX queues at all. My conclusion is that the packets are all received correctly but some are getting dropped before they are placed on any receive queue (tested with direct and indirect TIRs). Unfortunately, I can’t find any information that would tell me why the packets are being dropped. The only counter I found that matches the number of dropped packets is the ether_stats_drop_events counter from the RFC2819 PPCNT register but that is very unspecific. Also, I’m not receiving any events on the EQ that would signal any kind of problem.
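The comparison I’m doing can be sketched roughly as follows in Python. This is not the actual driver code: the dictionary keys `tx_pkts` and `tx_octets` for the switch side are hypothetical names, and the example values are made up purely to illustrate the pattern (bytes match, packets don't, and the difference equals ether_stats_drop_events). Both snapshots are assumed to cover the same time interval.

```python
def compare_counters(switch, nic):
    """Compare switch TX counters with NIC RX counters for the same interval."""
    missing_pkts = switch["tx_pkts"] - nic["if_in_ucast_pkts"]
    return {
        # True if every byte sent by the switch was counted by the NIC.
        "bytes_match": switch["tx_octets"] == nic["if_in_octets"],
        # Packets sent by the switch that the NIC did not count as received.
        "missing_pkts": missing_pkts,
        # True if the RFC2819 drop counter accounts for exactly those packets.
        "explained_by_drop_events": missing_pkts == nic["ether_stats_drop_events"],
        "loss_ratio": missing_pkts / switch["tx_pkts"] if switch["tx_pkts"] else 0.0,
    }


if __name__ == "__main__":
    # Made-up example values, not measurements.
    switch = {"tx_pkts": 1_000_000, "tx_octets": 600_000_000}
    nic = {
        "if_in_octets": 600_000_000,
        "if_in_ucast_pkts": 990_000,
        "if_in_discards": 0,
        "ether_stats_drop_events": 10_000,
    }
    print(compare_counters(switch, nic))
```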

Another observation is that the pps limit is also independent of whether I use both ports of the NIC or just one.

Some experiments in the lab suggest that the drops are related to packet size. There are no drops below ~100-200 byte packets (measured up to ~30Mpps) and also no drops for packets larger than ~800 bytes. For packets between 200 and 800 bytes I see a varying rate of drops.
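For context on how far these rates are from line rate, here is a small Python sketch that converts a packet rate and frame size into bandwidth on the wire. It assumes the frame size includes the FCS and adds the standard 20 bytes of per-frame Ethernet overhead (preamble, start-of-frame delimiter, inter-frame gap); the function name is just illustrative.

```python
# Per-frame overhead on the wire: 7 bytes preamble + 1 byte SFD + 12 bytes inter-frame gap.
ETH_OVERHEAD_BYTES = 20


def line_rate_gbps(pps, frame_bytes):
    """Bandwidth on the wire in Gbit/s for a given packet rate and frame size.

    Assumes frame_bytes already includes the 4-byte FCS.
    """
    return pps * (frame_bytes + ETH_OVERHEAD_BYTES) * 8 / 1e9


if __name__ == "__main__":
    for pps, size in [(30e6, 100), (6e6, 200), (6e6, 800)]:
        print(f"{pps / 1e6:5.1f} Mpps @ {size:4d} bytes: {line_rate_gbps(pps, size):6.2f} Gbit/s")
```

Under these assumptions, 30Mpps of 100-byte frames comes to about 28.8 Gbit/s and 6Mpps of 800-byte frames to about 39.4 Gbit/s, so neither case is close to saturating a 100G port.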

It looks like some kind of resource is exhausted on the NIC and the most likely reason is that my driver is missing something during initialization (I think I’m doing ALLOC_PAGES correctly and I also don’t get any PAGE_REQUEST events). So, my question to the experts is whether these observations on counters and packet drops ring any bells that could help me identify what’s going wrong, or at least point me in the right direction.

Alex

Hello Alexander,

Thank you for posting your inquiry on the NVIDIA Networking Community.

As you mentioned, given your setup, there is little support we can provide from our side. We would still like to give you some pointers.

Even though the packet rate is not high, we still strongly recommend applying all tuning recommendations provided in the DPDK performance reports. Even at a low packet rate, PCI performance can fluctuate, especially on an AMD platform.

From the latest DPDK performance reports for Mellanox adapters, we recommend applying the AMD and ConnectX-5 Ex tuning recommendations → http://core.dpdk.org/perf-reports/

The majority of the tuning recommendations for AMD are given in the section for the ConnectX-6 Dx. For the ConnectX-5 Ex, just follow that section.

Once everything is applied, run the same benchmarks as described in the report to establish a baseline.

Then run your own solution and compare the results.

Good luck.

Thank you and regards,

~NVIDIA Networking Technical Support

Hello Martijn

Thank you for taking the time and for the references. I will certainly check those options and also try to perform some PCI performance analysis on the system.

There is one question you might be able to answer irrespective of my actual setup, and it would help me a lot to narrow down the bottleneck. Do you know exactly which events contribute to the ether_stats_drop_events counter in the firmware? I imagine that there is only a small number of conditions under which a packet is dropped after reception but before the RX queue. I’d also like to point out again that those dropped packets are accounted for in the number of bytes received but not in the number of packets (in the case of the RFC2863 counters), which strikes me as odd. That might also help to identify the place in the ingress processing on the NIC where the drop happens.

Also, if you know of any other counters or feedback from the firmware that I could look at that would be very helpful.

Thanks,

Alex