ConnectX: RX buffer miss counter increasing even with maximum buffer size

Hi there,

I have a network of Ryzen servers fitted with ConnectX-4 Lx NICs (MT27710 family) that runs a fairly intense workload involving a lot of small-packet WebSocket traffic. We're noticing that the rx_prio0_discards counter continues to climb even after we've replaced the NIC and increased the RX ring buffer to its maximum of 8192 (ethtool output and the commands we used are shown below):

Ring parameters for enp65s0f1np1:
Pre-set maximums:
RX:             8192
RX Mini:        0
RX Jumbo:       0
TX:             8192
Current hardware settings:
RX:             8192
RX Mini:        0
RX Jumbo:       0
TX:             8192
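
For reference, these are roughly the commands we have been using to read the counter and raise the ring sizes (interface name as above; the exact set of counters reported depends on the mlx5 driver/firmware version):

# Read the discard/drop counters reported by the driver
ethtool -S enp65s0f1np1 | grep -Ei 'discard|drop|out_of_buffer'

# Show current and maximum ring sizes
ethtool -g enp65s0f1np1

# Raise the RX/TX rings to the advertised maximum
ethtool -G enp65s0f1np1 rx 8192 tx 8192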

Any ideas or suggestions here?

Hello Alexander,

Thank you for posting your question on the Mellanox Community.

When discards are increasing, there is a chance that the adapter or the host is not processing incoming packets quickly enough. We would advise tuning your system in accordance with our performance tuning guide, which can be found here:

https://community.mellanox.com/s/article/performance-tuning-for-mellanox-adapters
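
As a quick first check before working through the full guide, it can help to confirm whether the host's packet processing is keeping up. A minimal sketch, assuming a standard Linux install (adjust the interface and matching to your system):

# Per-CPU softirq statistics; the 2nd column counts packets dropped
# because the kernel backlog queue was full
cat /proc/net/softnet_stat

# irqbalance can move the NIC interrupts around and hurt cache locality
systemctl status irqbalance

# See which CPUs are servicing the adapter's interrupts
# (mlx5 interrupt names usually contain "mlx5")
grep mlx5 /proc/interrupts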

The following tuning guide may also help, as you are using an AMD architecture. Please note that it may not be 100% applicable, as it is directed more at AMD EPYC datacenter processors than at the Ryzen line.

https://support.mellanox.com/s/article/how-to-tune-an-amd-server--eypc-cpu--for-maximum-performance
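
Much of that guide comes down to keeping interrupt handling and the application on cores that are local to the adapter. A rough way to check the layout on your system (standard sysfs paths; Ryzen parts typically expose a single NUMA node, in which case locality is less of a concern):

# NUMA node the adapter is attached to (-1 or 0 on single-node systems)
cat /sys/class/net/enp65s0f1np1/device/numa_node

# CPUs considered local to the adapter
cat /sys/class/net/enp65s0f1np1/device/local_cpulist

# Overall NUMA layout of the host
numactl --hardware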

You may also be able to improve performance by enabling IOMMU passthrough. We have seen improvements in the past when using this option with AMD EPYC processors.

https://support.mellanox.com/s/article/understanding-the-iommu-linux-grub-file-configuration
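
As a sketch of what that change looks like on a GRUB-based distribution (the article above is the authoritative reference; the regeneration command depends on your distribution):

# /etc/default/grub -- add iommu=pt to the kernel command line
GRUB_CMDLINE_LINUX_DEFAULT="<existing options> iommu=pt"

# Then regenerate the GRUB configuration and reboot, for example:
#   update-grub                                   (Debian/Ubuntu)
#   grub2-mkconfig -o /boot/grub2/grub.cfg        (RHEL/CentOS)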

The following page may also help you interpret the individual counters:

https://community.mellanox.com/s/article/understanding-mlx5-ethtool-counters
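
For the counter you quoted, a simple way to see how quickly it is moving while you apply the tuning changes (interface and counter name taken from your output; the counters available vary with driver version):

# Print the per-second increase of rx_prio0_discards
prev=$(ethtool -S enp65s0f1np1 | awk '/rx_prio0_discards/ {print $2}')
while sleep 1; do
  cur=$(ethtool -S enp65s0f1np1 | awk '/rx_prio0_discards/ {print $2}')
  echo "rx_prio0_discards: +$((cur - prev))/s"
  prev=$cur
done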

If you need any further assistance, please open a case through the Mellanox support portal using an account with a valid support contract:

https://support.mellanox.com/s/

Thanks and regards,

~Mellanox Technical Support