How to find the maximum number of RX Queues for a NIC (ConnectX-5)?

Hi,

I am using DPDK with an mlx5 card (MCX515A-CCAT, single-port 100GbE).

I seem to be hitting a wall when trying to use more than 32 RX queues. I get good performance with 32 RX queues, but it drops very significantly when I go up to 36 queues.

Is there a limit of 32 RX Queues per port? I have not been able to find this in the documentation.

Is this limit configurable, or is there nothing I can do?

Would I be able to go higher with a ConnectX-6?

Thanks!

Hello Baptiste,

Thank you for posting your question on the Mellanox Community.

To help best answer your question, please answer the following questions:

  1. Which version of DPDK are you currently using?
  2. Which version of the Mellanox OFED are you using? You can check this with the command # ofed_info -s
  3. Which firmware version is your adapter running? To get your adapter's firmware version, run the following commands:

# mst start

# mst status

Then use the MST device from the output in the command # flint -d <mst_device> q

For example:

# flint -d /dev/mst/mt4119_pciconf0 q

Thanks and regards,

~Mellanox Technical Support

Hello Baptiste,

Thank you for posting your question on the Mellanox Community.

When using more than 32 RX queues on the NIC, the probability of a WQE miss on the RX buffer increases. To answer your question, this also applies to the ConnectX-6.
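As a side note on the question in your title: the maximum number of RX queues the PMD advertises for a port can be checked from DPDK itself, for example with testpmd's port info command (port 0 below is just a placeholder for your port id); the output should include a "Max possible RX queues" line.

testpmd> show port info 0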

To determine whether the performance decrease is due to hardware or software, you should check the out_of_buffer counter.

This counter counts the number of times the NIC wanted to scatter a packet but no receive WQE was available. When it stays at ~0, the software is not the bottleneck. You can find more information on the counters here:

https://community.mellanox.com/s/article/understanding-mlx5-ethtool-counters
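For example, assuming the kernel netdev of the port is enp3s0f0 (substitute your own interface name; depending on the driver version the counter may be reported as rx_out_of_buffer), you can read it with ethtool before and after a test run:

# ethtool -S enp3s0f0 | grep out_of_buffer

If the value does not increase during the test, the NIC always had a receive WQE available, which means the software posting RX buffers is not the bottleneck.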

This behavior can also be seen with fewer queues (up to 32) if the system is not tuned according to the benchmark reports, which can be found on the DPDK website (see the mlx5 NIC performance report for DPDK 20.11).

For best performance, please test using the settings documented in that report.

Another thing to note is that as of DPDK 18.05 and Mellanox OFED 4.3, support has been added for Multi-Packet Rx Queue (MPRQ, a.k.a. Striding RQ). MPRQ can further save PCIe bandwidth by posting a single large buffer for multiple packets: instead of posting one buffer per packet, one large buffer is posted to receive multiple packets. An MPRQ buffer consists of multiple fixed-size strides, and each stride receives one packet. MPRQ can improve throughput for small-packet traffic. You can test this feature for better performance by setting the parameter mprq_en=1 (an example command line follows the link below).

For more information on this parameter, please see section 32.5.3 on this page:

https://dpdk-power-docs.readthedocs.io/en/latest/nics/mlx5.html
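For example, with testpmd the parameter is passed as a device argument (the PCI address 0000:03:00.0 and the core/queue counts below are placeholders; depending on your DPDK version the device option is -w (whitelist, older releases) or -a (allow, newer releases)):

# testpmd -l 0-8 -n 4 -w 0000:03:00.0,mprq_en=1 -- --rxq=36 --txq=36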

You can also potentially improve performance further by improving the CQE compression ratio, using the following commands:

sudo mcra mlx5_0 0x815e0.0 0xcff0f3ff

sudo mcra mlx5_0 0x81600.0 0xcff0f3ff

sudo mcra mlx5_0 0x815e8.31 0

sudo mcra mlx5_0 0x81608.31 0

In the above commands, mlx5_0 is used as an example; you can get your actual RDMA device names by using the commands:

mst start

mst status -v

These settings remain active only until the machine is rebooted. Please make sure you have MFT installed (it is installed together with the Mellanox OFED) and that the CQE compression mode is set to AGGRESSIVE. You can set this with the command mlxconfig -d <PCIe_address> s CQE_COMPRESSION=1
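For example, reusing the MST device name from the flint example above (mlxconfig also accepts the PCIe address directly), you can query the current value and then set it:

# mlxconfig -d /dev/mst/mt4119_pciconf0 q | grep CQE_COMPRESSION

# mlxconfig -d /dev/mst/mt4119_pciconf0 s CQE_COMPRESSION=1

Note that, unlike the mcra settings above, a change made with mlxconfig only takes effect after a firmware reset or reboot.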

For general tuning recommendations with our adapters please see the following tuning guide:

https://community.mellanox.com/s/article/performance-tuning-for-mellanox-adapters

Further analysis of this would require an engineering investigation. If you wish to pursue this further, please open a support case (a valid support contract is required) through the Mellanox support portal found here:

https://support.mellanox.com/s/

Thanks and regards,

~Mellanox Technical Support

Hi Abigail,

Thanks a lot for your reply.

Why would the probability of a miss increase beyond 32 RX queues? And could that account for such a significant drop? Our performance drops by more than 20% when going from 32 queues to 33 queues.

I currently have no access to the benchmark machine, but I will check the out_of_buffer stats next week.

We have seen that MPRQ support was added. Unfortunately, we need the RSS hash result, and since we have compression enabled, it seems the hash is not fully supported together with MPRQ, so we have not run any tests with MPRQ yet. I will try to run a test with our benchmark code.

I will try to tune the compression ratios. We already have aggressive compression enabled.

I will go over the different documents and see if we can find something, but I have already been through them before and it seems our tuning is correct. I just can't seem to get above 88 Mpps.

Best regards

Baptiste

Hi Abigail,

Thanks for your answer

  1. I am using DPDK 19.11.

  2. We are using the in-tree drivers, not OFED.

  3. The firmware version is 16.20.1010.

Best regards

Baptiste