Hi,
I am working on a study that requires inserting more than 1 million flow rules per second into the NIC. I am running DPDK 19.11 on a ConnectX-5 NIC.
However, I have only managed to create around 3.3K rules per second. Below is the code I used to measure the insertion rate:
    uint16_t mask = UINT16_MAX;
    uint64_t timer_start = rte_get_tsc_cycles();
    for (int udp = 0; udp < num_rules; udp++) {
        // just a simple wrapper of rte_flow_validate() & rte_flow_create()
        // (Removing validation seems to have little impact on performance)
        // Each rule basically assigns udp packets with specific dst port value to a RX queue
        // 1st arg => NIC port
        // 2nd arg => Priority (This doesn't matter to insertion rate according to my observation)
        // 3rd arg => dst udp port spec
        // 4th arg => dst udp port mask
        // 5th arg => queue index
        generate_dst_udp_flow(0, 1, udp % UINT16_MAX, mask, udp % 12);
    }
    uint64_t timer_val = rte_get_tsc_cycles() - timer_start;
    // PRIu64 (from <inttypes.h>) matches the uint64_t result; %ld does not
    printf("[BENCH] Create %d udp flow takes %" PRIu64 " us\n", num_rules,
           timer_val * 1000000 / rte_get_tsc_hz());
With 60000 rules I got "[BENCH] Create 60000 udp flow takes 17821419 us". That is about 297 us per insertion, which is far too slow.
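To make the arithmetic behind these figures explicit, here is a minimal sketch of how the per-rule latency and insertion rate follow from the measured total. The helper names (us_per_rule, rules_per_sec, shortfall_factor) are hypothetical and only for illustration; they are not part of DPDK or of my benchmark.

```c
#include <stdint.h>

/* Hypothetical helpers (not part of DPDK): turn a measured batch time
 * into a per-rule latency and an insertion rate. */

/* Average time per inserted rule, in microseconds. */
static double us_per_rule(uint64_t total_us, uint64_t num_rules)
{
    return (double)total_us / (double)num_rules;
}

/* Insertion rate, in rules per second. */
static double rules_per_sec(uint64_t total_us, uint64_t num_rules)
{
    return (double)num_rules * 1e6 / (double)total_us;
}

/* Factor by which the measured rate falls short of a target rate. */
static double shortfall_factor(double measured_rate, double target_rate)
{
    return target_rate / measured_rate;
}
```

With total_us = 17821419 and num_rules = 60000, us_per_rule() gives about 297 us and rules_per_sec() gives about 3367 rules per second, so shortfall_factor() against a 1,000,000 rules per second target is roughly 300.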
According to the mlx5 PMD manual (http://doc.dpdk.org/guides-19.11/nics/mlx5.html), the insertion rate should be much higher:
“Flow insertion rate of more then million flows per second, when using Direct Rules.”
This was introduced in DPDK 19.05 (see the release notes at http://doc.dpdk.org/guides-19.05/rel_notes/release_19_05.html#new-features and the patch note at https://mails.dpdk.org/archives/dev/2019-February/125303.html).
Did I miss anything? How can I get the promised 1 million flows per sec?
My setup is as follows:
- CPU: E5-2697 v3 (14 cores, SMT disabled, CPU frequency fixed @ 2.6 GHz)
- NIC: Mellanox MCX515A-CCAT (installed on PCIe Gen3 x16)
- DPDK: 19.11
- OFED: 4.7-3.2.9.0 with upstream libs (I also tried standalone RDMA-CORE: v28.0 instead of the one in OFED but got similar results)
- Kernel: 4.15
- OS: Ubuntu 18.04
- Firmware: 16.26.1040
The firmware, driver, and DPDK are tuned as described in the Mellanox NIC performance report (http://fast.dpdk.org/doc/perf/DPDK_19_11_Mellanox_NIC_performance_report.pdf).
Any feedback would be much appreciated.
Thanks,
Lei