ConnectX-2: Enabling RSS (Receive Side Scaling) in IPoIB mode

Hello, I’m using Mellanox ConnectX-2 40Gb/s adapters for filtering IP traffic on a quad-Opteron HP server. The server receives traffic via an HP InfiniBand-only switch, so using mlx4_en is not an option.

I’m aiming to handle up to 10Gb/s of traffic. Currently throughput hits a wall at ~6Gb/s as soon as I enable a netfilter hook that returns NF_ACCEPT immediately (in ‘connected’ mode). It drops further as soon as I do more processing (inspecting payloads, distributing to workqueues, queueing to user mode, etc.), and ‘top’ shows only 1-2 CPUs in use.

When just routing IP traffic (without filters), reaching 10Gb/s is not a problem (top shows 2-3% CPU load). I used the standard MLNX_OFED 1.5.3 (rhel-6.2-amd64).

To find out whether there is additional bandwidth that Linux can’t handle, I ran a quick test on Windows Server 2008 R2: the Windows drivers (MLNX_VPI_WinOF-4.2 from the HP website) handled around 14Gb/s, with processing distributed across at least 16 CPUs (according to Task Manager).

My questions are:

  • can the mlx4_ib driver take advantage of the hardware receive queues (RSS)? Apparently that is what makes the difference on Windows

  • if RSS is available, how can I enable it? So far I’ve tried setting interrupt affinity, disabling CPU frequency scaling, and enabling RSS, RPS and RFS, with no success (following the steps in the Mellanox performance tuning PDF and linux/Documentation/networking/scaling.txt); the commands I used are sketched below
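For reference, this is roughly what I tried, sketched from memory (the IRQ numbers and the ib0 name are placeholders; the real IRQs come from /proc/interrupts):

    # stop irqbalance so manual pinning isn’t overwritten
    service irqbalance stop

    # pin each mlx4 completion-vector IRQ to its own CPU
    # (the value is a hex CPU bitmask: 1 = CPU0, 2 = CPU1, 4 = CPU2, ...)
    echo 1 > /proc/irq/54/smp_affinity
    echo 2 > /proc/irq/55/smp_affinity
    echo 4 > /proc/irq/56/smp_affinity

    # software fallback from scaling.txt: spread ib0 receive processing
    # across CPUs 0-15 with RPS
    echo ffff > /sys/class/net/ib0/queues/rx-0/rps_cpus

    # and enable RFS with a global flow table
    echo 32768 > /proc/sys/net/core/rps_sock_flow_entries
    echo 2048 > /sys/class/net/ib0/queues/rx-0/rps_flow_cnt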

Thanks,

Bogdan

Thanks so much for the question. Let me see if I can find the right person to look at this.

Bogdan,

Check our performance tuning guide; it should help: http://www.mellanox.com/related-docs/prod_software/Performance_Tuning_Guide_for_Mellanox_Network_Adapters_v1.7.pdf

If that doesn’t give you pointers, let me know. I’ll continue to help you.

MOFED 2.0 includes IPoIB RSS support for datagram mode.

Try upgrading to MLNX_OFED 2.0, then check your CPU utilization when running IPoIB in datagram mode (echo datagram > /sys/class/net/ibX/mode).
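Something along these lines, assuming your interface is ib0 (adjust the name to match your setup):

    # switch the IPoIB interface to datagram mode and verify
    echo datagram > /sys/class/net/ib0/mode
    cat /sys/class/net/ib0/mode

    # with IPoIB RSS active you should see several mlx4 completion
    # vectors, each taking a share of the receive interrupts
    grep mlx4 /proc/interrupts

    # watch whether the softirq load now spreads across CPUs
    # (mpstat is in the sysstat package)
    mpstat -P ALL 1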

Thanks for the link; I didn’t notice MLNX_OFED 2.0 was already out. I went through the tuning steps from the MLNX_OFED 1.5.3 release and got the numbers above (11.4Gb/s for plain IP forwarding, 6Gb/s for forwarding plus a netfilter kernel hook that ACCEPTs every packet).

I noticed a new mlnx_affinity script was introduced in MLNX_OFED 2.0; if it does something different from what I can already do via /proc/irq/x/smp_affinity, I will give it a try.
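For comparison, something like this should show whether the script lays out the IRQs differently from my manual pinning (a sketch; I’m assuming mlnx_affinity takes a ‘start’ argument as described in the tuning guide):

    # let the Mellanox script distribute the IRQs
    mlnx_affinity start

    # then dump the resulting affinity mask of every mlx4 IRQ
    for irq in $(awk '/mlx4/ {sub(":","",$1); print $1}' /proc/interrupts); do
        printf "IRQ %s -> " "$irq"
        cat /proc/irq/$irq/smp_affinity
    done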

Note that I specified that I can’t use mlx4_en since I have an IB-only switch. My question remains: are hardware queues available when running in IPoIB mode? The 2.0 release notes mention that “Flow Steering for Ethernet and InfiniBand” was introduced. Should I take this to mean ‘yes’?

Thanks

Bogdan

UDP hardware queues do work on ConnectX-2, but ConnectX-3 adds more flow steering / RSS features.

Have you looked at SDP, to avoid using TCP altogether? It worked great for me a year or two ago.

We have, and it is a lot faster, but I don’t see how SDP would help on a router that forwards IP traffic.
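As for the hardware queues, here is the quick check I plan to run after upgrading (assuming my interface is ib0 and that the driver registers its receive rings with the networking core so they show up in sysfs):

    # count the receive queues the driver exposes
    ls /sys/class/net/ib0/queues/

    # count the mlx4 completion vectors actually firing
    grep mlx4 /proc/interrupts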