ConnectX-2: Enabling RSS (Receive Side Scaling) in IPoIB mode

Hello, I’m using Mellanox ConnectX-2 40Gb/s adapters to filter IP traffic on an HP 4xOpteron server. The server receives traffic via an HP InfiniBand-only switch, so using mlx4_en is not an option.

I’m aiming to handle up to 10Gb/s of traffic. Currently, in ‘connected’ mode, throughput hits a wall at ~6Gb/s as soon as I enable a netfilter rule that returns NF_ACCEPT immediately. It drops further as soon as I do more processing (inspecting payloads, distributing to workqueues, queueing to user mode, etc.), and ‘top’ shows only 1-2 CPUs in use.
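One way to put numbers on the "only 1-2 CPUs" observation is to sample the per-CPU NET_RX counters in /proc/softirqs before and after a test run. The following is a minimal Python sketch, not part of the original setup; the function names and the 5% threshold are assumptions chosen for illustration.

```python
import time


def net_rx_per_cpu(softirqs_text):
    """Parse /proc/softirqs text and return {cpu_index: NET_RX count}."""
    lines = [l for l in softirqs_text.splitlines() if l.strip()]
    # The header row lists the CPUs, e.g. "CPU0 CPU1 ...".
    cpus = [int(name[3:]) for name in lines[0].split()]
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        if label.strip() == "NET_RX":
            counts = [int(v) for v in rest.split()]
            return dict(zip(cpus, counts))
    raise ValueError("no NET_RX row found")


def busy_cpus(before, after, threshold=0.05):
    """Return the CPUs that took more than `threshold` of the new NET_RX work.

    The 5% default is an arbitrary cut-off (assumption), not a kernel value.
    """
    deltas = {cpu: after[cpu] - before.get(cpu, 0) for cpu in after}
    total = sum(deltas.values())
    if total == 0:
        return []
    return sorted(cpu for cpu, d in deltas.items() if d / total > threshold)


def sample(interval=5.0, path="/proc/softirqs"):
    """Read the counters twice, `interval` seconds apart, while traffic flows."""
    with open(path) as f:
        before = net_rx_per_cpu(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = net_rx_per_cpu(f.read())
    return busy_cpus(before, after)
```

Calling `sample()` during a forwarding test returns the list of CPUs that actually handled receive softirqs; a list with one or two entries would match what ‘top’ suggests.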

When just routing IP traffic (without filters), reaching 10Gb/s is not a problem (top shows 2-3% CPU load). I used the standard MLNX_OFED 1.5.3 (rhel-6.2-amd64).

To find out whether there is additional bandwidth that Linux can’t handle, I ran a quick test on Windows Server 2008 R2: the Windows drivers (MLNX_VPI_WinOF-4.2 from the HP website) handled around 14Gb/s, and processing was distributed across at least 16 CPUs (according to taskmgr).

My questions are:

  • Can the mlx4_ib driver take advantage of hardware queue support (RSS)? Apparently that is what makes the difference on Windows.

  • If RSS is available, how can I enable it? I’ve tried setting the interrupt affinity, disabling CPU scaling, and enabling RSS, RPS and RFS, with no success so far (following the steps from the Mellanox performance tuning PDF and linux/Documentation/networking/scaling.txt).
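For reference, the interrupt affinity and RPS settings mentioned above both take a hexadecimal CPU bitmask, written in comma-separated 32-bit groups when there are more than 32 CPUs. The sketch below shows how such masks can be built and spread over a set of IRQs. It is only an illustration: the helper names and the IRQ numbers are hypothetical, and whether IPoIB on this driver release exposes more than one receive queue is not established in this thread.

```python
def cpu_mask(cpus):
    """Build the hex bitmask string used by smp_affinity and rps_cpus."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu
    # Split into 32-bit groups, least significant first.
    groups = []
    while True:
        groups.append(mask & 0xFFFFFFFF)
        mask >>= 32
        if mask == 0:
            break
    groups.reverse()
    # Most significant group unpadded, the rest zero-padded to 8 hex digits.
    head = format(groups[0], "x")
    tail = [format(g, "08x") for g in groups[1:]]
    return ",".join([head] + tail)


def plan_irq_affinity(irqs, cpus):
    """Assign each IRQ to one CPU, round-robin, and return {irq: mask}."""
    return {irq: cpu_mask([cpus[i % len(cpus)]]) for i, irq in enumerate(irqs)}


def write_mask(path, mask):
    """Write a mask to a path such as /proc/irq/<n>/smp_affinity (needs root)."""
    with open(path, "w") as f:
        f.write(mask + "\n")
```

For example, `cpu_mask(range(16))` yields the mask for CPUs 0-15, which could be written to /sys/class/net/ib0/queues/rx-0/rps_cpus (with `ib0` standing in for the real interface name), and `plan_irq_affinity([90, 91, 92], [0, 4])` maps three made-up IRQ numbers onto CPUs 0 and 4.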



Check our performance tuning guide, it should help:

MOFED 2.0 includes IPoIB RSS support for datagram mode.

Try to upgrade your software to MOFED 2.0, then check your CPU utilization when running IPoIB datagram (echo datagram > /sys/class/net/ibX/mode).
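The mode switch above can also be scripted, which helps when comparing datagram and connected mode across several test runs. This is a small sketch that assumes the same sysfs file as the echo command; the function names are made up for illustration and `ib0` is a placeholder for the real interface.

```python
VALID_MODES = ("datagram", "connected")


def ipoib_mode_path(iface, sysfs_root="/sys"):
    """Return the sysfs file that holds the IPoIB mode of `iface`."""
    return f"{sysfs_root}/class/net/{iface}/mode"


def get_ipoib_mode(iface, sysfs_root="/sys"):
    """Read the current mode string from sysfs."""
    with open(ipoib_mode_path(iface, sysfs_root)) as f:
        return f.read().strip()


def set_ipoib_mode(iface, mode, sysfs_root="/sys"):
    """Write the mode; equivalent to the echo command above (needs root)."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}")
    with open(ipoib_mode_path(iface, sysfs_root), "w") as f:
        f.write(mode + "\n")
```

A test script could call `set_ipoib_mode("ib0", "datagram")`, run the traffic test, then check `get_ipoib_mode("ib0")` to confirm the setting was applied.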

Thanks for the link; I hadn’t noticed MLNX_OFED 2.0 was already out. I went through the tuning steps from the MLNX_OFED-1.5.3 release and got the numbers above (11.4Gb/s for plain IP forwarding, 6Gb/s for forwarding plus a netfilter kernel hook that ACCEPTs every packet).

I noticed a new mlnx_affinity script was introduced in MLNX_OFED 2. If it does something different from what I can do via /proc/irq/x/smp_affinity, I will give it a try.

Note that I specified that I can’t use mlnx_en since I have an IB-only switch. My question remains: are hardware queues available when running in IPoIB mode? The 2.0 release notes mention that “Flow Steering for Ethernet and InfiniBand” was introduced. Should I take this to mean ‘yes’?



We have, and it’s a lot faster, but I don’t see how using SDP on a router that forwards IP traffic would work.

UDP hardware queues work in ConnectX-2, but there are more flow steering / RSS features in ConnectX-3.

Have you looked at SDP, to avoid TCP altogether?

It worked great for me a year or two ago.