DPDK multicast loopback to source port (blocking it)

Hi,

I am using DPDK 16.11 with Mellanox 4.0-1.5.2.0 kernel drivers on a Linux 3.4 kernel (also tried a 4.4 kernel),

in a VM running on QEMU/KVM with PCI passthrough.

ConnectX-3 Pro EN

Firmware: 2.40.5000

Board ID: MT_1060111023

ConnectX-4 EN

Firmware: 12.18.2000

Board ID: MT_2150110033

I have both the ConnectX-3 and ConnectX-4 DPDK drivers working but one major issue for our software is that

multicast traffic is getting looped back to the same port that sends the packets.

The ConnectX-3 (mlx4) driver seems to have issues when traffic is sent from a SR-IOV VF.

Multicast traffic from the PF interface does not have any issues.

With the ConnectX-4 (mlx5) driver, both the PF and VF have issues when sending multicast traffic from secondary processes.

I have also seen the issue happen with multiple TX queues in the primary process, but it is not easy to reproduce

with only a primary process.

The ConnectX-3 issue is mentioned in the Mellanox DPDK release notes, but there is no issue mentioned for ConnectX-4.

Is there any way to prevent this from happening?

If we have to filter packets in software this will be a performance hit.

I tried modifying the DPDK/kernel driver code for the ConnectX-3 to enable the IBV block-multicast-loopback option,

but it wasn't working reliably; it sometimes stopped working after rebooting the VMs.

The ConnectX-4 doesn't seem to have an explicit option to block multicast loopback traffic,

so it is not clear why secondary processes are having issues.

Thanks,

Charles

Hello Charles,

Could you clarify additional details of the issue, please?

What are the secondary and primary processes? Are both processes on the same PF/VF, or is one on the PF and the other on a VF?

What is the application flow?

What kind of multicast address?

What is the way to reproduce the issue?

Does it happen on both 32-bit and 64-bit, or only on 64-bit?

Does the issue happen with the latest 4.0-2 version of MOFED, DPDK 16.11.2_3, and the latest 16.19.1200 firmware?

The primary/secondary processes use the same interface.

Secondary process(es) are used for transmit. The primary process is used for transmit and receive.

The issue is reproducible with any multicast/broadcast mac address.

The issue happens w/ both 32-bit and 64-bit applications.

We are using the latest ConnectX-3 and ConnectX-4 firmware (2.40.5030 and 12.18.2000) as recommended in the Mellanox DPDK release notes.

We don’t have a ConnectX-5 (16.19.1200).

We are still using the 4.0-1.5.2.0 kernel drivers mentioned in the original post, which were the latest drivers at the time (3/16).

It looks like 4.0-2 was only released at the end of March and hasn't even been pushed to your GitHub repo yet.

I have attached a sample app which reproduces the issue with the broadcast MAC FF:FF:FF:FF:FF:FF.

Launch two instances of the application to reproduce the issue.

The first instance is launched as primary and sends a packet and then continuously reads the rx ring.

./build/app/mlx-mcast-test --proc-type primary -c 0x1

The second process is launched as secondary and sends a packet and exits.

./build/app/mlx-mcast-test --proc-type secondary -c 0x2

All packets which are transmitted and received are printed out.

The multicast issue can be observed when the secondary process sends a packet and the primary process receives it.