RoCE + Switchdev?

Side note: this is on the Linux inbox driver, not OFED, but the cards in question are quite old, so it's more of a community question.

I have a home test lab and am using an older ConnectX-4 adapter.
I set the RoCE mode to v1, since that is all I need and I also have some devices using older ConnectX-3 cards.

echo 'IB/RoCE v1' > /sys/kernel/config/rdma_cm/$dev/ports/1/default_roce_mode
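
For reference, the full sequence I use looks roughly like this (the configfs entry has to be created first; mlx5_0 is just an example RDMA device name, substitute your own):
mkdir -p /sys/kernel/config/rdma_cm/mlx5_0
echo 'IB/RoCE v1' > /sys/kernel/config/rdma_cm/mlx5_0/ports/1/default_roce_mode
cat /sys/kernel/config/rdma_cm/mlx5_0/ports/1/default_roce_mode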

I notice if I put the ConnectX 4 card in “switchdev” mode
devlink dev eswitch set pci/0000:06:00.0 mode switchdev
the default_roce_mode entry in the kernel configfs becomes unavailable.
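
For what it's worth, the eswitch mode can be confirmed with devlink (same PCI address as above):
devlink dev eswitch show pci/0000:06:00.0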

I suppose the end goal was to create a vSwitch with Open vSwitch on one of the representor devices, for use with the Incus/LXD hypervisor.
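
Roughly what I was aiming for, as a sketch (the bridge name and the representor netdev name enp6s0f0npf0vf0 are just examples; representor names will differ per system, and OVS needs a restart after enabling hw-offload):
ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
ovs-vsctl add-br br-ovs
ovs-vsctl add-port br-ovs enp6s0f0np0
ovs-vsctl add-port br-ovs enp6s0f0npf0vf0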

The workaround I found on the ConnectX-4 is to use a software bridge on a host-probed VF device and set trust on:
ip link set dev enp6s0f0np0 vf 0 trust on
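
And then the bridge side of it, for completeness (the VF netdev name enp6s0f0v0 is just what it comes up as on my box; yours may differ):
echo 1 > /sys/class/net/enp6s0f0np0/device/sriov_numvfs
ip link add br0 type bridge
ip link set dev enp6s0f0v0 master br0
ip link set dev enp6s0f0v0 up
ip link set dev br0 up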

On a ConnectX-3 card, the VFs don't seem to want to pass traffic that isn't from their own MAC, and there is no trust option.

Is it expected that the kernel configfs option goes away when switchdev is turned on?

I was just testing NVMe-oF + RDMA + some hypervisor stuff at home. Professionally I've only ever worked with Fibre Channel. I think I want to recommend RoCE to syseng teams if I ever work in a new environment.

Does turning on switchdev on newer devices, like ConnectX-6, -7, or -8, impact RDMA controls over the base PF in Linux?
Is it normal to have the hypervisor run a vSwitch bridge on the same port as storage traffic in a production enterprise environment?

The solution to much of this, even at home, would be to use separate cards or ports for vSwitching and hypervisor storage. That would consume more switch ports, though.

Or is this a bug? I'm not sure whether the configfs option going away in switchdev mode is intentional.

I'm basically curious whether, on newer hardware, it's normal practice to run the vSwitch and hypervisor storage over the same physical ports and links with VFs, or whether there are limitations there too.

Coming from healthcare IT, where it is common to see 16 Gbit FC storage links and 10 Gbit networking, and where the fabric genuinely struggles, even these old cards at 40 Gbit are mind-blowingly fast. The VFs pass through to VMs fine, and iSCSI/iSER or NVMe-oF over RDMA works fine. I think we're missing out in this industry.

For ConnectX-4, in switchdev mode, the kernel disables RDMA CM support for the base Physical Function (PF). This is expected.

This is consistent across the mlx5 adapters (ConnectX-4, -5, -6, and -7).

Please use RDMA CM on Virtual Functions (VFs) instead of the PF when in switchdev mode.
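
As a minimal sketch of that, assuming the PF netdev is enp6s0f0np0 and the VF's RDMA device enumerates as mlx5_1 (check the actual name with rdma link show):
echo 1 > /sys/class/net/enp6s0f0np0/device/sriov_numvfs
rdma link show
mkdir -p /sys/kernel/config/rdma_cm/mlx5_1
echo 'IB/RoCE v1' > /sys/kernel/config/rdma_cm/mlx5_1/ports/1/default_roce_mode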

The storage traffic can run over the same port with RDMA.