Hi everyone,
I’m trying to enable RoCEv2 in ECMP mode on a dual-port ConnectX-7 and I’m hitting a permission error on the documented sysfs knob.
Environment
- NIC: ConnectX-7 (dual-port, same PF), PCI: 0001:41:00.0
- OS: Ubuntu 24.04 (kernel 6.8)
- Stack: DOCA-OFED 3.1.0
- Driver: mlx5_core (from DOCA-OFED 3.1.0)
- Firmware: 28.46.1006 (MT_0000000892)
- Netdevs: eno16795np0 (port1), eno16805np1 (port2)
RoCE state looks good
devlink dev param show pci/0001:41:00.0 name enable_roce
# -> cmode driverinit value true
cat /sys/class/infiniband/mlx5_0/ports/1/link_layer
# -> Ethernet
show_gids
# shows IPv4 GIDs with "v2" for both ports (e.g., index 3)
cma_roce_mode
# -> RoCE v2
Issue (ECMP):
Following the official doc’s ECMP steps, I tried to write to the ECMP sysfs attribute:
echo ndev > /sys/class/net/eno16795np0/device/roce_lag_ecmp_dev
bash: /sys/class/net/eno16795np0/device/roce_lag_ecmp_dev: Permission denied
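A minimal sketch of a check that might help separate a plain privilege problem from a read-only attribute. It only inspects the mode bits of the path from the failing command and prints the effective UID. `attr_write_bits` is a hypothetical helper name, and the interpretation rests on an assumption not verified on this stack: that sysfs rejects a write-open on an attribute with no write bits at all, even for root.

```shell
#!/usr/bin/env bash
# attr_write_bits is a hypothetical helper name, not part of any NVIDIA tool.
# It prints whether a path exists and whether its mode has any write bit.
attr_write_bits() {
    local attr="$1"
    if [ ! -e "$attr" ]; then
        echo "missing"
        return 0
    fi
    # The first 10 characters of `ls -ld` are the type and permission string.
    local perms
    perms=$(ls -ld "$attr" | cut -c1-10)
    case "$perms" in
        *w*) echo "has-write-bit ($perms)" ;;
        *)   echo "no-write-bit ($perms)" ;;
    esac
}

# Path taken from the failing command above; override with the first argument.
attr_write_bits "${1:-/sys/class/net/eno16795np0/device/roce_lag_ecmp_dev}"
echo "effective uid: $(id -u)"
```

If the attribute has a write bit and the write was issued as a regular user, note that in `sudo echo ndev > <path>` the redirection is still performed by the unprivileged shell; `echo ndev | sudo tee <path>` avoids that. If there is no write bit at all, the attribute is presumably read-only in this driver build.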
Doc I’m following (RoCE LAG ECMP section):
Questions
- Is …/device/roce_lag_ecmp_dev still the supported interface for PF RoCE ECMP on ConnectX-7 with DOCA-OFED 3.1.0, or has it been deprecated/locked down in favor of a devlink (or DOCA) API?
- Are there prerequisites that would cause “Permission denied” here?
  - e.g., SR-IOV must be disabled on the PF (sriov_numvfs=0),
  - a parent netdev must exist and be UP (dummy/bond carrying the IP),
  - specific FW knobs (e.g., LAG_RESOURCE_ALLOCATION) need to be set,
  - both ports must be from the same PF/function, etc.
- If that sysfs path is intentionally read-only on this stack, what’s the recommended way to achieve ECMP for PF-RDMA on CX7 today?
  - Keep using the sysfs path with a parent dev (if supported),
  - Or use a devlink/DOCA method?
Any pointers to an up-to-date example (commands) for CX7 RoCEv2 ECMP on DOCA-OFED would be much appreciated. Also, tips for validating the resulting mlx5_bond_0 and checking LAG state in debugfs would help.
Thanks in advance! Happy to provide more logs (ethtool/devlink/mlxconfig/debugfs) if needed.
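For anyone who wants to compare against their own setup, here is a rough sketch of how the basic state could be collected in one go. `run_if_present` is a hypothetical helper, the PCI address and netdev name are the ones from the environment list above, and the selection of commands is only a guess at what is relevant to the ECMP attribute.

```shell
#!/usr/bin/env bash
# run_if_present is a hypothetical helper: it runs a command only when the
# tool is installed, and never aborts the collection on failure.
run_if_present() {
    echo "### $*"
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "skipped: $1 not installed"
        return 0
    fi
    "$@" 2>&1 || echo "(command exited non-zero)"
}

# Values from the environment described in this post.
PCI="0001:41:00.0"
NETDEV="eno16795np0"

run_if_present ethtool -i "$NETDEV"
run_if_present devlink dev param show "pci/$PCI" name enable_roce
run_if_present cat "/sys/class/net/$NETDEV/device/sriov_numvfs"
run_if_present ls -l "/sys/class/net/$NETDEV/device/roce_lag_ecmp_dev"
```

Each section is prefixed with the command that produced it, so the output can be pasted into a reply as-is.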