CX-4 Ethernet (CX416A) RHEL7 inbox, cannot get RoCE to initialize. Anyone have a good how-to?

I have a CentOS 7.6 environment with a CX4 Ethernet-only card, using the MLNX 4.6-1.0.1 card drivers and inbox rdma.

The CX4 is functioning fine in Ethernet mode, has an ipv4 address, passes traffic, no problem.

I set “options roce_mode=2” in /etc/modprobe.d/mlx4.conf and verified after module load that /sys/module/mlx4_core/parameters/roce_mode reads “2”.

When I start the inbox rdma services, the other non-Mellanox interfaces load their RoCE modules and appear functional in ibstatus. The Mellanox interface does not.

I can see that the mlx4_core, mlx5_core, mlx4_ib and mlx5_ib modules are loaded, but the _ib modules are not bound to any interfaces.

If I try to create their instances in configfs it fails.
mkdir /sys/kernel/config/rdma_cm/mlx5_0/
mkdir: cannot create directory ‘/sys/kernel/config/rdma_cm/mlx5_0/’: No such device

mkdir /sys/kernel/config/rdma_cm/mlx4_0
mkdir: cannot create directory ‘/sys/kernel/config/rdma_cm/mlx4_0’: No such device
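
For anyone hitting the same "No such device" error: as far as I can tell, the configfs mkdir only succeeds when an RDMA device with that exact name is already registered, so it is worth checking what is registered and which kernel driver owns the card. The following is a rough sketch, not a definitive diagnostic. The SYSFS_ROOT variable and the two function names are my own inventions for illustration, and the 0000 PCI domain prefix is an assumption about this system.

```shell
#!/bin/sh
# Sketch: list registered RDMA devices and show which kernel driver is bound
# to a PCI function. SYSFS_ROOT is a hypothetical override (handy for testing);
# it defaults to the real /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

list_rdma_devices() {
    # Every registered RDMA device shows up as an entry under class/infiniband.
    for d in "$SYSFS_ROOT"/class/infiniband/*; do
        [ -e "$d" ] && basename "$d"
    done
    return 0
}

bound_driver() {
    # $1 is a full PCI address, e.g. 0000:18:00.0 (domain 0000 is assumed).
    link="$SYSFS_ROOT/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The driver symlink points at the bound driver's directory.
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

list_rdma_devices
bound_driver "0000:18:00.0"
```

If the card's PCI function reports mlx5_core as its driver but no mlx5_* entry is listed among the RDMA devices, that would match the symptoms described above.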

The two onboard Intel i40e interfaces initialize immediately, load the i40iw kernel module and appear as RDMA interfaces in ibstatus when I start the inbox RDMA services, so I know the RDMA services environment is functional. For some reason the CX416A will not come up as RDMA. Is there anything specific that has to be set on the CX416A with mstconfig? Is there a NUM_OF_VFS requirement, or another card NVRAM setting that is required?

Does anyone have a how-to or know the magic incantation to get a CX416A to function in RoCE mode?

Thanks

Card NVRAM:

mstconfig -d 18:00.0 q

Device #1:
----------
Device type:  ConnectX4
Name:      MCX416A-BCA_Ax
Description:  ConnectX-4 EN network interface card; 40GbE dual-port QSFP28; PCIe3.0 x16; ROHS R6
Device:     18:00.0

Configurations:               Next Boot
     MEMIC_BAR_SIZE           0
     MEMIC_SIZE_LIMIT          _256KB(1)
     FLEX_PARSER_PROFILE_ENABLE     0
     FLEX_IPV4_OVER_VXLAN_PORT      0
     ROCE_NEXT_PROTOCOL         254
     NON_PREFETCHABLE_PF_BAR       False(0)
     STRICT_VF_MSIX_NUM         False(0)
     VF_NODNIC_ENABLE          False(0)
     NUM_OF_VFS             8
     FPP_EN               True(1)
     SRIOV_EN              True(1)
     PF_LOG_BAR_SIZE           5
     VF_LOG_BAR_SIZE           0
     NUM_PF_MSIX             63
     NUM_VF_MSIX             11
     INT_LOG_MAX_PAYLOAD_SIZE      AUTOMATIC(0)
     PARTIAL_RESET_EN          False(0)
     SW_RECOVERY_ON_ERRORS        False(0)
     RESET_WITH_HOST_ON_ERRORS      False(0)
     CQE_COMPRESSION           BALANCED(0)
     IP_OVER_VXLAN_EN          False(0)
     UCTX_EN               True(1)
     PCI_ATOMIC_MODE           PCI_ATOMIC_DISABLED_EXT_ATOMIC_ENABLED(0)
     LRO_LOG_TIMEOUT0          6
     LRO_LOG_TIMEOUT1          7
     LRO_LOG_TIMEOUT2          8
     LRO_LOG_TIMEOUT3          13
     LOG_DCR_HASH_TABLE_SIZE       14
     DCR_LIFO_SIZE            16384
     ROCE_CC_PRIO_MASK_P1        255
     ROCE_CC_ALGORITHM_P1        ECN(0)
     ROCE_CC_PRIO_MASK_P2        255
     ROCE_CC_ALGORITHM_P2        ECN(0)
     CLAMP_TGT_RATE_AFTER_TIME_INC_P1  True(1)
     CLAMP_TGT_RATE_P1          False(0)
     RPG_TIME_RESET_P1          300
     RPG_BYTE_RESET_P1          32767
     RPG_THRESHOLD_P1          1
     RPG_MAX_RATE_P1           0
     RPG_AI_RATE_P1           5
     RPG_HAI_RATE_P1           50
     RPG_GD_P1              11
     RPG_MIN_DEC_FAC_P1         50
     RPG_MIN_RATE_P1           1
     RATE_TO_SET_ON_FIRST_CNP_P1     0
     DCE_TCP_G_P1            1019
     DCE_TCP_RTT_P1           1
     RATE_REDUCE_MONITOR_PERIOD_P1    4
     INITIAL_ALPHA_VALUE_P1       1023
     MIN_TIME_BETWEEN_CNPS_P1      0
     CNP_802P_PRIO_P1          6
     CNP_DSCP_P1             48
     CLAMP_TGT_RATE_AFTER_TIME_INC_P2  True(1)
     CLAMP_TGT_RATE_P2          False(0)
     RPG_TIME_RESET_P2          300
     RPG_BYTE_RESET_P2          32767
     RPG_THRESHOLD_P2          1
     RPG_MAX_RATE_P2           0
     RPG_AI_RATE_P2           5
     RPG_HAI_RATE_P2           50
     RPG_GD_P2              11
     RPG_MIN_DEC_FAC_P2         50
     RPG_MIN_RATE_P2           1
     RATE_TO_SET_ON_FIRST_CNP_P2     0
     DCE_TCP_G_P2            1019
     DCE_TCP_RTT_P2           1
     RATE_REDUCE_MONITOR_PERIOD_P2    4
     INITIAL_ALPHA_VALUE_P2       1023
     MIN_TIME_BETWEEN_CNPS_P2      0
     CNP_802P_PRIO_P2          6
     CNP_DSCP_P2             48
     LLDP_NB_DCBX_P1           False(0)
     LLDP_NB_RX_MODE_P1         OFF(0)
     LLDP_NB_TX_MODE_P1         OFF(0)
     LLDP_NB_DCBX_P2           False(0)
     LLDP_NB_RX_MODE_P2         OFF(0)
     LLDP_NB_TX_MODE_P2         OFF(0)
     DCBX_IEEE_P1            True(1)
     DCBX_CEE_P1             True(1)
     DCBX_WILLING_P1           True(1)
     DCBX_IEEE_P2            True(1)
     DCBX_CEE_P2             True(1)
     DCBX_WILLING_P2           True(1)
     KEEP_ETH_LINK_UP_P1         True(1)
     KEEP_IB_LINK_UP_P1         False(0)
     KEEP_LINK_UP_ON_BOOT_P1       False(0)
     KEEP_LINK_UP_ON_STANDBY_P1     False(0)
     KEEP_ETH_LINK_UP_P2         True(1)
     KEEP_IB_LINK_UP_P2         False(0)
     KEEP_LINK_UP_ON_BOOT_P2       False(0)
     KEEP_LINK_UP_ON_STANDBY_P2     False(0)
     NUM_OF_VL_P1            _4_VLs(3)
     NUM_OF_TC_P1            _8_TCs(0)
     NUM_OF_PFC_P1            8
     NUM_OF_VL_P2            _4_VLs(3)
     NUM_OF_TC_P2            _8_TCs(0)
     NUM_OF_PFC_P2            8
     DUP_MAC_ACTION_P1          LAST_CFG(0)
     SRIOV_IB_ROUTING_MODE_P1      LID(1)
     IB_ROUTING_MODE_P1         LID(1)
     DUP_MAC_ACTION_P2          LAST_CFG(0)
     SRIOV_IB_ROUTING_MODE_P2      LID(1)
     IB_ROUTING_MODE_P2         LID(1)
     PCI_WR_ORDERING           per_mkey(0)
     MULTI_PORT_VHCA_EN         False(0)
     PORT_OWNER             True(1)
     ALLOW_RD_COUNTERS          True(1)
     RENEG_ON_CHANGE           True(1)
     TRACER_ENABLE            True(1)
     IP_VER               IPv4(0)
     BOOT_UNDI_NETWORK_WAIT       0
     UEFI_HII_EN             False(0)
     BOOT_DBG_LOG            False(0)
     UEFI_LOGS              DISABLED(0)
     BOOT_VLAN              1
     LEGACY_BOOT_PROTOCOL        PXE(1)
     BOOT_RETRY_CNT           NONE(0)
     BOOT_LACP_DIS            True(1)
     BOOT_VLAN_EN            False(0)
     BOOT_PKEY              0
     EXP_ROM_UEFI_ARM_ENABLE       False(0)
     EXP_ROM_UEFI_x86_ENABLE       False(0)
     EXP_ROM_PXE_ENABLE         True(1)
     ADVANCED_PCI_SETTINGS        False(0)
     SAFE_MODE_THRESHOLD         10
     SAFE_MODE_ENABLE          True(1)

Hello @jeff.johnson,

Thank you for posting your query on our community. Please note that we provide support for MLNX_OFED drivers. If using inbox drivers, you will need to reach out to the OS vendor for further assistance.

ConnectX-4 uses the mlx5 driver. Our mlx5 driver for ConnectX-4/5 supports both RoCEv1 and RoCEv2. By default, RDMA_CM used with the mlx5 driver on ConnectX-4/5 uses RoCEv2.
To change the default RoCE mode for RDMA_CM, use the cma_roce_mode command.

For example, to check the default RoCE mode:

cma_roce_mode -d mlx5_0 -p 1

IB/RoCE V1

To set the default RoCE mode, use the -m parameter (here -m 2 selects RoCE v2):

cma_roce_mode -d mlx5_0 -p 1 -m 2

RoCE V2
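
Where the cma_roce_mode script is not installed, the same setting appears to be reachable directly through the rdma_cm configfs tree, in the default_roce_mode attribute under each port. The following is a minimal sketch under that assumption. The function names and the RDMA_CM_ROOT override are hypothetical and exist only for illustration, and configfs must already be mounted.

```shell
#!/bin/sh
# Sketch: read and set the RDMA_CM default RoCE mode through configfs.
# RDMA_CM_ROOT is a hypothetical override for testing; by default it points
# at the rdma_cm configfs directory.
RDMA_CM_ROOT="${RDMA_CM_ROOT:-/sys/kernel/config/rdma_cm}"

ensure_dev() {
    # Creating the device directory asks the kernel to expose its ports.
    # This fails with "No such device" if the RDMA device is not registered.
    mkdir -p "$RDMA_CM_ROOT/$1"
}

get_roce_mode() {
    # $1 = RDMA device name (e.g. mlx5_0), $2 = port number
    ensure_dev "$1" && cat "$RDMA_CM_ROOT/$1/ports/$2/default_roce_mode"
}

set_roce_mode() {
    # $3 = mode string, either "IB/RoCE v1" or "RoCE v2"
    ensure_dev "$1" && printf '%s\n' "$3" > "$RDMA_CM_ROOT/$1/ports/$2/default_roce_mode"
}

# Example usage (run as root on a host where mlx5_0 is registered):
#   get_roce_mode mlx5_0 1
#   set_roce_mode mlx5_0 1 "RoCE v2"
```

Both helpers depend on the device directory being creatable, so they will only work once mlx5_0 is registered as an RDMA device.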

Hope this answers your question.

Regards,
Bhargavi