"Protocol not supported" when trying to add rdma to nfs portlist

I am trying to configure NFS for our InfiniBand network, following the instructions in "HowTo Configure NFS over RDMA (RoCE)": https://community.mellanox.com/s/article/howto-configure-nfs-over-rdma--roce-x

I installed the MLNX_OFED drivers on CentOS 6.8. (I had originally configured the network and the IPoIB interface following the RHEL manual ("InfiniBand and RDMA Networking", Red Hat Enterprise Linux 7, Red Hat Customer Portal) and was using NFS over IPoIB, but I was getting a lot of page allocation failures.)

I used the mlnxofedinstall script which completed successfully and updated the firmware, e.g.:

Device (84:00.0):

84:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]

Link Width: x8

PCI Link Speed: 8GT/s

Installation finished successfully.

Preparing… ########################################### [100%]

1:mlnx-fw-updater ########################################### [100%]

Added 'RUN_FW_UPDATER_ONBOOT=no to /etc/infiniband/openib.conf

Attempting to perform Firmware update…

Querying Mellanox devices firmware …

Device #1:


Device Type: ConnectX3

Part Number: MCX354A-FCB_A2-A5

Description: ConnectX-3 VPI adapter card; dual-port QSFP; FDR IB (56Gb/s) and 40GigE; PCIe3.0 x8 8GT/s; RoHS R6

PSID: MT_1090120019

PCI Device Name: 84:00.0

Port1 GUID: e41d2d03006f89f1

Port2 GUID: e41d2d03006f89f2

Versions: Current Available

FW 2.32.5100 2.36.5150

PXE 3.4.0306 3.4.0740

Status: Update required


Found 1 device(s) requiring firmware update…

Device #1: Updating FW … Done

Restart needed for updates to take effect.

Log File: /tmp/MLNX_OFED_LINUX-3.4-1.0.0.0.17971.logs/fw_update.log

Please reboot your system for the changes to take effect.

To load the new driver, run:

/etc/init.d/openibd restart

I rebooted the system and then ran the self test:

hca_self_test.ofed

---- Performing Adapter Device Self Test ----

Number of CAs Detected … 1

PCI Device Check … PASS

Kernel Arch … x86_64

Host Driver Version … MLNX_OFED_LINUX-3.4-1.0.0.0 (OFED-3.4-1.0.0): 2.6.32-642.el6.x86_64

Host Driver RPM Check … PASS

Firmware on CA #0 VPI … v2.36.5150

Host Driver Initialization … PASS

Number of CA Ports Active … 0

Port State of Port #1 on CA #0 (VPI)… INIT (InfiniBand)

Port State of Port #2 on CA #0 (VPI)… DOWN (InfiniBand)

Error Counter Check on CA #0 (VPI)… FAIL

REASON: found errors in the following counters

Errors in /sys/class/infiniband/mlx4_0/ports/1/counters

port_rcv_errors: 93

Kernel Syslog Check … PASS

Node GUID on CA #0 (VPI) … e4:1d:2d:03:00:6f:89:f0

------------------ DONE ---------------------

As you can see, the error counter check fails on port_rcv_errors. Also, the port state for Port #1 stays at INIT until I start the subnet manager (/etc/init.d/opensmd start), since we have an unmanaged switch. OpenSM used to start automatically. So maybe the OFED installation wasn’t completely successful?
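For anyone checking the same thing, here is a minimal sketch that reads the port state straight from sysfs instead of running the full self test. The function name ib_port_state and its optional third argument (an alternate sysfs root, only there so it can be tried against a scratch directory) are my own and not part of MLNX_OFED. The device name mlx4_0 is taken from the self-test output above.

```shell
#!/bin/sh
# Print the link state of one InfiniBand port as reported by sysfs.
# Arguments: device name, port number, optional sysfs root (hypothetical
# parameter, added only so the function can be pointed at a test directory).
ib_port_state() {
    dev=$1
    port=$2
    root=${3:-/sys/class/infiniband}
    state_file="$root/$dev/ports/$port/state"
    if [ -r "$state_file" ]; then
        # The file holds a line such as "2: INIT" or "4: ACTIVE";
        # strip the numeric prefix and keep the name.
        sed 's/^[0-9]*: *//' "$state_file"
    else
        echo "UNKNOWN"
    fi
}

# A port that stays in INIT has a physical link but has not yet been
# configured by a subnet manager.
state=$(ib_port_state mlx4_0 1)
if [ "$state" = "INIT" ]; then
    echo "Port 1 is in INIT: start a subnet manager, e.g. /etc/init.d/opensmd start"
else
    echo "Port 1 state: $state"
fi
```

If the aim is to have OpenSM come up at boot again, `chkconfig opensmd on` is the usual CentOS 6 mechanism, assuming the opensmd init script is registered with chkconfig on your install.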

Additionally, I am unable to configure NFS for RDMA, e.g.:

echo rdma 20049 > /proc/fs/nfsd/portlist

-bash: echo: write error: Protocol not supported

The solution was to remove MLNX_OFED and use the distribution’s drivers/kernel modules.

It seems the port_rcv_errors count was caused by the subnet manager not running: the counter has not increased since OpenSM was started. I ran several RDMA verification tests, all of which were successful. So I think that just leaves the RDMA support in NFS.
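A rough way to confirm that the counter has really stopped growing is to read it twice and compare. This is only a sketch: read_counter and counter_delta are names I made up, the INTERVAL variable is an assumption, and the default counters path is the one reported by hca_self_test.ofed above.

```shell
#!/bin/sh
# Read one port error counter from sysfs; prints 0 if the file is missing.
# The second argument (counters directory) is a hypothetical parameter.
read_counter() {
    name=$1
    dir=${2:-/sys/class/infiniband/mlx4_0/ports/1/counters}
    if [ -r "$dir/$name" ]; then
        cat "$dir/$name"
    else
        echo 0
    fi
}

# Report how much a counter grew between two readings.
counter_delta() {
    before=$1
    after=$2
    echo $((after - before))
}

before=$(read_counter port_rcv_errors)
sleep "${INTERVAL:-1}"
after=$(read_counter port_rcv_errors)
echo "port_rcv_errors grew by $(counter_delta "$before" "$after") in ${INTERVAL:-1}s"
```

A longer interval (for example INTERVAL=60) while traffic is flowing gives a more meaningful answer than a one-second sample.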

The kernel is 2.6.32-642.11.1.el6.x86_64 and in the /boot/config-2.6.32-642.11.1.el6.x86_64 file it seems RDMA is enabled:

CONFIG_RDS_RDMA=m

CONFIG_NET_9P_RDMA=m

CONFIG_CARDMAN_4000=m

CONFIG_CARDMAN_4040=m

CONFIG_INFINIBAND_OCRDMA=m

CONFIG_SUNRPC_XPRT_RDMA_CLIENT=m

CONFIG_SUNRPC_XPRT_RDMA_SERVER=m
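Note that a plain search for "RDMA" in the config file also picks up unrelated entries: the CONFIG_CARDMAN_* lines only match because "CARDMAN" contains the letters "RDMA". The sketch below narrows the search to the sunrpc transport options that matter for NFS over RDMA. The function name nfs_rdma_config is my own, and the default path assumes the config file for the running kernel lives in /boot, as it does here.

```shell
#!/bin/sh
# List only the sunrpc RDMA transport options from a kernel config file.
# Anchoring on the full symbol prefix avoids substring hits such as
# CONFIG_CARDMAN_4000.
nfs_rdma_config() {
    config=${1:-/boot/config-$(uname -r)}
    grep -E '^CONFIG_SUNRPC_XPRT_RDMA' "$config"
}

nfs_rdma_config || echo "no sunrpc RDMA transport options found"
```

Keep in mind that this only describes the distribution kernel's own modules. It says nothing about modules that an add-on driver package may have installed in their place.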

modprobe svcrdma

/etc/init.d/nfs restart

… [ OK ]

echo rdma 20049 > /proc/fs/nfsd/portlist

-bash: echo: write error: Protocol not supported
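One plausible reading of "Protocol not supported" is that nfsd cannot find a registered "rdma" transport, which would happen if the svcrdma module that actually gets loaded is missing or is not the one the kernel's NFS server expects. The sketch below shows which module file modprobe would pick and whether it comes from the kernel's own tree. The function name module_origin is hypothetical, and the in-tree/out-of-tree split is based only on the standard /lib/modules/<version>/kernel/ layout, not on anything specific to MLNX_OFED.

```shell
#!/bin/sh
# Classify a kernel module file by where it lives under /lib/modules/<version>/.
# Modules shipped with the kernel sit under "kernel/"; anything else under
# /lib/modules (for example "extra/" or "updates/") came from an add-on package.
module_origin() {
    rel=${1#/lib/modules/}          # drop the common prefix
    if [ "$rel" = "$1" ]; then      # prefix was not present
        echo "unknown"
        return
    fi
    rel=${rel#*/}                   # drop the kernel version component
    case $rel in
        kernel/*) echo "in-tree" ;;
        *)        echo "out-of-tree" ;;
    esac
}

for mod in svcrdma xprtrdma; do
    # modinfo -n prints the file that modprobe would load for this module name.
    path=$(modinfo -n "$mod" 2>/dev/null) || path=""
    if [ -n "$path" ]; then
        echo "$mod: $path ($(module_origin "$path"))"
    else
        echo "$mod: no module file found"
    fi
done
```

If svcrdma resolves to an out-of-tree file or to nothing at all, that would be consistent with the fix reported in this thread of going back to the distribution's modules.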

My apologies for resurrecting an old thread, but why was NFSoRDMA support removed from the latest MLNX_OFED drivers?

I don’t understand that.