Setting up Mellanox NVMf offload

Here is my configuration (for both target and client):

OS: CentOS 7.4

Kernel: 4.14.93

Adapter: ConnectX-5

MOFED: 4.5-1.0.1.0 (installed with nvmf support)

I followed the exact steps listed in this article.

Setting up the offloaded subsystem works fine:

dmesg | grep "enabling port" shows the following output:

[ 80.840122] nvmet_rdma: enabling port 1 (10.3.1.66:4420)

However, trying to connect from the client fails with this message:

nvme connect -t rdma -n nvmf_ss -a 10.3.1.66 -s 4420

Failed to write to /dev/nvme-fabrics: Input/output error

On the target, I see the following messages in dmesg:

[ 80.818518] nvmet: adding nsid 1 to subsystem nvmf_ss

[ 80.840122] nvmet_rdma: enabling port 1 (10.3.1.66:4420)

[ 1092.516129] nvmet: creating controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN 2014-08.org.nvmexpress:uuid:0e455f8f-65cd-4822-9304-2768a722e81b.

[ 1092.550328] nvmet: creating controller 2 for subsystem nvmf_ss for NQN 2014-08.org.nvmexpress:uuid:0e455f8f-65cd-4822-9304-2768a722e81b.

[ 1092.579202] nvmet_rdma: using dynamic staging buffer ffff9b91b5ab7c00

[ 1092.736825] nvmet_rdma: received IB QP event: last WQE reached (16)

[ 1229.611790] INFO: task kworker/3:2:375 blocked for more than 120 seconds.

[ 1229.625404] Tainted: G OE 4.14.93 #1

[ 1229.635680] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

[ 1229.651418] kworker/3:2 D 0 375 2 0x80000000

[ 1229.662450] Workqueue: ib_cm cm_work_handler [ib_cm]

[ 1229.672376] Call Trace:

[ 1229.677320] ? __schedule+0x22f/0x810

[ 1229.684680] schedule+0x28/0x80

[ 1229.691007] schedule_timeout+0x26e/0x350

[ 1229.699036] wait_for_completion+0xfb/0x160

[ 1229.707445] ? wake_up_q+0x70/0x70

[ 1229.714279] flush_workqueue+0x13d/0x430

[ 1229.722143] ? nvmet_rdma_cm_reject+0x70/0x70 [nvmet_rdma]

[ 1229.733162] nvmet_rdma_queue_connect+0x634/0x9e0 [nvmet_rdma]

[ 1229.744868] ? cma_acquire_dev+0x10a/0x230 [rdma_cm]

[ 1229.754831] ? cma_req_handler+0x19d/0x880 [rdma_cm]

[ 1229.764812] cma_req_handler+0x31f/0x880 [rdma_cm]

[ 1229.774471] cm_process_work+0x22/0x130 [ib_cm]

[ 1229.783562] ? cm_get_bth_pkey.isra.67+0x36/0x90 [ib_cm]

[ 1229.794261] cm_req_handler+0x92e/0xe80 [ib_cm]

[ 1229.803367] ? __switch_to_asm+0x24/0x60

[ 1229.811248] ? __switch_to_asm+0x30/0x60

[ 1229.819116] cm_work_handler+0x113/0x13a9 [ib_cm]

[ 1229.828547] ? __switch_to_asm+0x30/0x60

[ 1229.836446] ? __switch_to_asm+0x24/0x60

[ 1229.844297] process_one_work+0x18a/0x3a0

[ 1229.852369] worker_thread+0x2e/0x390

[ 1229.859716] ? pwq_unbound_release_workfn+0xd0/0xd0

[ 1229.869509] kthread+0x119/0x130

[ 1229.876008] ? __kthread_parkme+0xa0/0xa0

[ 1229.884676] ret_from_fork+0x35/0x40

[ 1229.892389] INFO: task kworker/3:0:3055 blocked for more than 120 seconds.

Any idea what the problem could be?

Hello Shashank,

Many thanks for posting your question on the Mellanox Community. Our apologies for the late reply.

Give me some time to do a test in our lab. We will update you as soon as we have more information.

In the meantime, were you able to test with a more recent version of Mellanox OFED and updated firmware for the ConnectX-5?

Many thanks,

~Mellanox Technical Support.

Hello Shashank,

Hope you are doing well. I was able to test on our lab setup with an NVMe target device, and I was not able to reproduce the issue. I had no issues connecting from the client to the target.

I followed the same document as you did → https://community.mellanox.com/s/article/howto-configure-nvme-over-fabrics–nvme-of–target-offload

Since we now have the MLNX_OFED 4.7 GA release, I tested with that version, but I kept the same OS and kernel versions you used.

One note though: because of a known issue dating from our MLNX_OFED 4.2 drivers, I updated the kmod package as recommended in our Release Notes → https://docs.mellanox.com/display/MLNXOFEDv471001/Known+Issues , Internal Ref. 1263043

My setup:

OS: CentOS Linux release 7.4.1708 (Core)

Kernel: 4.14.9-1.el7.elrepo.x86_64

Driver: MLNX_OFED_LINUX-4.7-1.0.0.1 (installed with the following options: "--add-kernel-support --with-nvmf")
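For reference, the installer invocation would look roughly like this. This is a sketch; the tarball name is an assumption derived from the version above, so adjust it to match the archive you actually downloaded:

```shell
# Assumed archive name for MLNX_OFED 4.7-1.0.0.1 on RHEL/CentOS 7.4 x86_64.
tar xzf MLNX_OFED_LINUX-4.7-1.0.0.1-rhel7.4-x86_64.tgz
cd MLNX_OFED_LINUX-4.7-1.0.0.1-rhel7.4-x86_64

# Rebuild the drivers against the running kernel and include the NVMf modules.
./mlnxofedinstall --add-kernel-support --with-nvmf
```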

Target:

modprobe -v nvme num_p2p_queues=1

modprobe -v nvmet

modprobe -v nvmet-rdma

mkdir /sys/kernel/config/nvmet/subsystems/testsubsystem

echo 1 > /sys/kernel/config/nvmet/subsystems/testsubsystem/attr_allow_any_host

echo 1 > /sys/kernel/config/nvmet/subsystems/testsubsystem/attr_offload

mkdir /sys/kernel/config/nvmet/subsystems/testsubsystem/namespaces/1

echo -n /dev/nvme0n1 > /sys/kernel/config/nvmet/subsystems/testsubsystem/namespaces/1/device_path

echo 1 > /sys/kernel/config/nvmet/subsystems/testsubsystem/namespaces/1/enable

mkdir /sys/kernel/config/nvmet/ports/1

echo 4420 > /sys/kernel/config/nvmet/ports/1/addr_trsvcid

echo 192.168.1.121 > /sys/kernel/config/nvmet/ports/1/addr_traddr

echo "rdma" > /sys/kernel/config/nvmet/ports/1/addr_trtype

echo "ipv4" > /sys/kernel/config/nvmet/ports/1/addr_adrfam

ln -s /sys/kernel/config/nvmet/subsystems/testsubsystem/ /sys/kernel/config/nvmet/ports/1/subsystems/testsubsystem
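Once the symlink is in place, the target configuration can be read back from configfs as a quick sanity check before attempting a client connect. A minimal sketch, assuming the exact paths used in the steps above:

```shell
# Read back the nvmet configfs attributes set in the steps above.
SS=/sys/kernel/config/nvmet/subsystems/testsubsystem
cat $SS/attr_offload                # expect: 1
cat $SS/namespaces/1/device_path    # expect: /dev/nvme0n1
cat $SS/namespaces/1/enable         # expect: 1

cat /sys/kernel/config/nvmet/ports/1/addr_traddr    # expect: 192.168.1.121
ls -l /sys/kernel/config/nvmet/ports/1/subsystems/  # symlink to testsubsystem

# The port is only enabled once the symlink exists; dmesg should show it.
dmesg | grep "enabling port"
```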

Client:

modprobe -v nvme

modprobe -v nvme-rdma

nvme discover -t rdma -a 192.168.1.121 -s 4420

nvme connect -t rdma -a 192.168.1.121 -s 4420

nvme connect -t rdma -n testsubsystem -a 192.168.1.121 -s 4420

nvme list

Output from Client:

modprobe -v nvme

insmod /lib/modules/4.14.9-1.el7.elrepo.x86_64/extra/mlnx-nvme/host/nvme-core.ko

insmod /lib/modules/4.14.9-1.el7.elrepo.x86_64/extra/mlnx-nvme/host/nvme.ko

modprobe -v nvme-rdma

insmod /lib/modules/4.14.9-1.el7.elrepo.x86_64/extra/mlnx-nvme/host/nvme-fabrics.ko

insmod /lib/modules/4.14.9-1.el7.elrepo.x86_64/extra/mlnx-nvme/host/nvme-rdma.ko

nvme discover -t rdma -a 192.168.1.121 -s 4420

Discovery Log Number of Records 1, Generation counter 2

=====Discovery Log Entry 0======

trtype: rdma

adrfam: ipv4

subtype: nvme subsystem

treq: not specified, sq flow control disable supported

portid: 1

trsvcid: 4420

subnqn: testsubsystem

traddr: 192.168.1.121

rdma_prtype: not specified

rdma_qptype: connected

rdma_cms: rdma-cm

rdma_pkey: 0x0000

nvme connect -t rdma -n testsubsystem -a 192.168.1.121 -s 4420

nvme list

Node          SN                Model  Namespace  Usage                  Format       FW Rev
/dev/nvme0n1  2d4b9c8b901fe7a3  Linux  1          400.09 GB / 400.09 GB  512 B + 0 B  4.14.9-1

Dmesg output on Target:

[Tue Oct 8 16:56:20 2019] nvme nvme0: pci function 0000:03:00.0

[Tue Oct 8 16:56:22 2019] nvme0n1:

[Tue Oct 8 16:56:51 2019] nvmet: adding nsid 1 to subsystem testsubsystem

[Tue Oct 8 16:57:25 2019] nvmet_rdma: enabling port 1 (192.168.1.121:4420)

[Tue Oct 8 17:01:13 2019] nvmet: creating controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.org.nvmexpress:uuid:e996fc57-622d-4237-8c6e-010cd4f6b5e2.

[Tue Oct 8 17:01:50 2019] nvmet: creating controller 1 for subsystem testsubsystem for NQN nqn.2014-08.org.nvmexpress:uuid:e996fc57-622d-4237-8c6e-010cd4f6b5e2.

[Tue Oct 8 17:01:50 2019] nvmet_rdma: using dynamic staging buffer ffff88103e55fe00

[Wed Oct 9 16:16:34 2019] nvmet_rdma: received IB QP event: last WQE reached (16)

(the same "last WQE reached" message repeats 19 more times with the same timestamp)

If you are still experiencing an issue based on the provided instructions and steps, please do not hesitate to open a Mellanox Support ticket through support@mellanox.com.

Many thanks,

~Mellanox Technical Support

Hi,

After working with the Mellanox Support team, we were able to solve the problem.

The problem was that we had not set num_p2p_queues to 1 and had the IOMMU enabled.

For some reason, NVMf offloading does not work when the IOMMU is enabled.
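To make that fix stick across reboots, the module option can be persisted and the IOMMU disabled on the kernel command line. A minimal sketch, assuming a GRUB-based CentOS 7 install on an Intel platform (an AMD platform would use amd_iommu=off instead):

```shell
# Persist the nvme module option so the p2p queue exists at every boot.
echo "options nvme num_p2p_queues=1" > /etc/modprobe.d/nvme-offload.conf

# Verify once the nvme module has been reloaded:
cat /sys/module/nvme/parameters/num_p2p_queues

# Disable the IOMMU via the kernel command line; grubby ships with CentOS 7.
grubby --update-kernel=ALL --args="intel_iommu=off"
reboot
```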