Has anyone had the same issue when enabling hw offload for nvmet?
[ 314.425940] nvmet: adding nsid 1 to subsystem testsub
[ 314.427031] nvmet: adding nsid 2 to subsystem testsub
[ 314.428111] nvmet: adding nsid 3 to subsystem testsub
[ 314.429861] nvmet: adding nsid 4 to subsystem testsub
[ 314.431758] nvmet: adding nsid 5 to subsystem testsub
[ 314.432838] nvmet: adding nsid 6 to subsystem testsub
[ 314.433888] nvmet: adding nsid 7 to subsystem testsub
[ 314.435059] nvmet: adding nsid 8 to subsystem testsub
[ 314.436122] nvmet: adding nsid 9 to subsystem testsub
[ 314.437276] nvmet: adding nsid 10 to subsystem testsub
[ 314.441046] nvmet_rdma: enabling port 1 (192.168.200.13:4420)
[ 322.952806] nvmet: creating nvm controller 1 for subsystem testsub for NQN nqn.2014-08.org.nvmexpress:uuid:e9184000-7983-11ec-8000-d85ed34e1861.
[ 324.192974] nvme nvme10: creating 63 I/O queues.
[ 335.349806] nvmet_rdma: using dynamic staging buffer 00000000191efefc
[ 335.403750] nvme 0000:21:00.0: Failed to get peer resource xrq=00000000a7817acc be_ctrl=000000009341a12e
[ 335.403790] nvmet_rdma: failed to get XRQ for queue (1)
[ 335.403806] nvmet: failed to install queue 1 cntlid 1 ret 4006
[ 336.547842] nvme nvme10: mapped 63/0/0 default/read/poll queues.
[ 336.607738] nvme nvme10: Connect command failed, error wo/DNR bit: 6
[ 336.608835] nvme nvme10: failed to connect queue: 1 ret=16390
[ 347.356444] nvme nvme10: rdma connection establishment failed (-104)
Hi @lpopovics,
Please refer to this documentation for configuring the NVMe-oF target offload: ESPCommunity.
When configuring the target, load the nvme module with “modprobe nvme num_p2p_queues=1”.
The num_p2p_queues parameter sets the number of I/O queues that can be used for peer-to-peer.
To check how many I/O queues are actually available for peer-to-peer, read the “num_p2p_queues” sysfs entry with “cat /sys/block/<nvme_device>/device/num_p2p_queues”.
If it shows zero, unload the nvme module first with “modprobe -rv nvme”, then load it again with “modprobe nvme num_p2p_queues=1”.
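The check above can be sketched as a small shell function. This is only a sketch: it assumes the sysfs path quoted in this post, and the function name check_p2p_queues and its optional directory argument are hypothetical, added so it can be tried against a test directory instead of the real /sys/block.

```shell
#!/bin/sh
# Sketch: print num_p2p_queues for every NVMe block device and
# return non-zero if any device reports 0 (or an unreadable value).
# The optional first argument is a hypothetical override of /sys/block.
check_p2p_queues() {
    sysfs_block="${1:-/sys/block}"
    status=0
    for entry in "$sysfs_block"/nvme*/device/num_p2p_queues; do
        # Skip devices without the entry (also covers an unmatched glob).
        [ -r "$entry" ] || continue
        # Strip the directory prefix, then everything after the device name.
        dev=${entry#"$sysfs_block"/}
        dev=${dev%%/*}
        count=$(cat "$entry")
        echo "$dev num_p2p_queues=$count"
        [ "$count" -gt 0 ] || status=1
    done
    return "$status"
}
```

If check_p2p_queues returns non-zero on the target, the reload described above would be “modprobe -rv nvme” followed by “modprobe nvme num_p2p_queues=1”, run as root. Note that unloading the module is expected to fail while any NVMe device is still in use, for example when the root filesystem lives on one.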
Best regards,
Chen
I set num_p2p_queues to 30, as I have 10 devices per host and each host communicates with two other hosts. Is there any documentation on these parameters and how to tune them for my setup? I followed the guide available on the portal, but not every parameter the documentation refers to is available (for example, offload_mem_start).
https://mellanox.my.site.com/mellanoxcommunity/s/article/howto-configure-nvme-over-fabrics--nvme-of--target-offload
One addition: on the target host I see these messages:
[ 105.487209] nvmet_rdma: enabling port 1 (192.168.200.11:4420)
[ 114.488300] nvmet: creating nvm controller 1 for subsystem delkvmsrv01 for NQN nqn.2014-08.org.nvmexpress:uuid:e9184000-7983-11ec-8000-d85ed34e19d9.
[ 127.454630] nvmet_rdma: using dynamic staging buffer 000000004289d44f
[ 127.508490] nvme 0000:21:00.0: Failed to get peer resource xrq=000000002642df80 be_ctrl=000000001efd7ba9
[ 127.508544] nvmet_rdma: failed to get XRQ for queue (1)
[ 127.508562] nvmet: failed to install queue 1 cntlid 1 ret 4006
[ 145.663147] nvmet: creating nvm controller 1 for subsystem delkvmsrv01 for NQN nqn.2014-08.org.nvmexpress:uuid:e9184000-7983-11ec-8000-d85ed34e19d9.
[ 158.440082] nvme 0000:21:00.0: Failed to get peer resource xrq=000000002642df80 be_ctrl=000000001efd7ba9
[ 176.894546] nvmet: creating nvm controller 1 for subsystem delkvmsrv01 for NQN nqn.2014-08.org.nvmexpress:uuid:e9184000-7983-11ec-8000-d85ed34e19d9.
[ 189.904877] nvme 0000:21:00.0: Failed to get peer resource xrq=000000002642df80 be_ctrl=000000001efd7ba9
[ 208.125623] nvmet: creating nvm controller 1 for subsystem delkvmsrv01 for NQN nqn.2014-08.org.nvmexpress:uuid:e9184000-7983-11ec-8000-d85ed34e19d9.
[ 221.037074] nvme 0000:21:00.0: Failed to get peer resource xrq=000000002642df80 be_ctrl=000000001efd7ba9
[ 239.356015] nvmet: creating nvm controller 1 for subsystem delkvmsrv01 for NQN nqn.2014-08.org.nvmexpress:uuid:e9184000-7983-11ec-8000-d85ed34e19d9.
[ 252.141791] nvme 0000:21:00.0: Failed to get peer resource xrq=000000002642df80 be_ctrl=000000001efd7ba9
[ 270.075432] nvmet: creating nvm controller 1 for subsystem delkvmsrv01 for NQN nqn.2014-08.org.nvmexpress:uuid:e9184000-7983-11ec-8000-d85ed34e19d9.
[ 282.453918] nvme 0000:21:00.0: Failed to get peer resource xrq=000000002642df80 be_ctrl=000000001efd7ba9
[ 300.283618] nvmet: creating nvm controller 1 for subsystem delkvmsrv01 for NQN nqn.2014-08.org.nvmexpress:uuid:e9184000-7983-11ec-8000-d85ed34e19d9.
[ 313.068466] nvme 0000:21:00.0: Failed to get peer resource xrq=000000002642df80 be_ctrl=000000001efd7ba9
[ 325.752641] nvmet: ctrl 1 keep-alive timer (5 seconds) expired!
[ 325.753098] nvmet: ctrl 1 fatal error occurred!
[ 660.610330] nvmet: creating nvm controller 1 for subsystem delkvmsrv01 for NQN nqn.2014-08.org.nvmexpress:uuid:e9184000-7983-11ec-8000-d85ed34e19d9.