Describe the bug
We are configuring NVMe over RDMA.
We can use nvme discover to discover the remote NVMe subsystem normally,
but when we use nvme connect, we hit an error.
Running the Linux command dmesg shows the following:
[5365309.262528] mlx5_cmd_check:810:(pid 923941): create_mkey(0x200) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x35e6ec)
[5365309.262539] nvme nvme1: failed to initialize pi mr pool sized 128 for qid 1
[5365309.262551] nvme nvme1: rdma connection establishment failed (-22)
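For reference, a minimal sketch of the commands involved. The address, port, and subsystem NQN below are placeholders, not the reporter's actual values:

```shell
# Discover remote NVMe subsystems over RDMA (placeholder target address/port).
nvme discover -t rdma -a 192.168.1.10 -s 4420

# Connect to a discovered subsystem (placeholder NQN).
# This is the step that fails with "failed to initialize pi mr pool".
nvme connect -t rdma -a 192.168.1.10 -s 4420 -n nqn.2014-08.org.nvmexpress:example

# Inspect the kernel log for the mlx5/nvme_rdma error messages.
dmesg | tail -n 20
```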
We followed the commands above, but we ran into this issue.
I am also facing the same problem when configuring NVMeOF.
GPUDirect Storage without network (just NVMe) works well.
My Linux kernel version is 5.4.0-100-generic on Ubuntu 20.04.2 LTS. I tried both MLNX_OFED_LINUX-5.8-188.8.131.52-ubuntu20.04-x86_64 and MLNX_OFED_LINUX-5.4-184.108.40.206-ubuntu20.04-x86_64, and neither worked.
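For anyone comparing environments, these diagnostic commands (assuming the standard MLNX_OFED tooling is installed) report the relevant versions:

```shell
# Kernel version in use.
uname -r

# Installed MLNX_OFED version (ofed_info ships with MLNX_OFED).
ofed_info -s

# Check whether the nvme_rdma and mlx5 modules are loaded.
lsmod | grep -E 'nvme_rdma|mlx5'
```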
My kernel version is the same as yours. Have you tried any other versions?
No. I have only tried changing MLNX_OFED versions (which was not the solution).
Which MLNX_OFED and kernel versions did you deploy? Did you succeed?
I configured everything according to the official URL, but I encountered the same problem and could not solve it. Do you know how to fix it?