ceph + rdma error: ibv_open_device failed

I followed this doc:

Bring Up Ceph RDMA - Developer’s Guide https://community.mellanox.com/s/article/bring-up-ceph-rdma---developer-s-guide

But the mon would not start and failed with this error:

7f5acb890700 -1 Infiniband Device open rdma device failed. (2) No such file or directory

I checked ceph code:

116 name = ibv_get_device_name(device);

117 ctxt = ibv_open_device(device);

118 if (ctxt == NULL) {

119 lderr(cct) << __func__ << " open rdma device failed. " << cpp_strerror(errno) << dendl;

120 ceph_abort();

121 }

Then I looked at it in gdb:

Breakpoint 1, Device::Device (this=0x55555f3597e0, cct=0x55555eec01c0, d=)

at /usr/src/debug/ceph-12.2.0/src/msg/async/rdma/Infiniband.cc:116

116 name = ibv_get_device_name(device);

Value of *(struct ibv_device *) device:

$7 = {ops = {alloc_context = 0x0, free_context = 0x0}, node_type = IBV_NODE_CA, transport_type = IBV_TRANSPORT_IB,

name = "mlx4_0", '\000' <repeats 57 times>, dev_name = "uverbs0", '\000' <repeats 56 times>,

dev_path = "/sys/class/infiniband_verbs/uverbs0", '\000' <repeats 220 times>,

ibdev_path = "/sys/class/infiniband/mlx4_0", '\000' <repeats 227 times>}

Breakpoint 2, Device::Device (this=0x55555f3597e0, cct=0x55555eec01c0, d=)

at /usr/src/debug/ceph-12.2.0/src/msg/async/rdma/Infiniband.cc:117

117 ctxt = ibv_open_device(device);

Value of *(struct CephContext *) ctxt:

Cannot access memory at address 0x646f6e2f305f3478

It seems that ibv_open_device failed
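A hedged side note on the gdb output above: breakpoint 2 stops before line 117 has run, so ctxt has not been assigned yet, and as far as I can tell from the Ceph source it is a struct ibv_context *, not a CephContext *. The address gdb complains about also decodes to plain ASCII text when read as little-endian bytes (assuming an x86-64 host), which suggests leftover memory contents rather than a pointer returned by ibv_open_device. A minimal Python sketch of that decoding; the helper name pointer_as_ascii is made up for illustration:

```python
def pointer_as_ascii(value: int, width: int = 8) -> str:
    """Render an integer as the bytes it occupies in little-endian memory."""
    raw = value.to_bytes(width, byteorder="little")
    # Show printable ASCII as-is and everything else as '.'
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


if __name__ == "__main__":
    # Address taken from the gdb "Cannot access memory" message
    print(pointer_as_ascii(0x646F6E2F305F3478))
```

Stepping once with next and then printing ctxt would show whether ibv_open_device really returned NULL at that point.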

ibstat output:

CA 'mlx4_0'

CA type: MT26428

Number of ports: 1

Firmware version: 2.9.1000

Hardware version: b0

Node GUID: 0x0002c90300589efc

System image GUID: 0x0002c90300589eff

Port 1:

State: Active

Physical state: LinkUp

Rate: 40

Base lid: 33

LMC: 0

SM lid: 23

Capability mask: 0x0251086a

Port GUID: 0x0002c90300589efd

Link layer: InfiniBand

Is there anything wrong with the data in struct ibv_device?

Are you using systemd? If so, just follow this pull request: msg/async/rdma: support RDMA in systemctl by Adirl · Pull Request #13305 · ceph/ceph · GitHub
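For context on why systemd matters here: errno 2 (ENOENT) from ibv_open_device usually points at the open of the uverbs character device under /dev/infiniband, and a unit that sets PrivateDevices=yes gets a private /dev without such nodes. Whether the installed ceph-mon unit does that is an assumption to verify on your system. A minimal Python sketch of the check; check_uverbs_node and its dev_root parameter are hypothetical names, and "uverbs0" is the dev_name from the gdb dump in the question:

```python
import os


def check_uverbs_node(dev_name: str, dev_root: str = "/dev/infiniband") -> dict:
    """Report whether the character device behind an ibv_device is reachable."""
    path = os.path.join(dev_root, dev_name)
    return {
        "path": path,
        "exists": os.path.exists(path),
        # libibverbs opens the node read/write, so both permissions are needed
        "read_write": os.access(path, os.R_OK | os.W_OK),
    }


if __name__ == "__main__":
    print(check_uverbs_node("uverbs0"))
```

Run it both from a login shell and from inside the service's environment. If the node shows up only in the shell, the unit's sandboxing is the likely culprit.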

Luminous does not include the latest Ceph RDMA code. Which version of Ceph are you using? Also, if you can, please post your ceph.conf. Lastly, are you able to run other tools on this node, such as ib_write_bw?
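If posting the whole ceph.conf is awkward, the messenger options are the relevant part. A rough Python sketch that pulls every ms_* option out of ceph.conf-style text; messenger_settings is a made-up helper, the sample values are placeholders rather than a recommended configuration, and the parser only handles simple "key = value" lines:

```python
def messenger_settings(conf_text: str) -> dict:
    """Collect ms_* options per section from ceph.conf-style text."""
    found = {}
    section = None
    for line in conf_text.splitlines():
        # Drop '#' and ';' comments, then surrounding whitespace
        line = line.split("#", 1)[0].split(";", 1)[0].strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Option names may be written with spaces or underscores; normalise
        key = "_".join(key.split())
        if key.startswith("ms_"):
            found.setdefault(section, {})[key] = value.strip()
    return found


if __name__ == "__main__":
    sample = """
    [global]
    ms_type = async+rdma
    ms async rdma device name = mlx4_0   ; placeholder value
    fsid = 00000000-0000-0000-0000-000000000000
    """
    print(messenger_settings(sample))
```

To use it on a real node, read the file first, for example messenger_settings(open("/etc/ceph/ceph.conf").read()), assuming the config lives at that path.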