I followed this doc:
Bring Up Ceph RDMA - Developer’s Guide https://community.mellanox.com/s/article/bring-up-ceph-rdma---developer-s-guide
However, the mon fails to start with this error:
7f5acb890700 -1 Infiniband Device open rdma device failed. (2) No such file or directory
I checked the Ceph code (src/msg/async/rdma/Infiniband.cc):
116 name = ibv_get_device_name(device);
117 ctxt = ibv_open_device(device);
118 if (ctxt == NULL) {
119 lderr(cct) << __func__ << " open rdma device failed. " << cpp_strerror(errno) << dendl;
120 ceph_abort();
121 }
Then I stepped through it in gdb:
Breakpoint 1, Device::Device (this=0x55555f3597e0, cct=0x55555eec01c0, d=)
at /usr/src/debug/ceph-12.2.0/src/msg/async/rdma/Infiniband.cc:116
116 name = ibv_get_device_name(device);
(gdb) p *(struct ibv_device *) device
$7 = {ops = {alloc_context = 0x0, free_context = 0x0}, node_type = IBV_NODE_CA, transport_type = IBV_TRANSPORT_IB,
name = "mlx4_0", '\000' <repeats 57 times>, dev_name = "uverbs0", '\000' <repeats 56 times>,
dev_path = "/sys/class/infiniband_verbs/uverbs0", '\000' <repeats 220 times>,
ibdev_path = "/sys/class/infiniband/mlx4_0", '\000' <repeats 227 times>}
Breakpoint 2, Device::Device (this=0x55555f3597e0, cct=0x55555eec01c0, d=)
at /usr/src/debug/ceph-12.2.0/src/msg/async/rdma/Infiniband.cc:117
117 ctxt = ibv_open_device(device);
(gdb) p *(struct CephContext *) ctxt
Cannot access memory at address 0x646f6e2f305f3478
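A side note on the "Cannot access memory" message: breakpoint 2 stops before line 117 has run, so ctxt has not been assigned yet and probably still holds leftover data. The sketch below assumes a little-endian x86-64 host, and bytes_as_ascii is a name made up for this post. It turns the address into the bytes it occupies in memory:

```cpp
#include <cstdint>
#include <string>

// Interpret a 64-bit value as the 8 bytes it occupies in memory on a
// little-endian machine (assumption: x86-64), lowest byte first.
// Non-printable bytes are replaced with '.'.
std::string bytes_as_ascii(std::uint64_t value) {
    std::string out;
    for (int i = 0; i < 8; ++i) {
        unsigned char c = static_cast<unsigned char>(value >> (8 * i));
        out.push_back((c >= 0x20 && c < 0x7f) ? static_cast<char>(c) : '.');
    }
    return out;
}
```

bytes_as_ascii(0x646f6e2f305f3478ULL) returns "x4_0/nod", which looks like a fragment of a path under mlx4_0. So this value is most likely stale string data, and it says nothing about whether ibv_open_device succeeded.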
So it seems that ibv_open_device() failed.
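As far as I know, ibv_open_device() opens the character device /dev/infiniband/<dev_name>, so errno 2 (ENOENT) may simply mean that this node does not exist. Here is a minimal sketch to check that outside Ceph. It assumes the node would be /dev/infiniband/uverbs0 (from the dev_name in the gdb output), and try_open_node is a hypothetical helper name, not part of libibverbs or Ceph:

```cpp
#include <cerrno>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Try to open a device node read/write.
// Returns 0 on success, or the errno value reported by open() on failure.
int try_open_node(const std::string& path) {
    int fd = ::open(path.c_str(), O_RDWR | O_CLOEXEC);
    if (fd < 0) {
        return errno;
    }
    ::close(fd);
    return 0;
}
```

If try_open_node("/dev/infiniband/uverbs0") returns ENOENT, that would match the error in the mon log. It would then point at the device node being missing (for example, the ib_uverbs module not being loaded, which is my guess and not confirmed) rather than at the contents of struct ibv_device.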
ibstat output:
CA 'mlx4_0'
    CA type: MT26428
    Number of ports: 1
    Firmware version: 2.9.1000
    Hardware version: b0
    Node GUID: 0x0002c90300589efc
    System image GUID: 0x0002c90300589eff
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 40
        Base lid: 33
        LMC: 0
        SM lid: 23
        Capability mask: 0x0251086a
        Port GUID: 0x0002c90300589efd
        Link layer: InfiniBand
Is there any problem with the data of struct ibv_device?