Cannot deregister memory region (ibv_dereg_mr) when MR size reaches 2 GB

MLNX_OFED version:
MLNX_OFED_LINUX-5.4-3.0.3.0-ubuntu20.04-x86_64

System:
Ubuntu 20.04

When the MR size reaches about 2 GB:

  • ibv_dereg_mr(mr) gets stuck (it never returns, and the process hangs)
  • kill -9 cannot immediately kill the process, and htop shows the process using 100% CPU
  • ibv_devinfo reports “failed to open device”

However, mr = ibv_reg_mr() still succeeds, and I can even perform RDMA operations as long as ibv_dereg_mr is not called.
When the MR size stays just below 2 GB (about 2045 MB), everything works well.
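
For reference, the arithmetic around the two sizes can be sketched as below. The helper names (mib_to_bytes, fits_in_int32, hugepages_2m) are hypothetical and are not part of libibverbs. The sketch only shows that 2 GB is exactly 2^31 bytes, one more than the largest signed 32-bit value, while 2045 MB still fits. Whether a 32-bit length is actually involved in the hang is purely a guess and is not confirmed by anything observed here.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helpers for illustration only; not part of libibverbs.
constexpr std::uint64_t kMiB = 1024ULL * 1024ULL;

// Convert a size in MiB to bytes using 64-bit arithmetic.
constexpr std::uint64_t mib_to_bytes(std::uint64_t mib) { return mib * kMiB; }

// True if a byte count can be represented in a signed 32-bit integer.
constexpr bool fits_in_int32(std::uint64_t bytes) {
    return bytes <=
           static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
}

// Number of 2 MiB hugepages needed to back `bytes`, rounded up.
constexpr std::uint64_t hugepages_2m(std::uint64_t bytes) {
    return (bytes + 2 * kMiB - 1) / (2 * kMiB);
}

// Runtime sanity check of the two sizes mentioned in the report.
inline void check_reported_sizes() {
    assert(fits_in_int32(mib_to_bytes(2045)));   // 2045 MB: dereg works
    assert(!fits_in_int32(mib_to_bytes(2048)));  // 2 GB: dereg hangs
}
```

Calling check_reported_sizes() passes for both sizes, so the working and failing cases sit on opposite sides of the signed 32-bit limit. That may be a coincidence.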

Code Example:

// mem_ptr points to mmap'ed 2 MB hugepages; the dereg problem occurs when mem_sz reaches 2 GB.
auto mr = ibv_reg_mr(pd, mem_ptr, mem_sz, IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_READ | IBV_ACCESS_REMOTE_WRITE | IBV_ACCESS_REMOTE_ATOMIC);

// While sleeping, everything works well, and ibv_devinfo still shows the device.
int count = 30;
while (count > 0) {
    count--;
    sleep(1);
    LOG(2) << "sleeping... mr size: " << mr->length;
}

{
    auto rc = ibv_dereg_mr(mr); // When the problem occurs, ibv_devinfo prints "failed to open device"
    LOG(2) << "dereg mr"; // When the problem occurs, the process hangs and this line is never printed
    // ibv_dereg_mr returns 0 on success or the errno value on failure, so report rc directly.
    LOG_IF(2, rc != 0) << "dereg mr error: " << strerror(rc);
}