I’m still trying to understand the actual difference between these two calls.
We have either
ibv_reg_mr
or
ibv_reg_dmabuf_mr
But how does it impact the data flow between the GPU and the NIC? Can you please explain it in terms of the PCIe topology? My current understanding is that the GPU and the NIC share a pinned block of host memory, filled with virtual addresses, through which they communicate with each other. AFAIK this is the regular implementation for ibv_reg_mr when registering GPU memory. But what exactly happens when using ibv_reg_dmabuf_mr instead?
However, dmabuf is not supported by my GPU (RTX A5000), as reported by perftest (ib_write_bw) via the following code:
We finally got around to implementing DMA-BUF. From what I can tell, there is no performance difference so far. It is simply the modern, recommended way for other PCIe devices to access the GPU's BAR regions: an open-source framework implemented in Linux that NVIDIA now uses in place of the proprietary nvidia_peermem module, which was previously responsible for the VA/PA mapping.
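For anyone who wants to see how the two registration paths differ at the API level, here is a rough sketch in C. It is not our production code. The helper `dmabuf_range_for` and the struct `dmabuf_range` are hypothetical names made up for illustration, and it assumes (as far as I understand the CUDA docs) that the range exported as a dma-buf has to be aligned to the host page size. The actual CUDA and verbs calls are only shown in the trailing comment, with `pd`, `gpu_ptr`, `len` and `access` as placeholders, so the compiled part needs nothing beyond the C standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the page-aligned range to export as a dma-buf,
 * plus the position of the user buffer inside that range. */
struct dmabuf_range {
    uint64_t base;    /* buffer address rounded down to a page boundary */
    size_t   size;    /* exported size, rounded up to whole pages */
    uint64_t offset;  /* offset of the buffer inside the exported range */
};

/* Compute the aligned export range for a GPU buffer at `addr` of `len` bytes.
 * `page_size` must be a power of two (e.g. the value of sysconf(_SC_PAGESIZE)). */
struct dmabuf_range dmabuf_range_for(uint64_t addr, size_t len, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    struct dmabuf_range r;
    uint64_t mask = (uint64_t)page_size - 1;

    r.base   = addr & ~mask;                              /* round down */
    r.offset = addr - r.base;                             /* bytes skipped at the front */
    r.size   = (size_t)((r.offset + len + mask) & ~mask); /* round up */
    return r;
}

/*
 * How the result would be used (not compiled here; needs CUDA and libibverbs):
 *
 *   Legacy path, relies on nvidia_peermem being loaded:
 *     mr = ibv_reg_mr(pd, (void *)gpu_ptr, len, access);
 *
 *   DMA-BUF path, the GPU range is exported as a file descriptor first:
 *     int fd;
 *     cuMemGetHandleForAddressRange(&fd, r.base, r.size,
 *                                   CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD, 0);
 *     mr = ibv_reg_dmabuf_mr(pd, r.offset, len, gpu_ptr, fd, access);
 */
```

Either way the resulting `ibv_mr` is used in work requests exactly as before; only the way the GPU range is handed to the RDMA stack changes, which fits with us not seeing a performance difference.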