How to transfer data directly between an mlx5 NIC and GPU memory over RDMA

Dear support,

I have an mlx5 NIC and PCIe GPUs (non-NVIDIA; each GPU has 12 GB of HBM).

The GPU’s HBM is exposed to user space via mmap().

Now I want to transfer data directly between my GPUs via the IB adapters, as shown below:

App - (GPU - MLX5) ------ (MLX5 - GPU) - App

However, mlx5_ib_reg_user_mr() always fails because get_user_pages() returns -14 (-EFAULT).

I tried using remap_pfn_range() in my GPU driver’s mmap(), but get_user_pages() fails because the VMA has VM_IO or VM_PFNMAP set.

I also tried using fault()/vm_insert_mixed() in the mmap(), but get_user_pages() still fails because the memory has no “struct page” behind it.
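
To summarize the failures above, here is a simplified user-space model of the checks I believe get_user_pages() applies. This is only a sketch, not the kernel code: model_gup_result() and its pfn_has_struct_page argument are hypothetical names of mine, and the flag values are copied from include/linux/mm.h on the kernel I looked at, so they may differ on other versions.

```c
#include <errno.h>

/* Assumption: flag values as found in include/linux/mm.h; they may
 * differ between kernel versions. */
#define VM_PFNMAP   0x00000400UL  /* set by remap_pfn_range() */
#define VM_IO       0x00004000UL  /* set by remap_pfn_range() */
#define VM_MIXEDMAP 0x10000000UL  /* set by vm_insert_mixed() */

/* Hypothetical model: returns 0 if the mapping could be pinned,
 * or -EFAULT (-14) if get_user_pages() would reject it. */
static int model_gup_result(unsigned long vm_flags, int pfn_has_struct_page)
{
    /* Raw I/O and PFN mappings are rejected before any page lookup. */
    if (vm_flags & (VM_IO | VM_PFNMAP))
        return -EFAULT;

    /* Otherwise the address must still resolve to a struct page. */
    if (!pfn_has_struct_page)
        return -EFAULT;

    return 0;
}
```

In this model both of my attempts give -14: model_gup_result(VM_IO | VM_PFNMAP, 0) for the remap_pfn_range() case and model_gup_result(VM_MIXEDMAP, 0) for the vm_insert_mixed() case. Only ordinary memory backed by struct page, such as model_gup_result(0, 1), passes.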

Is it possible to transfer data this way using only PeerDirect, without GPUDirect (CUDA)?

If not, could you guide me on how to enable RDMA transfers for my PCIe devices?

Thanks.

Hi,

It is not possible to transfer data using PeerDirect without using GPUDirect.

Please refer to the Mellanox GPUDirect User Manual for the system requirements (for example, CUDA), instructions for installing GPUDirect RDMA, and benchmark tests.

User Manual: https://www.mellanox.com/sites/default/files/related-docs/prod_software/Mellanox_GPUDirect_User_Manual.pdf

Regards,

Chen