PeerDirect CUDA <-> Ethernet NIC

Hello guys,

I want to test high-speed P2P sending and receiving directly into the GPU’s memory from a suitably enabled Ethernet NIC.

I understand that this requires a NIC driver that supports it. Mellanox provides GPUDirect RDMA support, which seems to include what I want and more: it copies GPU memory directly to GPU memory on another host.

As a first test application, I have a packet generator and a packet echo in mind. What I want to achieve is having both the data plane and the control plane managed by the GPU, without any copies to host memory.

I reviewed a few projects (GPUnet, GPUrdma). The most promising seems to be GPUrdma, where they altered the drivers to get access to the NIC control registers. They mention “doorbell” registers, which I think may be InfiniBand-specific.
I discovered that Mellanox added a firmware function that places the doorbell registers in user-accessible memory regions on ConnectX-4 cards and above. This is referred to as a “No Driver NIC (NODNIC)” mode.
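
To make clear what I mean by a “doorbell”, here is a rough sketch of the mechanism as I understand it, simulated entirely in host memory. All names (`tx_desc`, `sim_nic`, `post_send`, `sim_nic_process`) and the descriptor layout are made up for illustration and are not the actual Mellanox register or queue format. In my use case the producer side would run in a GPU kernel, and the doorbell would be a memory-mapped NIC register rather than a struct field.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 8u  /* hypothetical ring depth */

/* Hypothetical send descriptor: where the packet lives and how long it is. */
struct tx_desc {
    uint64_t addr;  /* bus address of the packet buffer (GPU memory in my case) */
    uint32_t len;   /* packet length in bytes */
};

/* Simulated NIC state. On real hardware the doorbell would be an MMIO register. */
struct sim_nic {
    struct tx_desc ring[RING_SIZE];
    volatile uint32_t doorbell;  /* producer index, written by software */
    uint32_t consumer;           /* next descriptor the "hardware" processes */
    uint32_t sent;               /* packets "transmitted" so far */
};

/* Producer side: fill a descriptor, then ring the doorbell.
 * Returns 0 on success, -1 if the ring is full. */
static int post_send(struct sim_nic *nic, uint32_t *producer,
                     uint64_t addr, uint32_t len)
{
    assert(nic != NULL && producer != NULL);
    if (*producer - nic->consumer >= RING_SIZE)
        return -1;  /* no free descriptor slot */

    struct tx_desc *d = &nic->ring[*producer % RING_SIZE];
    d->addr = addr;
    d->len = len;
    (*producer)++;

    /* Real hardware needs a write barrier here so the descriptor is
     * visible to the device before the doorbell write. */
    nic->doorbell = *producer;
    return 0;
}

/* Stand-in for the hardware: consume every descriptor up to the doorbell. */
static void sim_nic_process(struct sim_nic *nic)
{
    while (nic->consumer != nic->doorbell) {
        nic->consumer++;
        nic->sent++;
    }
}
```

Posting three descriptors to a zero-initialised `sim_nic` and then calling `sim_nic_process` leaves `doorbell`, `consumer` and `sent` all at 3. The point is that after setup the only interaction with the device is a plain memory write, which is why I hope it can be issued from the GPU.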

For my project, the NIC does not need to be available to the Linux operating system on the host. It would be sufficient if packets could be sent and received directly from the GPU without CPU intervention.

Could anyone confirm that my project is technically viable, and maybe point me to an existing project where this technique is used?

Many thanks!