RDMA from local host memory to remote GPU memory?

Is it possible to RDMA from local host memory to the memory of a remote GPU? I know GPUDirect RDMA can do remote GPU-to-GPU transfers, but I haven’t seen any indication of whether RDMA from local host memory is possible.

I don’t know the context of your question, but what is wrong with using the CUDA API? If you use pinned memory (where that makes sense), the driver will perform the host-to-device copy for you.
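
For reference, here is a minimal sketch of that pinned-memory path, assuming a single local GPU and the CUDA runtime API. The function name `roundtrip_pinned` and the buffer size are made up for illustration. Note that this only covers a local host-to-device copy; it does not reach a remote GPU.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: copies n floats from pinned host memory to the GPU
// and back again, then verifies the data. Returns 0 on success.
static int roundtrip_pinned(size_t n) {
    const size_t bytes = n * sizeof(float);
    float *h_src = nullptr, *h_dst = nullptr, *d_buf = nullptr;
    cudaStream_t stream = nullptr;

    // Page-locked (pinned) host buffers let the copy engine DMA directly
    // from host memory and allow cudaMemcpyAsync to be truly asynchronous.
    cudaError_t err = cudaMallocHost(&h_src, bytes);
    if (err == cudaSuccess) err = cudaMallocHost(&h_dst, bytes);
    if (err == cudaSuccess) err = cudaMalloc(&d_buf, bytes);
    if (err == cudaSuccess) err = cudaStreamCreate(&stream);

    if (err == cudaSuccess) {
        for (size_t i = 0; i < n; ++i) h_src[i] = static_cast<float>(i);
        // Host-to-device copy, queued on the stream.
        err = cudaMemcpyAsync(d_buf, h_src, bytes, cudaMemcpyHostToDevice, stream);
    }
    if (err == cudaSuccess) {
        // Device-to-host copy, used here only to check the result.
        err = cudaMemcpyAsync(h_dst, d_buf, bytes, cudaMemcpyDeviceToHost, stream);
    }
    if (err == cudaSuccess) err = cudaStreamSynchronize(stream);

    size_t mismatches = 0;
    if (err == cudaSuccess) {
        for (size_t i = 0; i < n; ++i) {
            if (h_dst[i] != h_src[i]) ++mismatches;
        }
    } else {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
    }

    // Release whatever was successfully allocated.
    if (stream) cudaStreamDestroy(stream);
    if (d_buf) cudaFree(d_buf);
    if (h_dst) cudaFreeHost(h_dst);
    if (h_src) cudaFreeHost(h_src);

    return (err == cudaSuccess && mismatches == 0) ? 0 : 1;
}

int main() {
    const int rc = roundtrip_pinned(1 << 20);  // 1 Mi floats, arbitrary size
    std::printf("%s\n", rc == 0 ? "copy verified" : "copy failed");
    return rc;
}
```

This should build with something like `nvcc pinned.cu -o pinned`; the file name is arbitrary.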

The benefit of using RDMA is that a different device, say an FPGA, can push data directly to the GPU without having to get the CPU involved.

Or are you saying a third-party device would pull from host memory and push to GPU memory? Sure, you could make that work; I would be curious to know the use case or scenario you have in mind.