Dear All,
I would like to ask for some information regarding an application that we are developing. I hope this is the correct forum, as the topic covers several areas.
We have a Xilinx FPGA that acquires a set of images, and we would like to transfer these images directly into GPU memory with GPUDirect RDMA for subsequent processing with the Python CuPy library, which is based on CUDA.
We have found this operating principle for the transfer from FPGA to GPU memory using GPUDirect:
I found that CuPy allocates the GPU array with cudaMalloc and then returns the virtual memory pointer. In addition, it seems to be possible to use cp.cuda.UnownedMemory(ptr, size, owner=None) to access a memory region not allocated by CuPy. However, since a virtual memory pointer is required, my main concern is whether RDMA works with virtual or physical memory addresses, and whether it will be possible to obtain a virtual memory pointer so that the data can be managed with CuPy.
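To make the CuPy side of the question concrete, here is a minimal sketch of how I understand an externally provided device pointer could be wrapped as a CuPy array. The function names wrap_device_pointer and buffer_nbytes, the uint16 pixel type, and the frame shape in the usage note are my own assumptions for illustration; the sketch also assumes that ptr is a valid CUDA device address in the current context, which is exactly the part I am unsure about for the RDMA case.

```python
import math


def buffer_nbytes(shape, itemsize):
    """Bytes needed for a C-contiguous array of `shape` with `itemsize`-byte elements."""
    return math.prod(shape) * itemsize


def wrap_device_pointer(ptr, shape, dtype="uint16", owner=None):
    """Wrap an existing device pointer as a CuPy array without copying.

    Hypothetical helper: `ptr` is assumed to be a valid CUDA device
    address covering at least buffer_nbytes(shape, itemsize) bytes.
    """
    # Imported lazily so that buffer_nbytes can be used on a machine without a GPU.
    import numpy as np
    import cupy as cp

    dt = np.dtype(dtype)
    nbytes = buffer_nbytes(shape, dt.itemsize)
    # UnownedMemory describes memory that CuPy did not allocate and will not free.
    mem = cp.cuda.UnownedMemory(ptr, nbytes, owner)
    memptr = cp.cuda.MemoryPointer(mem, 0)
    return cp.ndarray(shape, dtype=dt, memptr=memptr)
```

The opposite direction would be to let CuPy allocate the buffer, for example frames = cp.empty((16, 1024, 1280), dtype=cp.uint16), and to pass frames.data.ptr (the device pointer as an integer) and frames.nbytes to whatever registers the buffer for RDMA. In that case no wrapping is needed, because the array already belongs to CuPy.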
Our idea is to transfer a single block of data; does RDMA expect to work with a single block or with chunks?
How does RDMA work?
Do you have any suggestions for our application?
Thank you very much in advance for your time and help.
Kind regards
Alessandro
