I’m building a system to transfer data between an FPGA and a GPU server over a PCIe interface.
- First, I write data from FPGA to the Host memory using the PCIe-DMA engine.
- Then, I copy data from Host memory to GPU memory using cudaMemcpyAsync() for further processing.
The problem is: how does the GPU program know when the data is ready in Host memory, so that it can start copying and processing?
Currently, I can generate and send an interrupt from FPGA to the Linux PCIe kernel driver to notify the Host that data is coming. However, from the kernel, I cannot call a CUDA function to copy and process the data.
I’m a newbie in CUDA programming. Any comments and suggestions will be appreciated.
The usual method to do this would be to create a CUDA stream, launch the cudaMemcpyAsync into the stream, and then launch the kernel that is going to process the data into the same stream. The kernel will not begin processing until the data copy (the cudaMemcpyAsync previously issued into the same stream) is complete.
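A minimal sketch of that pattern might look like this. The kernel name `process_data`, the buffer names, and the doubling operation are placeholders for your actual processing, and the host buffer is assumed to be pinned (allocated with `cudaHostAlloc`) so the copy is truly asynchronous:

```cuda
#include <cuda_runtime.h>

// Placeholder kernel: stands in for whatever processing you need.
__global__ void process_data(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * 2.0f;
}

void copy_and_process(const float *host_buf, float *d_in, float *d_out, int n)
{
    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Asynchronous copy from (pinned) host memory, issued into the stream.
    cudaMemcpyAsync(d_in, host_buf, n * sizeof(float),
                    cudaMemcpyHostToDevice, stream);

    // Kernel launched into the same stream: stream ordering guarantees it
    // does not start until the copy above has finished.
    process_data<<<(n + 255) / 256, 256, 0, stream>>>(d_in, d_out, n);

    cudaStreamSynchronize(stream);  // wait for both operations to complete
    cudaStreamDestroy(stream);
}
```

If `host_buf` is ordinary pageable memory, `cudaMemcpyAsync` falls back to a synchronous-style copy with respect to the host, so pinning the DMA target buffer matters here.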
If you intend to launch the kernel prior to the data copy operation, then the kernel code will need to look at a location in memory and wait until some sort of value is written there. This is fairly complicated to get right, and not something I would recommend for a newbie.
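For illustration only, a single-block sketch of that polling approach (the mapped host flag `data_ready` and its setup are assumptions; getting this correct across multiple blocks, and with respect to what the FPGA has actually written, is considerably harder):

```cuda
// data_ready is assumed to be a zero-copy mapped host int that the
// driver/FPGA path sets to nonzero once the DMA write has landed.
__global__ void wait_then_process(volatile int *data_ready,
                                  const float *in, float *out, int n)
{
    if (threadIdx.x == 0) {
        while (*data_ready == 0) { }  // spin until the flag is written
        __threadfence_system();       // order the flag read vs. data reads
    }
    __syncthreads();                  // single-block example only

    // Placeholder processing, as before.
    for (int i = threadIdx.x; i < n; i += blockDim.x)
        out[i] = in[i] * 2.0f;
}
```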
If your question is how to send a signal from Linux kernel space to Linux user space, that is not really a CUDA programming question. You might also want to study how it is done in a GPUDirect RDMA implementation.
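One common shape for the kernel-to-user handoff is a blocking read on the driver's character device: the driver's interrupt handler wakes a wait queue, the user-space read() returns, and the CUDA calls are then made from user space. A hedged sketch, where `/dev/fpga_dma` and the one-byte "data ready" read protocol are assumptions about your driver, not a real interface:

```cuda
#include <cuda_runtime.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

// Assumed to exist elsewhere: issues cudaMemcpyAsync + kernel into a stream.
void copy_and_process(const float *host_buf, float *d_in, float *d_out, int n);

int run_event_loop(float *host_buf, float *d_in, float *d_out, int n)
{
    int fd = open("/dev/fpga_dma", O_RDONLY);  // your driver's device node
    if (fd < 0) { perror("open"); return 1; }

    char token;
    for (;;) {
        // Blocks until the driver's IRQ handler signals "DMA complete".
        if (read(fd, &token, 1) != 1)
            break;

        // Now in user space with the data in host_buf: safe to call CUDA.
        copy_and_process(host_buf, d_in, d_out, n);
    }
    close(fd);
    return 0;
}
```

Other signaling mechanisms (poll/epoll on the device node, an eventfd, or a signal delivered to the process) fit the same structure; the key point is that the CUDA API is only ever called from the user-space side.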
I got it. Thanks for your advice.