I need to verify the feasibility of a system that uses a custom PCI Express board, with an FPGA and some memory banks on it, to do frame grabbing. Each frame must be transferred to the video card so it can be processed by CUDA kernels. The frames are very high resolution and are acquired at a couple hundred frames per second, so I need very high sustained throughput.
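To give an idea of the order of magnitude, here is a rough sketch of the bandwidth estimate. The frame size, pixel depth and frame rate below are placeholders chosen only for illustration, not the actual figures of my system, and the function names are made up. The PCIe figure is the theoretical PCIe 2.0 peak (5 GT/s per lane with 8b/10b encoding, so about 500 MB/s per lane per direction); real DMA throughput will be lower.

```python
def required_bandwidth_gbs(width, height, bytes_per_pixel, fps):
    """Sustained data rate in GB/s (1 GB = 1e9 bytes) for an uncompressed stream."""
    return width * height * bytes_per_pixel * fps / 1e9


def pcie2_peak_gbs(lanes):
    """Theoretical PCIe 2.0 peak per direction: 0.5 GB/s per lane."""
    return lanes * 0.5


# Hypothetical example values (assumptions, not the real specs).
WIDTH, HEIGHT = 4096, 3072   # pixels
BYTES_PER_PIXEL = 1          # 8-bit monochrome
FPS = 200                    # "a couple hundred" frames per second

need = required_bandwidth_gbs(WIDTH, HEIGHT, BYTES_PER_PIXEL, FPS)
print(f"required: {need:.2f} GB/s")
for lanes in (4, 8, 16):
    peak = pcie2_peak_gbs(lanes)
    print(f"PCIe 2.0 x{lanes}: peak {peak:.1f} GB/s, headroom {peak - need:+.2f} GB/s")
```

If the frames have to be staged in host memory first, the same data crosses the PCIe bus twice (board to host, then host to GPU), which is why I am asking about a direct board-to-GPU path.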
I was wondering whether the newest CUDA release has made any progress on PCI Express device-to-device memcpy. The system runs on Windows 7, and custom drivers are already used to communicate with the FPGA board. I saw that GPUDirect exists, but I’m not sure it is what I’m looking for… Basically, I’d like the data to flow in real time from the FPGA board’s memory banks to GPU memory using DMA, without a CPU application copying the data in between.
Does anyone have ideas about that? Is it feasible?
I looked at GPUDirect for Video. It can only be used with Tesla cards, so I’m a bit stuck since my system uses GTX cards… Also, as far as I can tell, the DMA isn’t done directly from the FPGA side to GPU memory. That’s all I know for now…