DMA transfers over Ethernet/PCIe - is it possible?

We are using a GigE Vision camera (PtGrey Blackfly 5MP) with the Jetson. With USB3 cameras we can use DMA transfers to offload image processing to the GPU, which frees up the CPU for other tasks. Over PCIe/Ethernet we have not managed to do the same thing, and the CPU starts to become overloaded.

The block diagram of the Tegra K1 is a little ambiguous on whether using DMA for PCIe transactions is possible. Can anyone confirm or refute this as a possibility?

Jetson TK1’s gigabit RTL8111 PCIe Ethernet controller includes a DMA engine, which it already uses to transfer packets over PCIe, and which is already integrated with the IP sockets layer in Linux.

What I think you’re getting at is using ZeroCopy to avoid the extra memory copy between the memory used by sockets and the memory used by CUDA. See this post. Allocate the buffer with cudaHostAlloc() using the cudaHostAllocMapped flag and pass that pointer to PtGrey’s API (hopefully they include a mechanism for user-supplied buffers). The image should then be captured directly into the CUDA-mapped memory.
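As a rough illustration of the ZeroCopy idea, here is a minimal sketch using the CUDA runtime API. The cudaHostAlloc, cudaHostGetDevicePointer and cudaFreeHost calls are real; fake_capture_into, invert and run_zero_copy_demo are hypothetical names made up for this example, and fake_capture_into only stands in for whatever user-buffer mechanism the camera SDK may offer.

```cuda
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical stand-in for the camera SDK writing a frame into a
// user-supplied buffer. A real SDK call would go here instead.
static void fake_capture_into(unsigned char* dst, size_t n) {
    memset(dst, 100, n);
}

// Trivial "image processing" kernel: invert every pixel in place.
__global__ void invert(unsigned char* img, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) img[i] = 255 - img[i];
}

// Returns 0 on success, non-zero on failure. Expects n > 0.
int run_zero_copy_demo(size_t n) {
    unsigned char* host = nullptr;  // CPU-visible pointer
    unsigned char* dev  = nullptr;  // GPU-visible alias of the same memory

    // Ask for mapped pinned memory support. This can return an error if the
    // CUDA context is already active; in that case clear it and carry on.
    if (cudaSetDeviceFlags(cudaDeviceMapHost) != cudaSuccess) cudaGetLastError();

    if (cudaHostAlloc((void**)&host, n, cudaHostAllocMapped) != cudaSuccess)
        return 1;
    if (cudaHostGetDevicePointer((void**)&dev, host, 0) != cudaSuccess) {
        cudaFreeHost(host);
        return 2;
    }

    // The capture writes straight into the mapped buffer, no cudaMemcpy.
    fake_capture_into(host, n);

    unsigned threads = 256;
    unsigned blocks  = (unsigned)((n + threads - 1) / threads);
    invert<<<blocks, threads>>>(dev, n);

    // Wait for the kernel before the CPU reads the result back.
    cudaError_t err = cudaDeviceSynchronize();
    int rc = (err == cudaSuccess && host[0] == 155 && host[n - 1] == 155) ? 0 : 3;

    cudaFreeHost(host);
    return rc;
}

int main() {
    int rc = run_zero_copy_demo(1024);
    printf("zero-copy demo %s\n", rc == 0 ? "ok" : "failed");
    return rc;
}
```

On Tegra the CPU and GPU share the same physical DRAM, so the mapped buffer avoids a copy rather than just hiding one. Whether the camera SDK can actually capture into a buffer you allocate yourself is something to check in its documentation.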