Can anyone tell me if a PCIe device can copy directly into GPU memory?

I can’t seem to find a definitive answer on which hardware, OS, and CUDA versions support this. I see GPUDirect RDMA, but it seems to be Linux-only. I have a Titan GPU card and don’t know whether it is supported. I need this capability on Windows with my Titan card. I’m trying to get 40 Gb/s from an FPGA to the GPU over PCIe 3.0 x8. Does anyone have experience with this, and how might Unified Memory in the upcoming CUDA 6 help?
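
For reference, here is a rough sanity check of the link budget behind the 40 Gb/s target. It is only a back-of-envelope sketch: it assumes the standard PCIe 3.0 figures of 8 GT/s per lane with 128b/130b encoding, and it ignores TLP header and flow-control overhead, so the real usable rate will be lower. The function and constant names are just mine for illustration.

```python
# Back-of-envelope PCIe 3.0 link budget.
# Assumption: ignores TLP header, DLLP, and flow-control overhead,
# so the achievable payload rate will be somewhat lower than this.
GT_PER_LANE = 8.0          # PCIe 3.0 signalling rate, GT/s per lane
ENCODING = 128.0 / 130.0   # 128b/130b line-encoding efficiency
LANES = 8                  # x8 link, as in the setup described above
TARGET_GBPS = 40.0         # desired FPGA -> GPU throughput, Gb/s


def link_gbps(lanes=LANES, gt_per_lane=GT_PER_LANE, encoding=ENCODING):
    """Line rate in Gb/s, per direction, before packet overhead."""
    return lanes * gt_per_lane * encoding


budget = link_gbps()
print(f"x{LANES} budget: {budget:.1f} Gb/s ({budget / 8:.2f} GB/s)")
print(f"headroom over {TARGET_GBPS:.0f} Gb/s target: {budget - TARGET_GBPS:.1f} Gb/s")
```

By this estimate the x8 link offers roughly 63 Gb/s per direction before packet overhead, so 40 Gb/s looks feasible on paper as long as the data can go straight to the GPU without an extra copy through host memory.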