Interfacing the GPU from a camera / other devices

I am experimenting with using CUDA to speed up image processing tasks in a much larger system. The input image arrives from a camera at a rate of a few hundred MB per second. Right now, we copy the image from the hardware to the host, then to the GPU. This is a LOT of bandwidth that goes to waste.
I guess that asking for a Camera Link input on the card is too much… but what about writing the image from the acquisition device directly over PCI-Express, NOT through host memory? Theoretically, PCI-Express supports peer-to-peer transfers, which should make this possible.

Insights anyone?

Right, theoretically. In the real world, you access GPU memory by requesting DMA transfers through the NVIDIA driver, which does not expose either global or texture memory at predefined memory locations (nothing but the framebuffer on GeForce 8 and the stereo buffer on Quadro). And even if you had that information, your access to that memory could not be protected against other accesses (your data could overlap someone else's).

I think it would be a hard thing to deal with…
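
For comparison, here is a rough sketch of the path the driver does support: staging each frame in page-locked (pinned) host memory and letting the driver DMA it to the card with `cudaMemcpyAsync`. It does not remove the extra hop through the host, it only makes that hop cheaper. The frame size, the `invert` kernel and the `process_frame_staged` helper are made up for illustration, and the `memset` stands in for whatever the frame grabber would actually write into the buffer.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <cstring>

// Hypothetical frame geometry (8-bit mono); not taken from the original post.
constexpr int kWidth = 1024;
constexpr int kHeight = 1024;
constexpr size_t kFrameBytes = static_cast<size_t>(kWidth) * kHeight;

// Trivial stand-in for the real image-processing kernel: invert each pixel.
__global__ void invert(unsigned char* img, size_t n) {
    size_t i = blockIdx.x * static_cast<size_t>(blockDim.x) + threadIdx.x;
    if (i < n) img[i] = static_cast<unsigned char>(255 - img[i]);
}

// Stages one synthetic frame through pinned host memory, processes it on the
// GPU, copies it back, and reports the first output pixel.
cudaError_t process_frame_staged(unsigned char fill, unsigned char* first_pixel_out) {
    unsigned char* h_frame = nullptr;  // page-locked staging buffer on the host
    unsigned char* d_frame = nullptr;  // frame buffer in GPU global memory
    cudaStream_t stream = nullptr;

    cudaError_t err = cudaMallocHost(reinterpret_cast<void**>(&h_frame), kFrameBytes);
    if (err == cudaSuccess) err = cudaMalloc(reinterpret_cast<void**>(&d_frame), kFrameBytes);
    if (err == cudaSuccess) err = cudaStreamCreate(&stream);

    if (err == cudaSuccess) {
        // Assumption: in a real system the acquisition device fills this buffer.
        std::memset(h_frame, fill, kFrameBytes);
        // Driver-mediated DMA from pinned host memory to the GPU.
        err = cudaMemcpyAsync(d_frame, h_frame, kFrameBytes,
                              cudaMemcpyHostToDevice, stream);
    }
    if (err == cudaSuccess) {
        const unsigned int threads = 256;
        const unsigned int blocks =
            static_cast<unsigned int>((kFrameBytes + threads - 1) / threads);
        invert<<<blocks, threads, 0, stream>>>(d_frame, kFrameBytes);
        err = cudaGetLastError();  // catches launch-configuration errors
    }
    if (err == cudaSuccess)
        err = cudaMemcpyAsync(h_frame, d_frame, kFrameBytes,
                              cudaMemcpyDeviceToHost, stream);
    if (err == cudaSuccess) err = cudaStreamSynchronize(stream);
    if (err == cudaSuccess) *first_pixel_out = h_frame[0];

    // Release whatever was successfully allocated.
    if (stream) cudaStreamDestroy(stream);
    if (d_frame) cudaFree(d_frame);
    if (h_frame) cudaFreeHost(h_frame);
    return err;
}
```

Calling `process_frame_staged(10, &px)` from `main` and checking the returned `cudaError_t` is enough to try it out. With two pinned buffers and two streams, the copy of one frame could overlap the processing of the previous one, provided the card supports concurrent copy and execution.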