OpenGL/CUDA interop in a multi-GPU environment

We’ve come across a problem in our company and we couldn’t find an answer in the CUDA documentation. We have software that must guarantee the best possible response times, and the whole pipeline is OpenGL/CUDA based. The software performs OpenGL (GLSL) computations as well as CUDA computations. The OpenGL pipeline is critical and we need to keep it at 25 fps. The CUDA computations are quite heavy, which is why we offloaded them to another GPU (a GeForce GTX 580).
The OpenGL pipeline, however, runs on a Quadro FX 5800.
Our application uses a separate thread to handle the CUDA computations. As of right now, when these computations need to run, we copy the data (stored as OpenGL textures) to system RAM and forward it to the CUDA context held by the other GPU.
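For reference, here is a minimal sketch of the staging path described above, assuming an RGBA8 2D texture; `texId`, `width`, `height`, `computeDevice` and `dDst` are placeholder names, not from our actual code:

```cpp
// Sketch of the current readback path: GL texture -> pinned host RAM -> compute GPU.
// Assumes an RGBA8 2D texture and a GL context current on the calling thread.
#include <GL/gl.h>
#include <cuda_runtime.h>

void forwardTextureToComputeGpu(GLuint texId, int width, int height,
                                int computeDevice, void* dDst, cudaStream_t stream)
{
    const size_t bytes = size_t(width) * size_t(height) * 4;

    // Pinned (page-locked) host staging buffer; allocated once in a real app.
    static void* hStaging = nullptr;
    if (!hStaging)
        cudaHostAlloc(&hStaging, bytes, cudaHostAllocDefault);

    // Read the texture back into system RAM (synchronous on the GL side).
    glBindTexture(GL_TEXTURE_2D, texId);
    glGetTexImage(GL_TEXTURE_2D, 0, GL_RGBA, GL_UNSIGNED_BYTE, hStaging);

    // Upload to the CUDA device that runs the heavy kernels.
    cudaSetDevice(computeDevice);
    cudaMemcpyAsync(dDst, hStaging, bytes, cudaMemcpyHostToDevice, stream);
}
```

Using pinned memory for the staging buffer at least lets `cudaMemcpyAsync` overlap the upload with rendering, which helps protect the 25 fps budget even though the host round trip itself cannot be avoided when the two contexts live on different GPUs.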

First question: in an NVIDIA slide deck it is stated that some graphics cards use PCI Express instead of system RAM to communicate between OpenGL and CUDA. When does this happen? Could you provide us with a list of such graphics cards? It would make a significant difference if we did not need to go through system RAM at all.
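For comparison, when the CUDA context lives on the same GPU as the GL context, the graphics-resource registration API is supposed to avoid the host round trip entirely. A sketch, assuming a 2D texture, with `texId` a placeholder:

```cpp
// Sketch: direct OpenGL/CUDA interop on a single GPU via graphics-resource
// registration. No system-RAM staging is needed when both contexts share a device.
#include <GL/gl.h>
#include <cuda_runtime.h>
#include <cuda_gl_interop.h>

cudaArray* mapTextureForCuda(GLuint texId, cudaGraphicsResource** outRes)
{
    // Register once per texture; map/unmap is then cheap to do every frame.
    cudaGraphicsGLRegisterImage(outRes, texId, GL_TEXTURE_2D,
                                cudaGraphicsRegisterFlagsReadOnly);

    cudaGraphicsMapResources(1, outRes, 0);

    cudaArray* arr = nullptr;
    cudaGraphicsSubResourceGetMappedArray(&arr, *outRes, 0, 0);
    return arr;  // bind to a texture/surface, run the kernel, then unmap
}
```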

Same question for the case of a single CUDA/GL context living in one thread: is there any way to know when the GPU uses system RAM to transfer data between OpenGL and CUDA?
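As far as I can tell there is no direct "is this transfer going through RAM" query, but since CUDA 4.0 you can at least ask which CUDA device(s) back the current GL context with `cudaGLGetDevices`; if the device you map resources on is not in that list, the driver has to stage through host memory. A sketch:

```cpp
// Sketch: query which CUDA devices correspond to the current OpenGL context.
// Requires CUDA 4.0+ and a GL context current on the calling thread.
#include <cuda_runtime.h>
#include <cuda_gl_interop.h>

bool glContextIsOnDevice(int cudaDevice)
{
    unsigned int count = 0;
    int devices[8];
    cudaGLGetDevices(&count, devices, 8, cudaGLDeviceListAll);

    for (unsigned int i = 0; i < count; ++i)
        if (devices[i] == cudaDevice)
            return true;   // interop on this device stays on-GPU
    return false;          // mapping from this device crosses PCIe / host RAM
}
```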

Kind regards,