Direct inter-GPU communication isn’t supported in CUDA (or in any of the other programmable shader APIs, AFAIK), and the SLI link doesn’t function with CUDA either.
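In practice that means data going from one GPU to another has to be staged through host memory, one PCIe hop in each direction. Here is a rough sketch of that pattern, not a definitive implementation. It assumes at least two CUDA devices and a toolkit where a single host thread can switch devices with cudaSetDevice (older releases needed one host thread per GPU). The buffer names and the CHECK macro are made up for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

int main() {
    int deviceCount = 0;
    CHECK(cudaGetDeviceCount(&deviceCount));
    if (deviceCount < 2) {
        printf("This sketch needs at least two CUDA devices.\n");
        return 0;
    }

    const size_t bytes = (1 << 20) * sizeof(float);
    float *srcOnGpu0 = NULL, *dstOnGpu1 = NULL, *staging = NULL;

    // Source buffer on GPU 0.
    CHECK(cudaSetDevice(0));
    CHECK(cudaMalloc((void **)&srcOnGpu0, bytes));
    CHECK(cudaMemset(srcOnGpu0, 0, bytes));

    // Pinned host buffer, flagged portable so both devices can use it.
    CHECK(cudaHostAlloc((void **)&staging, bytes, cudaHostAllocPortable));

    // Destination buffer on GPU 1.
    CHECK(cudaSetDevice(1));
    CHECK(cudaMalloc((void **)&dstOnGpu1, bytes));

    // Hop 1: GPU 0 -> host, over PCIe.
    CHECK(cudaSetDevice(0));
    CHECK(cudaMemcpy(staging, srcOnGpu0, bytes, cudaMemcpyDeviceToHost));

    // Hop 2: host -> GPU 1, over PCIe again.
    CHECK(cudaSetDevice(1));
    CHECK(cudaMemcpy(dstOnGpu1, staging, bytes, cudaMemcpyHostToDevice));

    // Clean up, freeing each device buffer with its own device current.
    CHECK(cudaFree(dstOnGpu1));
    CHECK(cudaSetDevice(0));
    CHECK(cudaFree(srcOnGpu0));
    CHECK(cudaFreeHost(staging));
    return 0;
}
```

Neither copy touches the SLI bridge. Both go over the PCIe bus, so the effective GPU-to-GPU rate is bounded by the slower of the two hops.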
Just curious where you got the information about the SLI link being 1GB/sec. I always thought it was much slower than that, but I’ve never seen anyone’s actual measurement or documentation. Certainly there are no tech specs from NVIDIA.
I am also very curious about the 1GB/sec spec on the SLI bridge. I wasn’t sure whether the bridge sends full packets of information or just check bits back and forth. When I was at GTC 2010 I asked the NVIDIA engineers why they don’t let programmers use the SLI bridge in CUDA. They just said CUDA doesn’t support it and gave no further info. Does anyone know whether you can access the SLI bridge from straight OpenGL shader code? If so, could we write shader code for our GPGPU problems, bypassing CUDA, and get access to the SLI bridge that way?
My understanding is that the SLI bridge connector basically just transmits digital video so that scan-out can occur from either card’s video memory. It can’t be used for general purpose data transfer. All other transfers necessary for SLI graphics go over the PCIe bus.
There are some improvements in this area in the next CUDA release, so stay tuned.