Example code with MVAPICH2-GPU

Hi! Does anyone have example code (possibly for matrix multiplication) that uses CUDA with MVAPICH2-GPU? Or does anybody know another way to program CUDA with MPI so that the GPUs communicate with each other directly, without going through the CPU?

Please help me! Thank you!

I’d like to point you to the APEnet+ interconnect, which has GPUDirect support: