I have a problem where the calculations and the visualization each need their own graphics card. Is there an easy way to send data between two graphics cards in CUDA, or do I have to send it to the host first and then to the other GPU?
You will have to go through the host: copy from the CUDA device to host memory, and then from host memory to the visualisation card. Depending on how much data there is, you might want to experiment with zero-copy (mapped, page-locked) host memory on the CUDA device, if your card supports it.
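As a rough illustration of the host-staged copy described above, here is a minimal sketch. It assumes both cards are visible as CUDA devices (if the visualisation card is driven only through OpenGL, the second half would be a texture or buffer upload instead). The function names `stagedCopy` and `stagedCopyDemo` are made up for this example, and the staging buffer is page-locked host memory, which usually makes the two transfers faster than pageable memory. Newer CUDA releases (4.0 and later) also offer `cudaMemcpyPeer` for direct device-to-device copies, which may be worth trying where available.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: copy n floats from dSrc (allocated on srcDev) to
// dDst (allocated on dstDev) by staging them in page-locked host memory.
cudaError_t stagedCopy(int srcDev, const float* dSrc,
                       int dstDev, float* dDst, size_t n)
{
    const size_t bytes = n * sizeof(float);
    float* hStage = nullptr;

    // Portable pinned memory is usable from every CUDA context.
    cudaError_t err = cudaHostAlloc((void**)&hStage, bytes, cudaHostAllocPortable);
    if (err != cudaSuccess) return err;

    // Step 1: source device -> host staging buffer.
    err = cudaSetDevice(srcDev);
    if (err == cudaSuccess)
        err = cudaMemcpy(hStage, dSrc, bytes, cudaMemcpyDeviceToHost);

    // Step 2: host staging buffer -> destination device.
    if (err == cudaSuccess)
        err = cudaSetDevice(dstDev);
    if (err == cudaSuccess)
        err = cudaMemcpy(dDst, hStage, bytes, cudaMemcpyHostToDevice);

    cudaFreeHost(hStage);
    return err;
}

// Hypothetical demo: returns 0 on success, 1 on failure,
// and -1 if no CUDA device is available.
int stagedCopyDemo(size_t n)
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count < 1) return -1;

    const int srcDev = 0;
    const int dstDev = (count > 1) ? 1 : 0;  // fall back to a single card
    const size_t bytes = n * sizeof(float);

    std::vector<float> in(n), out(n, 0.0f);
    for (size_t i = 0; i < n; ++i) in[i] = static_cast<float>(i);

    float* dSrc = nullptr;
    float* dDst = nullptr;
    bool ok = true;

    // Put the "calculation result" on the source device.
    ok = ok && cudaSetDevice(srcDev) == cudaSuccess;
    ok = ok && cudaMalloc((void**)&dSrc, bytes) == cudaSuccess;
    ok = ok && cudaMemcpy(dSrc, in.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    // Allocate the receiving buffer on the destination device.
    ok = ok && cudaSetDevice(dstDev) == cudaSuccess;
    ok = ok && cudaMalloc((void**)&dDst, bytes) == cudaSuccess;

    // Move the data across via the host, then read it back to verify.
    ok = ok && stagedCopy(srcDev, dSrc, dstDev, dDst, n) == cudaSuccess;
    ok = ok && cudaSetDevice(dstDev) == cudaSuccess;
    ok = ok && cudaMemcpy(out.data(), dDst, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    for (size_t i = 0; ok && i < n; ++i) ok = (out[i] == in[i]);

    // Free each buffer with its owning device selected.
    cudaSetDevice(srcDev); cudaFree(dSrc);
    cudaSetDevice(dstDev); cudaFree(dDst);
    return ok ? 0 : 1;
}
```

Calling `stagedCopyDemo(1 << 20)` from `main` would push about 4 MB through the host. For a per-frame transfer you would probably allocate the pinned buffer once and reuse it rather than allocating inside `stagedCopy` on every call.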
Ok, if I want to do visualization through OpenGL on the same card as the calculations, is it easy to visualize an array (or texture) of data as a volume? How hard is it to visualize two volumes at the same time?