How can I copy OpenGL texture data efficiently to CUDA?
Currently I use OpenGL buffers and call glTexImage. However, this call takes very long (around 10 ms), so I assume the data is not copied directly but passes through the CPU (even if the source and destination formats of glTexImage are the same). This is about 1000 times slower than a device-to-device copy in CUDA, and even 2 times slower than a host-to-device copy.
So I would like to know if there is a way to copy the texture data from OpenGL directly to CUDA on the video card.
Do you have multiple monitors connected to your GPU? Or do you have multiple GPUs? If so, OpenGL interop will always be performed through the host. This is from the release notes, by the way.
I have used the approach from the PostProcessGL example, but its performance is bad. As described, this is probably because the copy is done through the host. I wondered whether this is intended behaviour, or whether the driver can perform format changes (as glTexImage offers) on the device.
@MisterAnderson42
Thanks for pointing me to this limitation. However, when I query the number of available devices, CUDA finds only one device.
Also I have only one monitor attached.
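For reference, the device count check mentioned above can be sketched roughly as follows. This is a minimal sketch, not necessarily the exact code used here: the helper name `countCudaDevices` is made up for illustration, and it relies only on `cudaGetDeviceCount`, `cudaGetDeviceProperties` and `cudaGetErrorString` from the CUDA runtime API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (name chosen for illustration): returns the number of
// CUDA-capable devices the runtime reports, or -1 if the query itself fails.
int countCudaDevices() {
    int deviceCount = 0;
    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return -1;
    }
    return deviceCount;
}

int main() {
    int deviceCount = countCudaDevices();
    if (deviceCount < 0) {
        return 1;
    }
    std::printf("CUDA devices found: %d\n", deviceCount);

    // Print the name of each device so it is clear which GPU is counted.
    for (int i = 0; i < deviceCount; ++i) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, i) == cudaSuccess) {
            std::printf("  device %d: %s\n", i, prop.name);
        }
    }
    return 0;
}
```

Note that this only reports the devices visible to CUDA. It says nothing about how many displays the driver considers attached, so it cannot rule out the multi-monitor case on its own.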