How to realize texture interop from CUDA to OpenGL in a multi-threaded & multi-GPU environment?

Hello,

Can anyone give some guidance on how to implement CUDA (or OpenCL) data transfer to OpenGL textures in a heavily multi-threaded application?

Currently I have an application that handles the UI (OpenGL) in one thread and uses one or more threads per GPU for data processing via CUDA (normally 3 threads driving 3 streams, to keep host-to-GPU copies, kernel execution, and GPU-to-host copies busy at the same time). Since I haven't found a good example for connecting both worlds in such a multi-GPU/multi-thread scenario, when direct display of the data is activated I currently transfer it from the CUDA context to RAM (in the thread of the stream that handled this data) and then back to the (main) GPU as an OpenGL texture in the UI thread. This of course slows the app down considerably.

Lacking a good example or documentation, I knowingly implemented it in this naive way, but now I would like to move forward. How do I get started with direct transfer to OpenGL textures? How do I make the textures from the GUI context accessible to the CUDA host threads? By the way, I haven't found a good OpenCL sample detailing this use case either.

Regards,
Ingmar

I believe the Mandelbrot sample renders to an OpenGL image buffer. Have you looked at it?
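
On the threading side, here is a minimal sketch of one possible structure. It is not taken from the sample, and `FrameHandle` and `FrameMailbox` are hypothetical names. It assumes the OpenGL context stays current only in the UI thread, so the texture itself never has to travel to the CUDA threads. Instead, each worker publishes a small descriptor of its finished device buffer, and the UI thread picks up the newest one and does the interop copy there. The CUDA calls named in the comments are the runtime's graphics interop functions; how they fit your streams and GPUs is an assumption you would need to verify.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>

// Hypothetical descriptor of a finished frame that still lives in GPU memory.
struct FrameHandle {
    int           device;    // CUDA device ordinal that owns devPtr
    const void*   devPtr;    // device pointer, never dereferenced on the host
    std::size_t   pitch;     // row pitch in bytes
    int           width;
    int           height;
    std::uint64_t sequence;  // increasing frame counter set by the producer
};

// Single-slot mailbox: workers overwrite, the UI thread takes the newest frame.
class FrameMailbox {
public:
    // Called from any CUDA worker thread once its frame is complete
    // (for example after synchronizing on an event recorded in its stream).
    void publish(const FrameHandle& f) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (seen_any_ && f.sequence <= newest_) return;  // drop stale frames
        newest_   = f.sequence;
        seen_any_ = true;
        slot_     = f;
        pending_  = true;
    }

    // Called from the UI thread once per redraw. Returns false if no new
    // frame arrived since the last call.
    bool take(FrameHandle& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!pending_) return false;
        out      = slot_;
        pending_ = false;
        return true;
    }

private:
    std::mutex    mutex_;
    FrameHandle   slot_{};
    std::uint64_t newest_   = 0;
    bool          seen_any_ = false;
    bool          pending_  = false;
};

// In the UI thread, after take() succeeds, the copy could look roughly like
// this (left as comments because it needs a live GL context and a GPU):
//   once:      cudaGraphicsGLRegisterImage(&res, tex, GL_TEXTURE_2D, flags)
//   per frame: cudaGraphicsMapResources(1, &res, stream)
//              cudaGraphicsSubResourceGetMappedArray(&array, res, 0, 0)
//              cudaMemcpy2DToArray(array, 0, 0, devPtr, pitch, rowBytes,
//                                  height, cudaMemcpyDeviceToDevice)
//              cudaGraphicsUnmapResources(1, &res, stream)
// If the frame sits on a different GPU than the one driving the display,
// a cudaMemcpyPeer into a staging buffer on the display GPU would come first.
```

With the runtime API, all host threads in a process share one context per device, so a pointer allocated in a worker thread should be usable from the UI thread after `cudaSetDevice`. The worker must not reuse the buffer until the UI thread has finished copying from it, so in practice you would rotate between at least two buffers per producer.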