Can I have one thread using OpenGL and another using CUDA, with the two sharing data buffers? XLockDisplay and glXMakeCurrent should be called before cuGLCtxCreate/cuGLInit, there should be application-level locks on any shared app data, and the OpenGL thread shouldn't write to a buffer while it is mapped into CUDA. Is that all?
I have something similar working now, but as far as I remember I couldn't get a CUDA buffer to be shared across thread boundaries (at least with CUDA 1.1), so I just copy the data through a host pointer (my input data is quite small). I only tried this with the runtime API, but I suppose the rules are the same for the driver API.
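To make the staging approach concrete, here's a rough sketch of what I mean (runtime API; the names, sizes, and surrounding locking are just illustrative, not my actual code):

```c
/* Sketch: moving data between two threads/contexts by staging
 * through host memory. N, d_src, d_dst are illustrative names. */
float host_staging[N];

/* In the producer thread (its own context current): */
cudaMemcpy(host_staging, d_src, N * sizeof(float),
           cudaMemcpyDeviceToHost);

/* Hand host_staging over under an application-level lock,
 * then in the consumer thread (its own context current): */
cudaMemcpy(d_dst, host_staging, N * sizeof(float),
           cudaMemcpyHostToDevice);
```

For small inputs the extra device-host-device round trip was acceptable for me.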
I'm not sure how host context switches affect device contexts, but if the usual OpenGL context-handling paradigm applies (as it should), then all data is owned by a context (including these buffers), and a context can be current in only one host thread at a time, which would make your use case of direct sharing across thread boundaries impossible.
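One caveat worth checking in your CUDA version: the driver API does let you detach a context from one thread and re-attach it in another, which is the closest thing to "sharing" I know of. A sketch of what that would look like (availability of these calls depends on the CUDA release, so treat this as an assumption to verify):

```c
/* Sketch (driver API): a context is current to one thread at a
 * time, but it can be migrated between threads. */
CUcontext ctx;

/* Thread A: create the context (it becomes current here)... */
cuGLCtxCreate(&ctx, 0, device);
/* ...do work, then detach it so another thread can take it: */
cuCtxPopCurrent(&ctx);

/* Thread B (after synchronizing with thread A): */
cuCtxPushCurrent(ctx);
/* ctx is now current in thread B; its allocations are usable here. */
```

Only one thread may have the context current at any moment, so this serializes access rather than enabling true concurrent sharing.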
Also, regarding my solution: I started wondering whether a device-to-device cudaMemcpy would work across two different contexts. I'm not sure, but it could be worth checking in the spec in case you have a larger buffer.
Thank you for your insight. Looks like one thread for everything, then. Each GLX context needs a Display, the Display should be locked around any Xlib/GLX calls, and the CUDA context needs a current GLX context.
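For the record, the single-threaded init order I'm converging on looks roughly like this (driver API; error checking omitted, and `visual`/`window` stand in for whatever your GLX setup provides):

```c
/* Sketch of the single-threaded setup order. */
XInitThreads();                      /* required before XLockDisplay */
Display *dpy = XOpenDisplay(NULL);

XLockDisplay(dpy);                   /* lock around Xlib/GLX calls */
GLXContext glctx = glXCreateContext(dpy, visual, NULL, True);
glXMakeCurrent(dpy, window, glctx);  /* GL context current first... */
XUnlockDisplay(dpy);

cuInit(0);
CUdevice dev;
cuDeviceGet(&dev, 0);
CUcontext cuctx;
cuGLCtxCreate(&cuctx, 0, dev);       /* ...then CUDA ctx tied to it */
```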
I assume two different CUDA contexts don't share memory, but I'm unsure. Here's another idea: two GLX contexts, one for CUDA and another for the regular OpenGL calls, created with the shareList parameter of glXCreateContext so they share texture IDs. Then I could maybe use textures (copied to a buffer in the CUDA context) to share data. I don't know if it's worth it, though; I don't trust it, since I don't know the internals of the NVIDIA drivers.
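The sharing setup itself would just be the third argument of glXCreateContext, something like this (`dpy`/`visual` assumed set up as usual):

```c
/* Sketch: two GLX contexts sharing object names via the shareList
 * argument of glXCreateContext. */
GLXContext gl_ctx   = glXCreateContext(dpy, visual, NULL,   True);
GLXContext cuda_ctx = glXCreateContext(dpy, visual, gl_ctx, True);
/* Texture and buffer-object names created in gl_ctx are now valid
 * in cuda_ctx too, so a buffer object registered with CUDA in
 * cuda_ctx could be filled from a texture rendered in gl_ctx. */
```

Whether the driver actually keeps the shared objects coherent across the two contexts in combination with CUDA is exactly the part I don't trust.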