After some time I’m rewriting a QGLWidget that takes GPU memory and displays it using the CUDA + OpenGL interoperability functions. Since those have changed a lot over the last few versions, I rebuilt the widget from scratch and ran into the problem that it only works if I call cudaGLSetGLDevice at the very beginning of my application. That is fine for my test app, but not for general usability of the widget, which will be part of a general-purpose CUDA library. I don’t want to force the user to create a (dummy) OpenGL context and set the GL device before any other work is done on the GPU. I don’t want to switch cards or anything like that during processing; I only want to use the CUDA context already registered on the current GPU for the OpenGL interop. Is there a nice and simple solution for that?
For example, when I call cudaGLSetGLDevice during OpenGL initialization I get the message: “setting the device when a process is active is not allowed”
In the meantime I got this to work on a single-GPU setup. cudaGLSetGLDevice has to be called before everything else; even a printf before the call gives an error. My desktop, however, has two graphics cards, one for display and one for computation. There, when I call cudaGLSetGLDevice(0) (or 1, it doesn’t matter), I get a segfault as soon as I call any CUDA function.
The CUDA Runtime API is too limited to handle what you’re asking (among many other limitations). You’re better off using the Driver API. The only (undocumented) limitation there is that you have to create the CUDA context while an OpenGL context (any one) is current on the calling thread. That is really a bug in CUDA, though, and shouldn’t be a limitation at all.
In multi-GPU systems though you’re going to encounter even larger flaws in CUDA…
a) You can’t do CUDA/GL interop when the CUDA context and the OpenGL context are on different devices (undocumented, and unsupported in my experience)
b) You can’t do GL device affinity on non-Windows machines.
c) You can’t do GL device affinity on consumer devices (Quadro/Tesla only)
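
Given point a), one way a widget could cope is to check at startup whether the CUDA device already in use is the one driving the OpenGL context, and only take the interop path when it is. Below is a minimal C++ sketch of that decision only. The names `DisplayPath` and `chooseDisplayPath` are hypothetical, and the fallback of copying through host memory is my own assumption about a workaround, not something CUDA prescribes. In a real widget the inputs could come from `cudaGetDevice` and, in CUDA versions that provide it, `cudaGLGetDevices`; those calls are left out here so the sketch compiles without the toolkit.

```cpp
#include <algorithm>
#include <vector>

// How a display widget could get a CUDA buffer onto the screen.
enum class DisplayPath {
    DirectInterop,  // register the GL buffer with CUDA and map it
    HostStaging     // assumed fallback: copy device -> host -> GL buffer
};

// Hypothetical helper.
// activeCudaDevice: ordinal of the device the application already computes on.
// glCudaDevices:    ordinals of the CUDA devices behind the current GL context.
inline DisplayPath chooseDisplayPath(int activeCudaDevice,
                                     const std::vector<int>& glCudaDevices) {
    // Interop is only attempted when both contexts live on the same device.
    const bool sameDevice =
        std::find(glCudaDevices.begin(), glCudaDevices.end(),
                  activeCudaDevice) != glCudaDevices.end();
    return sameDevice ? DisplayPath::DirectInterop : DisplayPath::HostStaging;
}
```

On a two-card machine like the one described above, with computation on device 1 and the display on device 0, `chooseDisplayPath(1, {0})` selects `HostStaging`, which is slower but avoids the cross-device interop case entirely.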
So either way you’re going to have a lot of problems/bugs/limitations of CUDA to work around… good luck!
Sad thing is, NVIDIA seemingly has no interest in fixing points b) and c), and between the lack of documentation in CUDA and the bugs in the two APIs (runtime and driver each have their own issues all over the place), you’re not going to get a very stable or general-purpose widget written… sadly :(