I am trying to implement a new version of the CUDA Runtime API in Ocelot on top of the CUDA Driver API 3.0, and I am having problems with OpenGL contexts and cuGLCtxCreate. My first idea was to try to create an OpenGL context for every application and fall back on cuCtxCreate if cuGLCtxCreate failed. However, cuGLCtxCreate segfaults (rather than returning an error) if it is called before glInit() or glutInit() in the host application.
My first question is whether this is how cuGLCtxCreate is supposed to work. It seems kind of fishy for an API call to segfault like this.
To get around this, I tried lazily creating an OpenGL context on the first OpenGL-related CUDA call. This works for some simple applications (all of the CUDA SDK samples except volumerender), but it fails when some resources were allocated on a regular context, an OpenGL buffer was allocated on another context, and a kernel accesses both. This is because the two contexts cannot both be active at the same time, and resources cannot be shared between contexts.
I think my only recourse at this point is either to find some way of having multiple contexts active at the same time (probably not possible) or to manually migrate state from the old context to the new one when an OpenGL call is made (difficult).