Does anybody know whether it is possible to use multiple host threads with the runtime API?
I have the following problem.
I create an OpenGL texture, then I register it and map it into CUDA's address space. This works fine so far. But if the first host thread is terminated and another thread is created that performs the same procedure, it fails for some reason,
and even worse: it freezes my computer.
The textures are defined in a separate thread.
Here are some snippets of the code that is executed by the host threads:
wglMakeCurrent(m_hDC, m_hRC);
// the next lines make my computer freeze when executed by the second host thread
// after the first host thread is destroyed
cudaGLRegisterBufferObject(m_pbuffer);
cudaGLMapBufferObject((void**)&data, m_pbuffer);
runCuda(…); // this is the function which does the processing
cudaGLUnmapBufferObject(m_pbuffer);
cudaGLUnregisterBufferObject(m_pbuffer);
SwapBuffers(m_hDC);
wglMakeCurrent(NULL, NULL);
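One thing that might at least turn the freeze into a diagnosable error: none of the interop calls above check their return codes. A sketch of the same sequence with error checking, assuming the `m_pbuffer`, `m_hDC`, `m_hRC` and `runCuda` names from the snippet above:

```cuda
// Sketch only: same sequence as above, but every CUDA call is checked so a
// failure in the second thread surfaces as an error message instead of a hang.
#include <windows.h>
#include <cuda_gl_interop.h>
#include <cstdio>

#define CUDA_CHECK(call)                                      \
    do {                                                      \
        cudaError_t err = (call);                             \
        if (err != cudaSuccess) {                             \
            fprintf(stderr, "%s failed: %s\n", #call,         \
                    cudaGetErrorString(err));                 \
            return;                                           \
        }                                                     \
    } while (0)

void processFrame()
{
    wglMakeCurrent(m_hDC, m_hRC);

    void* data = 0;
    CUDA_CHECK(cudaGLRegisterBufferObject(m_pbuffer));
    CUDA_CHECK(cudaGLMapBufferObject(&data, m_pbuffer));
    runCuda(…); // processing as in the original snippet
    CUDA_CHECK(cudaGLUnmapBufferObject(m_pbuffer));
    CUDA_CHECK(cudaGLUnregisterBufferObject(m_pbuffer));

    SwapBuffers(m_hDC);
    wglMakeCurrent(NULL, NULL);
}
```

If the register or map call returns an error in the second thread, that would confirm the problem is the CUDA context rather than OpenGL.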
After the first thread ist destroyed the OpenGL texture becomes inaccessible by other host threads.
Each thread gets its own CUDA context. When you terminate the thread in which you initialized CUDA, it automatically cleans up all the resources it used on the GPU. Even if you do not terminate that thread, device pointers, texture references and other resources are specific to that context and cannot be shared with another (like protected memory on the CPU).
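You can see this with plain device memory, no OpenGL involved. A minimal sketch, assuming the runtime behavior described above (one context per host thread, as in pre-4.0 CUDA):

```cuda
// Illustrative sketch: memory allocated in one host thread's CUDA context
// is not valid in another thread, which gets its own context.
#include <windows.h>
#include <cuda_runtime.h>
#include <cstdio>

float* d_ptr = 0; // allocated by the main thread's context

DWORD WINAPI worker(LPVOID)
{
    // First CUDA call in this thread creates a NEW context; d_ptr belongs
    // to the main thread's context, so touching it here should fail.
    cudaError_t err = cudaMemset(d_ptr, 0, 256 * sizeof(float));
    printf("worker thread: %s\n", cudaGetErrorString(err));
    return 0;
}

int main()
{
    // This call implicitly creates the main thread's context.
    cudaMalloc((void**)&d_ptr, 256 * sizeof(float));

    HANDLE h = CreateThread(NULL, 0, worker, NULL, 0, NULL);
    WaitForSingleObject(h, INFINITE);
    CloseHandle(h);

    cudaFree(d_ptr);
    return 0;
}
```

The same isolation applies to mapped OpenGL buffers: the mapping lives in the context of the thread that registered it.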
Yes, that’s right. But for the new thread the OpenGL texture, or rather its buffer, is also mapped into that thread’s address space. So far this should be okay, or let me put it this way: I haven’t found anything in the CUDA manual saying why it should not work.
With a single device, is it possible to have OpenGL and CUDA operate in two different threads?
What I want to do is create two pixel buffers in OpenGL, and map one of them into the CUDA context to be filled by a kernel while the other is displayed with OpenGL.
As expected, I get errors when I call the cudaGLRegisterBufferObject() and cudaGLMapBufferObject() functions.
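For reference, the ping-pong scheme itself is straightforward when everything lives in one host thread. A sketch, where `pbo[]`, `width`, `height`, `fillKernel` and `drawTexture` are assumed names, and both PBOs are assumed to be already created and registered with CUDA:

```cuda
// Sketch: double-buffered PBOs, single host thread for simplicity.
// Each frame, CUDA fills one PBO while OpenGL displays the other.
GLuint pbo[2];       // two GL_PIXEL_UNPACK_BUFFER objects (assumed created)
int writeIdx = 0;    // index of the buffer CUDA fills this frame
int width, height;   // image dimensions (assumed initialized)

void frame()
{
    int readIdx = 1 - writeIdx;

    // CUDA fills one PBO...
    void* d_data = 0;
    cudaGLMapBufferObject(&d_data, pbo[writeIdx]);
    // fillKernel<<<grid, block>>>((uchar4*)d_data, width, height);
    cudaGLUnmapBufferObject(pbo[writeIdx]);

    // ...while OpenGL uploads the other one into a texture and draws it.
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo[readIdx]);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                    GL_RGBA, GL_UNSIGNED_BYTE, 0); // source = bound PBO
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
    // drawTexture();

    writeIdx = readIdx; // swap roles for the next frame
}
```

The errors you see suggest the difficulty is not the scheme but splitting it across two threads, since the register/map calls must happen in the thread that owns the CUDA context.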
You want to create two pixel buffers with OpenGL in one host thread and have CUDA, operating in another host thread, process one of them at a time, right?
I think that should work, but to be honest I’m not quite sure.
Do you make a call to wglMakeCurrent before cudaGLRegisterBufferObject is called?
Sorry I cannot help you more, but I no longer have any CUDA-capable equipment.
That sounds about right. But let me be a bit more specific…
Actually I simply want to download and process an image with CUDA and subsequently display it with OpenGL. However, the CUDA stuff takes way too long and for some reason doesn’t run asynchronously. Since I don’t want to block the main render thread, I would like to run the CUDA stuff in a different thread. I don’t think it’s possible to map an OpenGL PBO from my main render thread into the CUDA context associated with my CUDA worker thread. Hence, I thought it might work if I created another OpenGL context in the CUDA worker thread and have it share a display list with the main OpenGL context.
This way I can create a PBO in the OpenGL context that runs in the same thread as CUDA, and I should be able to map that PBO into the CUDA context. But I should also be able to access the PBO from my main render thread afterwards and draw its contents…
Does this sound feasible?
And do I really need to have two OpenGL contexts in this scenario?
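If two contexts are needed, the sharing setup on Windows might look like the following sketch. The variable names are assumptions; `wglShareLists` shares buffer objects (not only display lists) between the two contexts on current drivers:

```cuda
// Sketch: a second GL context for the CUDA worker thread that shares
// buffer objects (and thus PBOs) with the main render context.
HGLRC mainRC   = wglCreateContext(m_hDC);
HGLRC workerRC = wglCreateContext(m_hDC);

// Must be called before the shared PBO is created; afterwards buffer
// objects created in either context are visible in both.
wglShareLists(mainRC, workerRC);

// In the CUDA worker thread:
//   wglMakeCurrent(m_hDC, workerRC);   // make the shared GL context current
//   cudaGLRegisterBufferObject(pbo);   // register the shared PBO with CUDA
//   ... map, run kernel, unmap ...
//
// The main render thread keeps mainRC current and can bind the same pbo
// to draw its contents.
```

Whether you really need the second context depends on whether a GL context must be current in the CUDA worker thread at register/map time; if it must, this shared-context setup is one way to satisfy that while keeping rendering in the main thread.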