I am developing a graphics application using CUDA and OpenGL. I’ve printed and read the CUDA Programming Guide, but I still have many questions about how it works.
To operate with CUDA and OpenGL, I do this:
1. Create VBO in OpenGL (glGenBuffers…)
2. Register VBO in CUDA (cudaGLRegisterBufferObject…)
3. In each loop:
3.1 Map VBO in CUDA array (cudaGLMapBufferObject…)
3.2 Call CUDA kernels to modify VBO
3.3 Unmap VBO from CUDA array (cudaGLUnmapBufferObject…)
3.4 Render VBO
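The steps above can be sketched roughly as follows. This is only an illustrative sketch, not the poster's actual code: it assumes GLEW and freeglut for the OpenGL context, and the kernel `animate`, the vertex count `N`, and the `float4` vertex layout are all hypothetical. It uses the same (now deprecated) cudaGL* calls named in the post, and error checking is omitted for brevity.

```cuda
#include <GL/glew.h>          // GLEW must be included before other GL headers
#include <GL/freeglut.h>
#include <cuda_runtime.h>
#include <cuda_gl_interop.h>

const int N = 1024;           // hypothetical vertex count
GLuint vbo = 0;
float t = 0.0f;               // animation time, advanced once per frame

// Hypothetical kernel: each thread writes one float4 vertex along a sine wave.
__global__ void animate(float4 *pos, int n, float t)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = 2.0f * i / n - 1.0f;
        pos[i] = make_float4(x, 0.5f * sinf(8.0f * x + t), 0.0f, 1.0f);
    }
}

void display()
{
    float4 *dptr = nullptr;
    cudaGLMapBufferObject((void **)&dptr, vbo);      // 3.1 map: get a device pointer
    animate<<<(N + 255) / 256, 256>>>(dptr, N, t);   // 3.2 kernel writes into the VBO
    cudaGLUnmapBufferObject(vbo);                    // 3.3 unmap: hand it back to OpenGL
    t += 0.01f;

    glClear(GL_COLOR_BUFFER_BIT);                    // 3.4 render the VBO as points
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glVertexPointer(4, GL_FLOAT, 0, 0);
    glEnableClientState(GL_VERTEX_ARRAY);
    glDrawArrays(GL_POINTS, 0, N);
    glDisableClientState(GL_VERTEX_ARRAY);
    glBindBuffer(GL_ARRAY_BUFFER, 0);
    glutSwapBuffers();
    glutPostRedisplay();                             // request the next frame
}

int main(int argc, char **argv)
{
    glutInit(&argc, argv);
    glutInitDisplayMode(GLUT_RGBA | GLUT_DOUBLE);
    glutCreateWindow("CUDA + OpenGL VBO");
    glewInit();

    glGenBuffers(1, &vbo);                           // 1. create the VBO
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, N * sizeof(float4), nullptr, GL_DYNAMIC_DRAW);
    glBindBuffer(GL_ARRAY_BUFFER, 0);

    cudaGLRegisterBufferObject(vbo);                 // 2. register it with CUDA, once

    glutDisplayFunc(display);
    glutMainLoop();
    return 0;
}
```

Note that registration happens once at start-up, while the map/unmap pair brackets every kernel launch that touches the buffer. A real application would also call cudaGLUnregisterBufferObject and glDeleteBuffers on shutdown.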
The object renders correctly and I have no problems with the implementation, but I need to know how it works inside the CPU and GPU. Could you please answer these two questions?
What exactly does cudaGLRegisterBufferObject do? Does OpenGL tell CUDA where the VBO's memory is allocated?
In each loop, when the application maps the VBO into a CUDA array with cudaGLMapBufferObject, what exactly does that mean? Is it a deep copy from the CPU memory where the VBO is allocated to GPU memory, or is it just a reference copy? I assume it is a reference copy, because when I modify the CUDA array the VBO shows the same changes, but I’m not entirely sure…
I’m worried about performance, so I need to know what happens inside my PC.
cudaGLRegisterBufferObject simply lets the OpenGL driver know that CUDA will be using this buffer object.
In the ideal case, cudaGLMapBufferObject just gets the address of the buffer object in video memory and passes it to CUDA. Sometimes (for example, if the buffer object is not resident in video memory) it may involve memory copies.
How can we ensure that an OpenGL buffer is in the GPU’s memory? AFAIR there was no explicit memory management in OpenGL, was there? My knowledge of OpenGL is very rudimentary.
Once you call cudaGLRegisterBufferObject, the internal memory management is handled by the CUDA library/driver. Simon was just letting you know that mapping the buffer may result in such memory copies.
I understood that “may involve memory copies” means that it also may not involve them if a given buffer is already in device memory. Is this correct? If so, is there a way we can be sure (or at least have a good probability) that the buffer stays on the GPU through many kernel calls and cudaGLRegisterBufferObject calls?