In our CUDA-enabled application, we use an OpenGL VBO to pass data between a CUDA kernel and OpenGL rendering. With the CUDA 2.2 driver, the following code ran smoothly.
...
unsigned int vbo;
unsigned int size = 1024;
glGenBuffersARB(1, &vbo);
//bind the vbo so the calls below apply to it
glBindBufferARB(GL_ARRAY_BUFFER_ARB, vbo);
//allocate the vbo storage, but pass NULL data.
glBufferDataARB(GL_ARRAY_BUFFER_ARB, size, NULL, GL_DYNAMIC_DRAW_ARB);
//register vbo to CUDA first
cudaGLRegisterBufferObject(vbo);
...
//fill vbo with real data later on
glBindBufferARB(GL_ARRAY_BUFFER_ARB, vbo);
//arguments are: target, offset, size, data
glBufferSubDataARB(GL_ARRAY_BUFFER_ARB, 0, size, realdata);
glBindBufferARB(GL_ARRAY_BUFFER_ARB, 0);
...
Under CUDA 2.3, however, the same code caused a problem: subsequent calls to cudaMalloc/cudaMemcpy were corrupted.
We fixed this by moving glBufferSubDataARB before cudaGLRegisterBufferObject.
The CUDA Programming Guide doesn't mention any required order for these operations, so I assume this is an internal CUDA bug.
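For anyone hitting the same issue, here is a minimal sketch of the reordered sequence: the VBO is filled first and only registered with CUDA afterwards. The helper name createFilledVbo and its parameters are hypothetical, GLEW is assumed as the loader for the ARB entry points, and a current GL context with glewInit() already called is assumed. This only illustrates the ordering that worked for us; it is not a confirmed requirement of the API.

```cuda
#include <GL/glew.h>          // assumption: GLEW provides the ARB buffer functions
#include <cuda_runtime.h>
#include <cuda_gl_interop.h>  // cudaGLRegisterBufferObject

// Hypothetical helper: create a VBO, upload its contents, then register it.
// Assumes a current OpenGL context and that glewInit() has succeeded.
cudaError_t createFilledVbo(GLuint *vbo, const void *realdata, GLsizeiptrARB size)
{
    glGenBuffersARB(1, vbo);
    glBindBufferARB(GL_ARRAY_BUFFER_ARB, *vbo);

    // Allocate the storage without initial data.
    glBufferDataARB(GL_ARRAY_BUFFER_ARB, size, NULL, GL_DYNAMIC_DRAW_ARB);

    // Fill the VBO with the real data BEFORE registering it with CUDA.
    glBufferSubDataARB(GL_ARRAY_BUFFER_ARB, 0, size, realdata);
    glBindBufferARB(GL_ARRAY_BUFFER_ARB, 0);

    // Register only after the contents are in place.
    return cudaGLRegisterBufferObject(*vbo);
}
```

Checking the returned cudaError_t against cudaSuccess right after the call should make it easier to see whether the registration itself fails or whether the corruption only shows up later.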
Regards,
-Jacques