Problem with cudaGLMapBufferObject

Hello everybody,

I have a serious problem with mapping a VBO from OpenGL to CUDA. Here is the code:

GLuint vbo;
cudaError_t err;
float4 *dptr;

glewInit();

// create buffer object
glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);

// initialize buffer object
unsigned int size = 256 * 256 * 4 * sizeof( float);
float *data = (float*)malloc(size);
glBufferData(GL_ARRAY_BUFFER, size, data, GL_DYNAMIC_DRAW);
free(data);

glBindBuffer(GL_ARRAY_BUFFER, 0);

err = cudaGLRegisterBufferObject(vbo);

err = cudaGLMapBufferObject((void**)&dptr, vbo);

// Call a kernel

err = cudaGLUnmapBufferObject(vbo);

err = cudaGLUnregisterBufferObject(vbo);

That’s almost identical to the simpleGL example from the SDK. All error codes are cudaSuccess, but after this code runs my computer hangs for a few seconds, with no interaction possible at all. Then I get control back for a moment, but shortly afterwards it hangs again. My OpenGL viewport (an interactive 3D walkthrough) is also totally messed up: the scene is rendered with flickering artifacts or not at all. When I comment out cudaGLMapBufferObject and cudaGLUnmapBufferObject, everything works fine. The SDK samples also run perfectly well. I’m really desperate and out of ideas. Is there any way to find out what’s going wrong?

Thanks for any help in advance.

Best regards,
Marco

My system:

Intel Core2 Duo
4 GB RAM
8800 GTX
CUDA 1.0
MS VS 2005

You don’t need to register and unregister the VBO each time; just register it once at the beginning.

If your machine is hanging I suspect there is an error in your kernel. Does it run correctly in emuDebug?

The register and unregister calls are just there to show the sequence of events. I don’t actually call any kernel right now; it’s really just mapping and unmapping. I also tried running a dummy kernel, and the same thing happened. The problem is the same in EmuDebug and in Debug.

I just looked at the OpenGL calls with GLIntercept in my application and compared them with the simpleGL sample. There are some interesting differences that I don’t understand.

My app:

--snip--
glGenBuffers(1,0x12f4f0)
glBindBuffer(GL_ARRAY_BUFFER,1)
glBufferData(GL_ARRAY_BUFFER,1048576,0x6820040,GL_DYNAMIC_DRAW)
glBindBuffer(GL_ARRAY_BUFFER,0)
wglGetProcAddress("glBindBufferARB")=0x1002a0b0
wglGetProcAddress("glMapBufferARB")=0x1002a290
wglGetProcAddress("glGetBufferParameterivARB")=0x10032690
glBindBufferARB(GL_ARRAY_BUFFER,1)
glGetBufferParameterivARB(GL_ARRAY_BUFFER,GL_BUFFER_SIZE,0x12f498)
glMapBufferARB(GL_ARRAY_BUFFER,GL_READ_WRITE)=0x6820000
wglGetProcAddress("glBindBufferARB")=0x1002a0b0
wglGetProcAddress("glUnmapBufferARB")=0x1002a2f0
glBindBufferARB(GL_ARRAY_BUFFER,1)
glUnmapBufferARB(GL_ARRAY_BUFFER)=true
--snip--

simpleGL:

--snip--
glGenBuffers(1,0x472124)
glBindBuffer(GL_ARRAY_BUFFER,1)
glBufferData(GL_ARRAY_BUFFER,1048576,0x4620040,GL_DYNAMIC_DRAW)
glBindBuffer(GL_ARRAY_BUFFER,0)
wglGetProcAddress("m5d8a7sk")=0x5cba50
wglGetProcAddress("u69b8a7d")=0x5cbab0
wglGetProcAddress("glGpuSyncGetHandleSizeNVX")=0x5cbb10
wglGetProcAddress("glGpuSyncInitNVX")=0x5cbb70
wglGetProcAddress("glGpuSyncEndNVX")=0x5cbbd0
wglGetProcAddress("glGpuSyncMapBufferNVX")=0x5cbc30
wglGetProcAddress("glGpuSyncUnmapBufferNVX")=0x5cbc90
wglGetProcAddress("glGpuSyncCopyBufferNVX")=0x5cbcf0
wglGetProcAddress("glGpuSyncAcquireNVX")=0x5cbd50
wglGetProcAddress("glGpuSyncReleaseNVX")=0x5cbdb0
glGpuSyncGetHandleSizeNVX( ??? )
glGpuSyncInitNVX( ??? )
glGetError()=GL_NO_ERROR
m5d8a7sk( ??? )
u69b8a7d( ??? )
glGetError()=GL_NO_ERROR
m5d8a7sk( ??? )
glGpuSyncReleaseNVX( ??? )
glGpuSyncAcquireNVX( ??? )
u69b8a7d( ??? )
glBindBuffer(GL_ARRAY_BUFFER,1)
glMapBuffer(GL_ARRAY_BUFFER,GL_READ_ONLY)=0x4630000
glUnmapBuffer(GL_ARRAY_BUFFER)=true
--snip--

In simpleGL there seem to be additional calls, such as glGpuSyncGetHandleSizeNVX, for the same piece of code. It also looks like simpleGL doesn’t use wglGetProcAddress to fetch the ARB extension functions. Could that be the problem?

I have found the problem. The OpenGL error flag in my application had been set to GL_INVALID_OPERATION somewhere else in my code. cudaGLRegisterBufferObject seems to check the flag and is interrupted. Unfortunately it doesn’t return an error itself, but GLIntercept logged it. I just called glGetError() before registering/mapping my VBO, which cleared the flag. Now everything runs smoothly.
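In case it helps anyone else, here is a rough sketch of that workaround as a small helper. The name drainGlErrors and the maxReads limit are made up for illustration; they are not part of CUDA or OpenGL. The error getter is passed in as a parameter only so the sketch compiles without a GL context; in a real application it would be glGetError. It loops because an OpenGL implementation may hold more than one error flag, so a single glGetError() call is not guaranteed to clear everything.

```cpp
#include <cassert>
#include <cstdio>
#include <functional>

// Value of GL_NO_ERROR in the OpenGL headers.
constexpr unsigned int kGlNoError = 0;

// Hypothetical helper (not part of CUDA or OpenGL): calls getError until it
// reports no error and returns how many stale error codes were discarded.
// maxReads bounds the loop in case the getter never reports "no error"
// (an assumption: this can happen when no GL context is current).
int drainGlErrors(const std::function<unsigned int()>& getError,
                  int maxReads = 64)
{
    assert(static_cast<bool>(getError));  // a getter such as glGetError is required

    int discarded = 0;
    while (discarded < maxReads) {
        const unsigned int code = getError();
        if (code == kGlNoError) {
            break;  // all error flags are clear
        }
        // Log the stale error so its real source can be tracked down later.
        std::fprintf(stderr, "stale GL error 0x%04X cleared\n", code);
        ++discarded;
    }
    return discarded;
}
```

In the application I would call it as drainGlErrors([] { return (unsigned int)glGetError(); }) right before cudaGLRegisterBufferObject(vbo). A non-zero return value means some earlier OpenGL call failed, and that call is still worth finding and fixing.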

Best regards,
Marco