cudaErrorLaunchFailure with PBO mapping


I have a problem with mapping/unmapping a PBO from OpenGL to CUDA. I did everything as in the postProcessGL example, but for some reason I get a cudaErrorLaunchFailure after calling cudaGLMapBufferObject. I haven’t launched any kernel beforehand: I just create the PBO, register it, and map it. I don’t even launch a kernel between the map and the unmap.

The weird thing is that the error doesn’t show up until the program (an OpenGL scene walkthrough) has been running for about 15-45 s. The PBO is mapped and unmapped every frame, so the problem only appears after a few hundred to a few thousand rendered frames.

What, other than a bad kernel launch, can generally trigger a cudaErrorLaunchFailure? Could it be a bad OpenGL state?
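
To narrow down where the error first appears, the per-frame map/unmap could be instrumented roughly as below. This is only a sketch, not my actual code: `CHECK_CUDA`, `checkGL`, `mapUnmapOncePerFrame`, and the `pbo` parameter are hypothetical names, and it assumes GLEW supplies the OpenGL headers and that the buffer was already registered with cudaGLRegisterBufferObject. Since CUDA calls can return an error left over from earlier asynchronous work, the sketch synchronizes and checks the OpenGL error state both before the map and after the unmap.

```cuda
#include <cstdio>
#include <cstdlib>
#include <GL/glew.h>           // assumption: GLEW provides the GL declarations
#include <cuda_runtime.h>
#include <cuda_gl_interop.h>   // cudaGLMapBufferObject / cudaGLUnmapBufferObject

// Hypothetical helper: abort with file/line as soon as a CUDA call fails.
#define CHECK_CUDA(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s failed: %s\n",                 \
                    __FILE__, __LINE__, #call,                        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical helper: report any pending OpenGL error at a named point.
static void checkGL(const char* where)
{
    GLenum e = glGetError();
    if (e != GL_NO_ERROR) {
        fprintf(stderr, "OpenGL error 0x%x at %s\n", (unsigned)e, where);
    }
}

// Called once per frame; pbo is assumed to be registered already.
void mapUnmapOncePerFrame(GLuint pbo)
{
    // Surface any error left over from earlier asynchronous work
    // before touching the buffer.
    CHECK_CUDA(cudaThreadSynchronize());
    checkGL("before map");

    void* devPtr = NULL;
    CHECK_CUDA(cudaGLMapBufferObject(&devPtr, pbo));

    // No kernel is launched here, matching the situation described above.

    CHECK_CUDA(cudaGLUnmapBufferObject(pbo));
    CHECK_CUDA(cudaThreadSynchronize());
    checkGL("after unmap");
}
```

If the first failing check turns out to be the synchronize before the map rather than the map itself, that would suggest the error originates somewhere earlier in the frame and is only being reported at the map call.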

My System:
Core 2 Quad
8800 GTX
CUDA 2.0