We have a CUDA application whose results we want to pass directly to the graphics card for visualization, instead of copying them back to host memory and displaying them with the standard OpenGL methods.
So we created a pixel buffer object (PBO) and mapped it to a global-memory pointer so that our CUDA kernels can write to this buffer. After the kernel finishes, our results are stored in the PBO.
But we are having a hard time passing these results directly from the PBO to texture memory. Our plan is to use glTexSubImage2D to do this transfer device to device.
What we have observed is that it works in the emulator but not on the device. We are sure that the results stored in the PBO are correct in both debug and release mode (we copied the results from the PBO back to host memory and checked them).
Given that the results are already in the PBO, this is our visualization code:
glViewport(0, 0, projection_width, projection_height);
// set up projection matrix
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glOrtho( 0, projection_width, 0, projection_height, -1000, 1000);
// Draw textured rectangle
glEnable( GL_BLEND );
glEnable( GL_TEXTURE_2D );
CUDA_SAFE_CALL( cudaGLUnmapBufferObject(gl_PBO) );
CUDA_SAFE_CALL( cudaGLUnregisterBufferObject(gl_PBO) );
// glBindTexture( GL_TEXTURE_2D, gl_Tex );
glTexImage2D( GL_TEXTURE_2D, 0, GL_RGB, projection_width, projection_height, 0, GL_RGB, GL_UNSIGNED_BYTE, pbo_buffer );
glBegin( GL_QUADS );
glTexCoord2f(0,0 ); glVertex2f( 0, 0 );
glTexCoord2f(1,0 ); glVertex2f( width, 0 );
glTexCoord2f(1,1 ); glVertex2f( width, height );
glTexCoord2f(0,1 ); glVertex2f( 0, height);
glEnd();
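For reference when reading the glTexImage2D call above, here is a minimal sketch of how OpenGL steps through rows of GL_RGB / GL_UNSIGNED_BYTE source data. It assumes the default GL_UNPACK_ALIGNMENT of 4 and the default GL_UNPACK_ROW_LENGTH of 0 are still in effect, which the post does not state. With those defaults every source row is expected to start on a 4-byte boundary, so a kernel that writes tightly packed RGB bytes only matches that layout when projection_width * 3 is a multiple of 4. The helper names unpackRowStride and rgbByteOffset are hypothetical, and this may or may not be related to the behaviour described here.

```cpp
#include <cstddef>

// Bytes OpenGL advances per source row when unpacking pixel data.
// Assumption: GL_UNPACK_ROW_LENGTH is 0 (default), so the row length is `width`.
// `alignment` mirrors GL_UNPACK_ALIGNMENT (1, 2, 4 or 8; default 4).
std::size_t unpackRowStride(std::size_t width, std::size_t bytesPerPixel,
                            std::size_t alignment = 4) {
    const std::size_t tight = width * bytesPerPixel;          // unpadded row size
    return (tight + alignment - 1) / alignment * alignment;   // round up to alignment
}

// Byte offset of channel c (0 = R, 1 = G, 2 = B) of pixel (x, y)
// in a GL_RGB / GL_UNSIGNED_BYTE buffer laid out the way OpenGL reads it.
std::size_t rgbByteOffset(std::size_t x, std::size_t y, std::size_t c,
                          std::size_t width, std::size_t alignment = 4) {
    return y * unpackRowStride(width, 3, alignment) + x * 3 + c;
}
```

A kernel writing into the mapped PBO could use the same offset arithmetic. Alternatively, calling glPixelStorei(GL_UNPACK_ALIGNMENT, 1) before the upload tells OpenGL to expect tightly packed rows.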
By the way, from the FAQ on the CUDA forum, we learned that CUDA does not allow kernel code to access OpenGL texture memory directly. Instead, we can make a copy of the texture memory and write it back after modification. I saw a function called “cudaBindTextureToArray” in the sample code, but it does not seem to give us a pointer to the texture memory: in those demos it always binds a texture to a cudaArray that the user has created and allocated.
So is there a function that gives us access to the real texture memory, so that we can use cudaMemcpy (texture memory, PBO buffer, size, cudaMemcpyDeviceToDevice) to visualize the result directly?