How to send CUDA results to an OpenGL texture?

We have a CUDA application in which we want to pass the results directly to the graphics card for visualization, instead of copying them back to host memory and displaying them with the standard OpenGL methods.

So we created a pixel buffer object (PBO) and mapped it to a pointer into global memory so that our CUDA kernels can write to this buffer. After the kernel has run, our results are stored in this PBO.
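For readers following along, the setup described above might look roughly like the sketch below. It is not the poster's actual code: the names `fill_rgb`, `create_pbo` and `run_kernel_into_pbo` are hypothetical, GLEW is assumed as the extension loader, and error checking is omitted. It uses the same `cudaGL*BufferObject` calls that appear in this thread (later CUDA releases deprecate them in favour of the `cudaGraphics*` interop calls).

```cuda
#include <GL/glew.h>          // assumption: GLEW provides the buffer-object entry points
#include <cuda_runtime.h>
#include <cuda_gl_interop.h>

// Hypothetical kernel: writes one tightly packed RGB pixel per thread.
__global__ void fill_rgb(unsigned char* out, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    int i = 3 * (y * width + x);
    out[i + 0] = (unsigned char)(255 * x / width);   // red ramp
    out[i + 1] = (unsigned char)(255 * y / height);  // green ramp
    out[i + 2] = 0;
}

// Assumes an OpenGL context is current and glewInit() has succeeded.
GLuint create_pbo(int width, int height)
{
    GLuint pbo = 0;
    glGenBuffers(1, &pbo);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    // Allocate storage only; the kernel fills it later.
    glBufferData(GL_PIXEL_UNPACK_BUFFER, (GLsizeiptr)width * height * 3,
                 NULL, GL_DYNAMIC_DRAW);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);

    cudaGLRegisterBufferObject(pbo);   // make the buffer known to CUDA
    return pbo;
}

void run_kernel_into_pbo(GLuint pbo, int width, int height)
{
    unsigned char* d_pixels = NULL;
    cudaGLMapBufferObject((void**)&d_pixels, pbo);   // device pointer into the PBO

    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x,
              (height + block.y - 1) / block.y);
    fill_rgb<<<grid, block>>>(d_pixels, width, height);

    cudaGLUnmapBufferObject(pbo);      // hand the buffer back to OpenGL
}
```

The buffer has to be unmapped from CUDA before OpenGL reads from it, which is why the unmap sits at the end of the kernel step.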

But we are having a hard time passing these results directly from the PBO to texture memory. We planned to use glTexSubImage2D for this, so that the data transfer stays device to device.

What we observe is that it works in the emulator but not on the device. We are sure that the results stored in the PBO are correct in both debug and release mode (we copied the results from the PBO to host memory and checked them).

Given that we already have the result in the PBO, the following is our visualization code:

glViewport(0, 0, projection_width, projection_height);

// set up projection matrix
glOrtho( 0, projection_width, 0, projection_height, -1000, 1000);

// Draw textured rectangle
glEnable( GL_BLEND );
glEnable( GL_TEXTURE_2D );

CUDA_SAFE_CALL( cudaGLUnmapBufferObject(gl_PBO) );
CUDA_SAFE_CALL( cudaGLUnregisterBufferObject(gl_PBO) );

// glBindTexture( GL_TEXTURE_2D, gl_Tex );
glTexImage2D( GL_TEXTURE_2D, 0, GL_RGB, projection_width, projection_height, 0, GL_RGB, GL_UNSIGNED_BYTE, pbo_buffer );

glBegin( GL_QUADS );

	glTexCoord2f( 0, 0 );	glVertex2f( 0, 0 );
	glTexCoord2f( 1, 0 );	glVertex2f( width, 0 );
	glTexCoord2f( 1, 1 );	glVertex2f( width, height );
	glTexCoord2f( 0, 1 );	glVertex2f( 0, height );

glEnd();

By the way, from the FAQ on the CUDA forum we learned that CUDA does not allow kernel code to access OpenGL texture memory directly. Instead, we can make a copy of the texture memory and write it back after modification. I saw that there is a function called “cudaBindTextureToArray” in the sample code, but this function does not seem to give us a pointer to the texture memory: in those samples it always binds a texture to a cudaArray that is created and allocated by the user.

So is there a function that gives us access to the real texture memory, so that we can use cudaMemcpy with cudaMemcpyDeviceToDevice to copy from the PBO into texture memory and visualize the result directly?


Have you looked at the boxfilter and postProcessGL examples in the SDK? These both display data generated in CUDA using OpenGL.

You need to bind the PBO (to the GL_PIXEL_UNPACK_BUFFER target) before you make the glTexImage2D call so that it loads the texture data from the PBO. You also need to specify zero as the data pointer: with a PBO bound, that argument is interpreted as a byte offset into the buffer rather than the usual host memory pointer.


glTexImage2D( GL_TEXTURE_2D, 0, GL_RGB, projection_width, projection_height, 0, GL_RGB, GL_UNSIGNED_BYTE, 0 );
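Put together, the upload step might look something like the sketch below. It is only an illustration under stated assumptions: `upload_pbo_to_texture` is a hypothetical helper, GLEW is assumed as the loader, the texture is assumed to have been allocated once beforehand at the same size, and the kernel is assumed to have written tightly packed RGB bytes (hence the unpack alignment of 1). It uses glTexSubImage2D, as the original post intended, so the texture is not reallocated every frame.

```cuda
#include <GL/glew.h>          // assumption: GLEW provides the buffer-object entry points
#include <cuda_runtime.h>
#include <cuda_gl_interop.h>

// Assumes a current OpenGL context, a PBO currently mapped by CUDA that holds
// the kernel output, and a texture `tex` already allocated with glTexImage2D
// at width x height. Error checking is omitted for brevity.
void upload_pbo_to_texture(GLuint pbo, GLuint tex, int width, int height)
{
    // CUDA must release the buffer before OpenGL reads from it.
    cudaGLUnmapBufferObject(pbo);

    // With a buffer bound to GL_PIXEL_UNPACK_BUFFER, texture uploads read
    // from that buffer instead of from host memory.
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    glBindTexture(GL_TEXTURE_2D, tex);

    // Assumption: rows are tightly packed RGB; the default alignment of 4
    // would skew the image whenever width * 3 is not a multiple of 4.
    glPixelStorei(GL_UNPACK_ALIGNMENT, 1);

    // The last argument is a byte offset into the PBO, not a host pointer.
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                    GL_RGB, GL_UNSIGNED_BYTE, 0);

    // Unbind so later uploads from host pointers behave normally.
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
}
```

In this sketch the PBO stays registered with CUDA across frames and is only mapped and unmapped around each kernel launch; registering and unregistering every frame should not be necessary.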

I recommend reading the pixel buffer object spec if you haven’t done so:…ffer_object.txt

Hi! Please help me.

I use CUDA with OpenGL for an image processing task in the following way:

  1. create source and destination buffer objects
  2. read pixels from the frame buffer into the source buffer object
  3. process the source buffer and put the result in the destination buffer
  4. then draw the result into the frame buffer again
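As a rough illustration of steps 2 and 3, a sketch could look like the following. The kernel `invert` and the helper `process_frame` are hypothetical placeholders, GLEW is assumed as the loader, both buffer objects are assumed to hold width * height * 4 bytes and to be registered with cudaGLRegisterBufferObject already, and error checking is omitted.

```cuda
#include <GL/glew.h>          // assumption: GLEW provides the buffer-object entry points
#include <cuda_runtime.h>
#include <cuda_gl_interop.h>

// Hypothetical processing step: inverts every byte of the image.
__global__ void invert(const unsigned char* src, unsigned char* dst, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) dst[i] = (unsigned char)(255 - src[i]);
}

// Assumes a current OpenGL context and two registered buffer objects.
void process_frame(GLuint src_pbo, GLuint dst_pbo, int width, int height)
{
    // Step 2: read the frame buffer into the source buffer object.
    // With a buffer bound to GL_PIXEL_PACK_BUFFER, the last argument of
    // glReadPixels is a byte offset into that buffer.
    glBindBuffer(GL_PIXEL_PACK_BUFFER, src_pbo);
    glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, 0);
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);

    // Step 3: map both buffers into CUDA and run the kernel.
    unsigned char* d_src = NULL;
    unsigned char* d_dst = NULL;
    cudaGLMapBufferObject((void**)&d_src, src_pbo);
    cudaGLMapBufferObject((void**)&d_dst, dst_pbo);

    int n = width * height * 4;
    invert<<<(n + 255) / 256, 256>>>(d_src, d_dst, n);

    // Release both buffers so OpenGL can use the result (step 4).
    cudaGLUnmapBufferObject(dst_pbo);
    cudaGLUnmapBufferObject(src_pbo);
}
```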

I saw the postProcess example, but creating a texture from the destination data (glTexImage2D) is too slow: it takes 3 ms on our hardware.

Can I use the new G80 OpenGL extension GL_EXT_texture_buffer_object and the TEXTURE_BUFFER_EXT texture type to attach the data in the destination buffer object to a texture of that type, and then map that texture onto polygon vertices in the usual way? If it is possible, how do I do that?

Thank you!