Direct access to frame buffer?

Sorry for the newb questions…

  1. Can I get direct access to the frame buffer from CUDA code? There was a similar question a couple of years ago, and the answer was ‘no’, but this seems so bizarre that I thought I’d ask it again to make sure that nothing’s changed.

  2. If I can’t get direct access, then presumably I’m stuck with a 2-stage copy: (a) an OpenGL call (I only started on OpenGL yesterday - no idea exactly how to do this) to read the buffer out to the CPU, and then (b) a cudaMemcpy to get the data back out to the graphics card?

  3. Does anyone have any idea on the overall streaming data rate I’d expect when using this 2-stage copy to read the frame buffer?

  4. I want my kernel to continuously read the frame buffer and process it. Is this even possible, given that the host CPU has to initiate the buffer transfer out of the graphics card into the host memory space?

Thanks -


I think you’re better off using one of these methods:

For reading the default framebuffer data, you can simply call glReadPixels with a buffer object bound to GL_PIXEL_PACK_BUFFER, and afterwards map that buffer object for use in CUDA.
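A rough sketch of that read path, in case it helps. It assumes a current OpenGL context on the same GPU that CUDA is using, GLEW for the buffer-object entry points, and the cudaGraphics* interop API from the CUDA runtime. The function name processWindowContents and the invertPixels kernel are made up for illustration, and error checking is left out.

```cuda
#include <GL/glew.h>          // assumption: GLEW supplies the buffer-object functions
#include <cuda_runtime.h>
#include <cuda_gl_interop.h>

// Hypothetical kernel: inverts the colour of each RGBA pixel in place.
__global__ void invertPixels(uchar4* pixels, int count)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count) {
        uchar4 p = pixels[i];
        pixels[i] = make_uchar4(255 - p.x, 255 - p.y, 255 - p.z, p.w);
    }
}

// Reads the current window's default framebuffer into a buffer object and
// runs a kernel on it. The pixel data never passes through host memory.
void processWindowContents(int width, int height)
{
    const size_t bytes = size_t(width) * size_t(height) * 4;   // RGBA, 8 bits each

    // Create a buffer object to receive the pixels.
    GLuint pbo = 0;
    glGenBuffers(1, &pbo);
    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
    glBufferData(GL_PIXEL_PACK_BUFFER, bytes, nullptr, GL_DYNAMIC_COPY);

    // With a pack buffer bound, the last argument is a byte offset into
    // that buffer rather than a host pointer.
    glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);

    // Register the buffer with CUDA and map it to get a device pointer.
    cudaGraphicsResource* resource = nullptr;
    cudaGraphicsGLRegisterBuffer(&resource, pbo, cudaGraphicsRegisterFlagsNone);
    cudaGraphicsMapResources(1, &resource, 0);

    uchar4* devPixels = nullptr;
    size_t mappedBytes = 0;
    cudaGraphicsResourceGetMappedPointer((void**)&devPixels, &mappedBytes, resource);

    const int count = width * height;
    invertPixels<<<(count + 255) / 256, 256>>>(devPixels, count);

    // Unmap before OpenGL touches the buffer again, then clean up.
    cudaGraphicsUnmapResources(1, &resource, 0);
    cudaGraphicsUnregisterResource(resource);
    glDeleteBuffers(1, &pbo);
}
```

In a real loop you would create and register the buffer once, and only repeat the glReadPixels / map / kernel / unmap steps each frame.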

For writing, either:

  1. Write your CUDA data into a mapped OpenGL buffer object
  2. Write the buffer data to the default framebuffer using glDrawPixels with the buffer object bound to GL_PIXEL_UNPACK_BUFFER

or:

  1. Write your CUDA data into a mapped OpenGL buffer object
  2. Call glTex(Sub)Image2D to copy the buffer data into a texture
  3. Use FBO blitting to blit the texture data to the default framebuffer (or just render a textured quad…)

The last method requires an extra copy, but it seems that glDrawPixels is marked as deprecated in OpenGL 3.x.
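For what it's worth, the texture-and-blit variant might look roughly like this. It is only a sketch: it assumes a current OpenGL 3.0+ context, GLEW, and that the caller has already created the objects passed in (a buffer object of width*height*4 bytes registered with cudaGraphicsGLRegisterBuffer, an RGBA8 texture of the same size, and an FBO with that texture attached as GL_COLOR_ATTACHMENT0). The names drawFromCuda and fillGradient are invented for the example, and error checking is omitted.

```cuda
#include <GL/glew.h>          // assumption: GLEW supplies the GL 3.0 entry points
#include <cuda_runtime.h>
#include <cuda_gl_interop.h>

// Hypothetical kernel: fills the image with a horizontal grey ramp.
__global__ void fillGradient(uchar4* pixels, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {
        int level = (width > 1) ? (255 * x) / (width - 1) : 0;
        unsigned char v = (unsigned char)level;
        pixels[y * width + x] = make_uchar4(v, v, v, 255);
    }
}

// resource: the CUDA registration of pbo (assumed to be set up by the caller)
// tex:      RGBA8 texture of size width x height
// fbo:      framebuffer object with tex attached as GL_COLOR_ATTACHMENT0
void drawFromCuda(cudaGraphicsResource* resource, GLuint pbo,
                  GLuint tex, GLuint fbo, int width, int height)
{
    // 1. Write CUDA data into the mapped buffer object.
    cudaGraphicsMapResources(1, &resource, 0);
    uchar4* devPixels = nullptr;
    size_t mappedBytes = 0;
    cudaGraphicsResourceGetMappedPointer((void**)&devPixels, &mappedBytes, resource);

    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);
    fillGradient<<<grid, block>>>(devPixels, width, height);

    cudaGraphicsUnmapResources(1, &resource, 0);

    // 2. Copy the buffer into the texture. With an unpack buffer bound,
    //    the last argument is a byte offset into the buffer.
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                    GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);

    // 3. Blit from the FBO that holds the texture to the default framebuffer.
    glBindFramebuffer(GL_READ_FRAMEBUFFER, fbo);
    glBindFramebuffer(GL_DRAW_FRAMEBUFFER, 0);
    glBlitFramebuffer(0, 0, width, height, 0, 0, width, height,
                      GL_COLOR_BUFFER_BIT, GL_NEAREST);
    glBindFramebuffer(GL_READ_FRAMEBUFFER, 0);
}
```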

In all of the above methods, the data remains on the GPU and there are no transfers between CPU and GPU, so they should be pretty fast.


Thanks (I don’t need to write); I hadn’t realised that these buffer objects remained on the graphics card.

However, I’ve now had a chance to play with OpenGL a bit more, and (I think) realised something which should have been obvious: glReadPixels only reads the current window, and not the physical frame buffer. This app needs to process the entire physical frame buffer (everything that results in a displayed pixel on the monitor).

Does anyone know if this is possible from OpenGL? Or from DirectX (I’m just starting to read the DirectX docs)? Or at a physical level from the CUDA code, if I can find the address of the physical frame buffer?

Actually, is this even possible, if the graphics card supports overlays? I only have to support nVidia cards, and I’m currently assuming that there’s a real linear memory region that maps directly to the displayed pixels, but I’m not sure about this.

Thanks -