Hi, I am writing a program that uses OpenGL to render a 3D scene and CUDA to count the number of pixels that have a certain value. Could anyone advise me on how to access the color buffer from within a CUDA program? Thanks!
You can’t directly. I haven’t tried it this way (only the other way, CUDA to GL), but I guess you can create an OpenGL pixel pack buffer, read the contents of the framebuffer into it, and then let CUDA bind to it. That gives you a device pointer to the buffer, which you can pass to your CUDA code.
Note that, if I recall correctly, OpenGL cannot write to a buffer object while it is bound to CUDA (it can only read from it), so I guess you must bind and unbind the buffer to CUDA inside the render loop.
The CUDA post processing example posted to this forum answers all your questions.
I got it. Thanks!
Is the read from the framebuffer into the pixel buffer object done on the GPU? I tried reading 54000 times from a 320 x 240 framebuffer and it took 10 sec (on an 8800 GTX). Is this considered fast?
Yes, the copy from the framebuffer to the pixel buffer is done entirely on the GPU in video memory. These functions may be optimized in a future CUDA release.
That actually sounds a bit slow to me. 54000 x 320 x 240 x 4 bytes (RGBA) in 10 sec is about 1.5 GByte/sec. You should be able to get this even when downloading over the bus. Are you doing something else in these 10 sec?
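To make the arithmetic behind that estimate easy to re-run with other frame sizes or pixel formats, here is a minimal C++ sketch. The function names `total_bytes` and `bytes_per_second` are made up for illustration, and the figures assume tightly packed pixels with no row padding.

```cpp
#include <cassert>
#include <cstdint>

// Bytes moved by `reads` full-frame readbacks of a width x height image,
// assuming tightly packed pixels (no row padding).
std::uint64_t total_bytes(std::uint64_t reads, std::uint64_t width,
                          std::uint64_t height, std::uint64_t bytes_per_pixel) {
    return reads * width * height * bytes_per_pixel;
}

// Average throughput in bytes per second over the measured interval.
double bytes_per_second(std::uint64_t bytes, double seconds) {
    assert(seconds > 0.0);
    return static_cast<double>(bytes) / seconds;
}
```

With the numbers from this thread, `bytes_per_second(total_bytes(54000, 320, 240, 4), 10.0)` is about 1.66e9 bytes/sec, which is roughly 1.54 GByte/sec when a GByte is taken as 2^30 bytes. Passing 1 instead of 4 bytes per pixel gives the figure for a single-channel readback.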
I also think that it is a bit slow. It takes as much time as copying to RAM.
Below is how I get the 10sec time.
… rendering into frame and bind pixel buffer (negligible time)…
// With a pixel pack buffer bound, NULL is a byte offset of 0 into that buffer.
for (int t = 0; t < 54000; t++)
    glReadPixels(0, 0, WIDTH, HEIGHT, GL_RED, GL_UNSIGNED_BYTE, NULL);
… post processing (negligible time) …
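The post-processing step is not shown in the thread. Since the stated goal is to count pixels that have a certain value, a plain CPU reference like the one below may be handy for checking whatever the CUDA kernel returns. This is only a sketch: `count_pixels` is a hypothetical name, and it assumes a tightly packed, single-channel, 8-bit image such as the one produced by the GL_RED / GL_UNSIGNED_BYTE readback above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// CPU reference: count how many pixels of a tightly packed, single-channel
// 8-bit image are equal to `target`.
std::size_t count_pixels(const std::uint8_t* pixels, std::size_t width,
                         std::size_t height, std::uint8_t target) {
    assert(pixels != nullptr || width * height == 0);
    std::size_t count = 0;
    for (std::size_t i = 0; i < width * height; ++i) {
        if (pixels[i] == target) {
            ++count;
        }
    }
    return count;
}
```

Running this on a copy of the buffer read back to host memory and comparing the result with the GPU count is a cheap correctness check before worrying about speed.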
Ah, so you are using only one framebuffer channel (GL_RED). That actually brings the bandwidth down to about 395 MByte/sec. All I can think of is that the driver is blocking while doing the transfer because the target is a CUDA buffer. glReadPixels between two OpenGL buffers does not block (or at least seems to hide that somehow). Did you do a comparison with a glReadPixels transfer between the framebuffer and an OpenGL PBO?
What do you mean by a comparison with a glReadPixels transfer between the framebuffer and an OpenGL PBO? The transfer is already between two OpenGL buffers (the framebuffer and the pixel buffer). There is no CUDA buffer.
You are right. The PBOs get borrowed from OpenGL by CUDA. My bad.
Then I can think of two other possible reasons:
Make sure the PBO isn’t registered with CUDA. I think you are playing around with the image processing demo by Simon. Try calling pboUnregister explicitly to be safe.
What format is your framebuffer? You do a glReadPixels on the GL_RED plane. If the framebuffer is GL_RGBA8, the driver might do the conversion on the CPU.
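To illustrate what such a conversion involves, here is a small C++ sketch that pulls the red byte out of every 4-byte pixel. It assumes interleaved R, G, B, A byte order in the source data; the actual internal framebuffer layout, and whether the driver really does this work on the CPU, are not known from this thread. `extract_red` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copy the first byte (assumed to be red) of each 4-byte RGBA pixel
// into a tightly packed single-channel buffer.
std::vector<std::uint8_t> extract_red(const std::vector<std::uint8_t>& rgba) {
    assert(rgba.size() % 4 == 0);
    std::vector<std::uint8_t> red(rgba.size() / 4);
    for (std::size_t i = 0; i < red.size(); ++i) {
        red[i] = rgba[4 * i];
    }
    return red;
}
```

If a conversion like this turns out to be the bottleneck, one thing to try is reading back in a format that matches the framebuffer and picking out the red channel later, in the processing step.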