I am looking for a way to process a texture in CUDA and then display it directly on the screen, without transferring the image data back through host memory.
In the Sobel filter example provided with CUDA, the following is done to process a texture and display it:
cutilSafeCall(cudaGraphicsMapResources(1, &cuda_pbo_resource, 0));
size_t num_bytes;
cutilSafeCall(cudaGraphicsResourceGetMappedPointer((void **)&data, &num_bytes,
                                                   cuda_pbo_resource));
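// process texture and write result to data variable
// unmap the resource so that OpenGL can read from the PBO
cutilSafeCall(cudaGraphicsUnmapResources(1, &cuda_pbo_resource, 0));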
glBindTexture(GL_TEXTURE_2D, texid);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo_buffer);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, imWidth, imHeight,
                GL_LUMINANCE, GL_FLOAT, OFFSET(0));
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
// display texture to screen by enabling GL_TEXTURE_2D, using GL_QUADS
Correct me if I’m wrong, but I believe that the memory pointed to by the ‘data’ variable is in device memory, since the kernels write to it directly and the pointer is obtained from cudaGraphicsResourceGetMappedPointer.
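To make the question concrete, here is a rough sketch of the per-frame flow as I understand it, using the plain runtime API instead of the cutil wrappers. The kernel fillKernel and the helpers registerPbo and renderFrame are placeholder names of my own, not from the sample, and I am assuming GLEW is the GL loader and that a GL context already exists. Error checking is left out to keep it short.

```cuda
#include <GL/glew.h>           // assumption: GLEW is initialized by the host application
#include <cuda_runtime.h>
#include <cuda_gl_interop.h>

// Placeholder kernel standing in for the Sobel filter: writes a horizontal ramp.
__global__ void fillKernel(float *out, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height)
        out[y * width + x] = (float)x / (float)width;
}

// One-time setup: allocate the PBO in OpenGL and register it with CUDA.
cudaGraphicsResource *registerPbo(GLuint *pbo, int width, int height)
{
    glGenBuffers(1, pbo);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, *pbo);
    glBufferData(GL_PIXEL_UNPACK_BUFFER,
                 (size_t)width * height * sizeof(float), NULL, GL_DYNAMIC_DRAW);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);

    cudaGraphicsResource *res = NULL;
    // WriteDiscard: CUDA overwrites the whole buffer and never reads old contents.
    cudaGraphicsGLRegisterBuffer(&res, *pbo, cudaGraphicsRegisterFlagsWriteDiscard);
    return res;
}

// Per frame: map the PBO, write it from a kernel, unmap, then copy PBO -> texture.
void renderFrame(cudaGraphicsResource *res, GLuint pbo, GLuint tex,
                 int width, int height)
{
    float *data = NULL;
    size_t numBytes = 0;
    cudaGraphicsMapResources(1, &res, 0);
    cudaGraphicsResourceGetMappedPointer((void **)&data, &numBytes, res);

    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);
    fillKernel<<<grid, block>>>(data, width, height);

    // The resource must be unmapped before OpenGL touches the buffer again.
    cudaGraphicsUnmapResources(1, &res, 0);

    glBindTexture(GL_TEXTURE_2D, tex);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    // With a PBO bound, the last argument is a byte offset into the buffer.
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                    GL_LUMINANCE, GL_FLOAT, (const void *)0);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
}
```

In this sketch nothing is copied with cudaMemcpy to or from a host pointer, which is why I suspect the data stays on the device the whole time.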
So I think this does not go through host memory, but I’m not sure. Is this the best (i.e. quickest) way to process a texture and display it directly without going through host memory? If not, how should I do this?
Also, could someone please explain why pbo_buffer is used? Is it necessary?
Thanks a lot!