Warped texturing with CUDA and OpenGL


I am doing some image warping by grid deformation. After I compute my new warp grid, I can display the warped image taking advantage of OpenGL’s Texture2D coordinates as follows:

glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, im_w, im_h, GL_RGBA, GL_UNSIGNED_BYTE, BUFFER_DATA(0));
glBegin(GL_QUADS);
for each square in the grid:
    glTexCoord2f(texture_corner1); glVertex2f(square_corner1);
    glTexCoord2f(texture_corner2); glVertex2f(square_corner2);
    glTexCoord2f(texture_corner3); glVertex2f(square_corner3);
    glTexCoord2f(texture_corner4); glVertex2f(square_corner4);
glEnd();

where texture_corner = the coordinates of the original (undeformed) grid positions,
and square_corner = the coordinates of the warped grid positions.

This draws a warped image just fine.

Now, I would like to use this warped image in CUDA. I think I have several options. One is to compute the warping in OpenGL and then copy the buffer from OpenGL to CUDA, as in the postProcessGL example. What's the speed hit for this kind of data transfer?
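For reference, a hedged sketch of what that OpenGL-to-CUDA handoff looks like with the graphics-interop API, assuming the warped image has already been read back into a pixel buffer object `pbo` (e.g. via glReadPixels, as the postProcessGL sample does). The function name `process_warped_image` is made up for illustration; the cudaGraphics* calls are the real runtime API.

```cuda
// Sketch: register an OpenGL pixel buffer object (PBO) with CUDA so a
// kernel can read the warped RGBA image without a round trip through
// host memory. Assumes `pbo` already contains im_w x im_h RGBA pixels.
#include <cuda_gl_interop.h>

void process_warped_image(GLuint pbo, int im_w, int im_h)
{
    cudaGraphicsResource_t res;

    // One-time registration of the GL buffer with CUDA.
    cudaGraphicsGLRegisterBuffer(&res, pbo, cudaGraphicsRegisterFlagsReadOnly);

    // Per frame: map the buffer and get a device pointer -- no host copy,
    // but mapping still synchronizes the two APIs, which is the cost to measure.
    cudaGraphicsMapResources(1, &res, 0);
    uchar4 *d_pixels = 0;
    size_t  n_bytes  = 0;
    cudaGraphicsResourceGetMappedPointer((void **)&d_pixels, &n_bytes, res);

    // ... launch a kernel on d_pixels (im_w * im_h pixels) here ...

    cudaGraphicsUnmapResources(1, &res, 0);
    cudaGraphicsUnregisterResource(res);
}
```

The "copy" here is really a map/unmap plus whatever readback filled the PBO; measuring that readback is what decides whether this path is fast enough.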

That seems wasteful, though, because I should be able to compute the warped texture directly in CUDA. But it is not clear to me how this would be done: does CUDA support anything like hardware texturing for this? (Note: this is not just bilinear interpolation; I know CUDA textures support that.)

If I were to compute the math for each pixel without hardware texturing, I think it would be very expensive: for each destination pixel I would have to find which warped square it came from, which means a lot of point-in-quad tests. Is there a better way to do this?


Copying image data from OpenGL to CUDA is still relatively expensive, since it requires memory copies, but I would measure it to see whether it's fast enough for your application.

CUDA doesn't directly support rasterization, but instead of rendering a deformed polygon mesh, couldn't you just render a rectangle and apply the inverse transformation to the texture coordinates per pixel? Take a look at the "simpleTexture" sample in the SDK.