I’m wondering whether techniques like render-to-texture are available in CUDA?
My program implements a stereo matching algorithm, and I need intermediate results stored in a texture (or 2D CUDA array) to take full advantage of cached texture memory reads. Since kernels cannot write to a 2D CUDA array, this seems like a dead end?
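The kind of "ping-pong" workaround I have in mind looks like the sketch below (untested; the kernel name, the placeholder computation, and the helper `iterate` are mine, and it uses the texture-reference API): each pass reads the previous result through the texture cache, writes to plain linear memory, and then the result is copied device-to-device back into the `cudaArray` for the next pass.

```cuda
#include <cuda_runtime.h>

// Texture reference bound to the cudaArray holding the previous pass's result.
texture<float, 2, cudaReadModeElementType> texIntermediate;

__global__ void stepKernel(float *out, int w, int h)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;
    // Read the previous iteration's value through the texture cache...
    float prev = tex2D(texIntermediate, x + 0.5f, y + 0.5f);
    // ...and write the new value to ordinary linear memory.
    out[y * w + x] = prev * 0.5f;  // placeholder computation
}

void iterate(cudaArray *arr, float *linear, int w, int h, int iters)
{
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaBindTextureToArray(texIntermediate, arr, &desc);

    dim3 block(16, 16), grid((w + 15) / 16, (h + 15) / 16);
    for (int i = 0; i < iters; ++i) {
        stepKernel<<<grid, block>>>(linear, w, h);
        // Copy the result back into the array (device-to-device) so the
        // next pass again reads it through the texture cache.
        cudaMemcpy2DToArray(arr, 0, 0, linear, w * sizeof(float),
                            w * sizeof(float), h, cudaMemcpyDeviceToDevice);
    }
}
```

The per-iteration `cudaMemcpy2DToArray` is extra traffic, which is exactly why I'd prefer something closer to true render-to-texture.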
It’s possible to split the program into an OpenGL part and a CUDA part, so that the render-to-texture step is done in OpenGL. But to my understanding, when you map a framebuffer object from OpenGL into CUDA, you use the FBO just like linear memory created with cudaMalloc, and cached texture memory reads are not utilized.
My understanding may well be flawed, so please correct me if I’m wrong. Any advice on how to store intermediate results in texture memory or a 2D CUDA array is appreciated!