Reading OpenGL texture data from CUDA

I’m building a CUDA-rendered UI. One feature of the UI library will be rendering an arbitrary number of images. All such images are managed by the UI code as OpenGL textures.

Since the number of images is not known at compile time, I have to use CUDA texture objects rather than the older compile-time texture references. My question is: what is the best way of mapping bound OpenGL textures to CUDA texture objects for reading? I can’t find any good examples of this. Preferably I would like to read the texture data without doing any copying.

This presentation walks through all the steps.

It shows how to draw from CUDA into a buffer that is mapped to/from an OpenGL texture.

It’s a bit dated, so it uses some deprecated APIs, but the concepts are all there.
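The flow from the presentation can be sketched roughly like this (a sketch only: `glTexId` and `stream` are placeholder names, and error checking is omitted). The important point is that OpenGL must not touch the texture while the resource is mapped, so map/unmap typically brackets the CUDA work each frame:

```cuda
// One-time setup: register the GL texture with CUDA
// (the texture must already have storage, e.g. via glTexImage2D).
cudaGraphicsResource *resource = nullptr;
cudaGraphicsGLRegisterImage(&resource, glTexId, GL_TEXTURE_2D,
                            cudaGraphicsRegisterFlagsReadOnly);

// Per frame: map while CUDA uses the texture, unmap before GL touches it.
cudaGraphicsMapResources(1, &resource, stream);
cudaArray_t array = nullptr;
cudaGraphicsSubResourceGetMappedArray(&array, resource, 0, 0);
// ... build a cudaTextureObject_t over 'array' and run kernels ...
cudaGraphicsUnmapResources(1, &resource, stream);

// Teardown.
cudaGraphicsUnregisterResource(resource);
```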

I’ve inserted some sample code below that shows some of the steps:

/** GL handle on texture**/
GLuint m_texId;
/** Interop resource handle**/
cudaGraphicsResource *m_cudaGraphicsResource;
/** CUDA array that the texture is mapped to **/
cudaArray            *m_cudaArray;
/** reference to texture to read data through */
cudaTextureObject_t m_texture;
/** reference to surface to write data to*/
cudaSurfaceObject_t m_surface;
glGenTextures(1, &m_texId);
glBindTexture(GL_TEXTURE_2D, m_texId);
// register (texture) to CUDA Resource
// (the texture must already have storage, e.g. from glTexImage2D)
cudaGraphicsGLRegisterImage(&m_cudaGraphicsResource,
	m_texId, GL_TEXTURE_2D, cudaGraphicsRegisterFlagsSurfaceLoadStore);
// MAP
cudaGraphicsMapResources(1, &m_cudaGraphicsResource, m_stream.get());
cudaGraphicsSubResourceGetMappedArray(&m_cudaArray, m_cudaGraphicsResource, 0, 0);

// Do something with cuda texture object and surface object...

You can also have a look at the Mandelbrot sample that comes with the CUDA SDK. It uses the GPU to calculate a fractal image that is rendered to an OpenGL texture.

Thanks for the example code! It is not obvious to me how the m_texture fits into all this though. Just to be clear, I’m only trying to read data from an existing GL texture, not write to it.

As you can see from the comments, a texture can only be read from; you need a surface to write to a GL texture. So if you only need read access, ignore the code dealing with m_surface.

So the m_texture will be used to fetch elements from the texture.
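To illustrate the difference between the two access paths (a sketch; kernel names, `w`/`h`, and the uchar4 element type are placeholders): tex2D reads through a texture object, surf2Dwrite writes through a surface object, and surf2Dwrite addresses x in bytes rather than elements:

```cuda
__global__ void read_kernel(cudaTextureObject_t tex, int w, int h)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;
    uchar4 texel = tex2D<uchar4>(tex, x, y);   // read-only access
    // ... use texel ...
}

__global__ void write_kernel(cudaSurfaceObject_t surf, int w, int h)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;
    uchar4 texel = make_uchar4(255, 0, 0, 255);
    surf2Dwrite(texel, surf, x * sizeof(uchar4), y);  // x is in bytes
}
```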

Some additional stub code:

// Creating the texture object
/** Descriptor */			
cudaResourceDesc m_resDesc;

memset(&m_resDesc, 0, sizeof(m_resDesc));
m_resDesc.resType = cudaResourceTypeArray;
// Looks like it's that cuda array thingy again
m_resDesc.res.array.array = m_cudaArray;
cudaTextureDesc texDesc;
memset(&texDesc, 0, sizeof(texDesc));

// These need to be set according to your needs:
texDesc.addressMode[0]   = cudaAddressModeClamp;
texDesc.addressMode[1]   = cudaAddressModeClamp;
texDesc.filterMode       = cudaFilterModePoint;
texDesc.readMode         = cudaReadModeElementType;
texDesc.normalizedCoords = false;
// Create texture object
cudaCreateTextureObject(&m_texture, &m_resDesc, &texDesc, NULL);

So after setting up the texture object like above, you should be able to use it (after the mapping stage!):

template<class T> // some data type....
__global__ void kernel(cudaTextureObject_t m_texture)
{
	int x = threadIdx.x + blockIdx.x * blockDim.x;
	int y = threadIdx.y + blockIdx.y * blockDim.y;
	T val = tex2D<T>(m_texture, x, y);
	printf("\n image value = %d", (int)val);
}
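Launching the templated kernel above might look like this on the host (a sketch; `width`/`height` are placeholders, and I’m assuming a single-channel texture so that instantiating with `unsigned char` makes the `(int)` cast in the printf valid):

```cuda
dim3 block(16, 16);
dim3 grid((width  + block.x - 1) / block.x,
          (height + block.y - 1) / block.y);
// grid may overshoot the texture; add a bounds check in the kernel
// if width/height are not multiples of the block size
kernel<unsigned char><<<grid, block>>>(m_texture);
cudaDeviceSynchronize();  // make sure the printf output is flushed
```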

DISCLAIMER: These are all tiny code snippets from my un-released open source project, I make no guarantees whatsoever :)

Thanks Jimmy, got it to work now.

I don’t get why a cudaArray needs to be involved just to read a GL texture, but then again, I have long since given up on trying to form a reasonable mental model of how the CUDA/OpenGL interop types relate to each other.

Posting the working code below, for posterity.

// static scope (host)
cudaGraphicsResource *m_cudaGraphicsResource;
cudaArray *m_cudaArray;
cudaTextureObject_t m_texture;

// run once (host)

// register first (m_texId is the GL texture handle, as in the
// earlier snippet; read-only access is enough here)
cudaGraphicsGLRegisterImage(&m_cudaGraphicsResource, m_texId,
	GL_TEXTURE_2D, cudaGraphicsRegisterFlagsReadOnly);
cudaGraphicsMapResources(1, &m_cudaGraphicsResource, 0);
cudaGraphicsSubResourceGetMappedArray(&m_cudaArray, m_cudaGraphicsResource, 0, 0);

cudaResourceDesc resDesc;
memset(&resDesc, 0, sizeof(resDesc));
resDesc.resType = cudaResourceTypeArray;
resDesc.res.array.array = m_cudaArray;

cudaTextureDesc texDesc;
memset(&texDesc, 0, sizeof(texDesc));
texDesc.addressMode[0]   = cudaAddressModeClamp;
texDesc.addressMode[1]   = cudaAddressModeClamp;
texDesc.filterMode       = cudaFilterModePoint;
texDesc.readMode         = cudaReadModeElementType;
texDesc.normalizedCoords = false;

cudaCreateTextureObject(&m_texture, &resDesc, &texDesc, nullptr);

// kernel (device)
uint render_at(
	// .. some args omitted
	const cudaTextureObject_t texture
) {
	// ... some code omitted ...
	const uchar4 c = tex2D<uchar4>(texture, x, y);
	return 0xFF000000u | (uint(c.x) << 16) | (uint(c.y) << 8) | uint(c.z);
}

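For completeness, the matching teardown (a sketch): the texture object is only valid while the resource is mapped, so destroy it and unmap before OpenGL uses the texture again, and unregister at shutdown:

```cuda
// when CUDA is done reading (before GL touches the texture again):
cudaDestroyTextureObject(m_texture);
cudaGraphicsUnmapResources(1, &m_cudaGraphicsResource, 0);

// at shutdown:
cudaGraphicsUnregisterResource(m_cudaGraphicsResource);
```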
It’s been a long time, but as I recall, the way a cudaArray is arranged in physical memory (optimized for 2D spatial locality) corresponds exactly to how OpenGL storage is arranged.
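That layout is also why, if you ever just need the raw bytes rather than filtered fetches, you can copy straight out of the mapped cudaArray without a texture object at all. A sketch, assuming a GL_RGBA8 texture of `width` x `height` (placeholder names) and `#include <vector>`:

```cuda
// copy the mapped array into ordinary linear host memory
std::vector<uchar4> pixels(width * height);
cudaMemcpy2DFromArray(pixels.data(),            // dst
                      width * sizeof(uchar4),   // dst pitch in bytes
                      m_cudaArray,              // src array
                      0, 0,                     // src x/y offset
                      width * sizeof(uchar4),   // width in bytes
                      height,
                      cudaMemcpyDeviceToHost);
```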

Alright, makes sense