I have been trying to gain a deeper understanding of how to use texture memory in CUDA, but the more I look at the documentation and into the header files, the more confused I get (getting it to work somehow is not the problem).
(1) Although there are some “low level” C API functions, it seems impossible to avoid the C++ API entirely. Even the example in the Programming Guide (which also contains some errors…), despite claiming to use the C API, first instantiates a texture<> template and then retrieves a textureReference* through some obscure function, looking it up by the texture variable’s name.
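For reference, here is the pattern I mean, reconstructed from memory as a sketch (the texture name and types are just placeholders, and error checking is omitted):

```cuda
#include <cuda_runtime.h>

// Even the "C API" example needs this C++ template at file scope.
texture<float, 2, cudaReadModeElementType> texRef;

void setup()
{
    // The obscure part: the textureReference* is looked up
    // by the *string name* of the texture variable.
    const textureReference* texRefPtr;
    cudaGetTextureReference(&texRefPtr, "texRef");

    // The reference can then be configured and bound through the
    // C-style calls, but the texture<> object was still required.
}
```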
(2) Another post in this forum indicated that it is necessary to put a texture<> object in the global namespace to make it accessible from a kernel; nvcc apparently does some magic behind the scenes (which might also be related to the name lookup mentioned in (1)). How does this work in a multi-GPU context? Assume I have a kernel that uses texture memory and should be executed on several GPUs. Since each (CUDA) host thread needs its own texture<>, I would also need different kernels, each of which references a different texture<>.
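To make the problem concrete, my understanding is that the kernel has to look roughly like this (a sketch; the names are placeholders):

```cuda
#include <cuda_runtime.h>

// nvcc seems to require the texture reference at file/global scope;
// the kernel names it directly rather than receiving it as a parameter.
texture<float, 1, cudaReadModeElementType> texRef;

__global__ void scaleKernel(float* out, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = factor * tex1Dfetch(texRef, i);  // hard-wired to texRef
}
```

With several GPUs (one host thread per device), it looks as if each thread would need its own such texRef, and therefore its own otherwise identical copy of the kernel.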
(3) The usage of struct cudaChannelFormatDesc looks completely crazy to me. struct textureReference (a base of texture<>) already has a cudaChannelFormatDesc member, yet in all the examples another, temporary object is created. Moreover, most parameters (texture size, pitch) are configured neither through this temporary nor through the textureReference base of texture<>, but are passed separately to the bind function (which may in turn modify the textureReference’s members).
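For clarity, this is the pattern I keep seeing in the SDK samples (again a sketch from memory, with error checking omitted):

```cuda
#include <cuda_runtime.h>

texture<float, 2, cudaReadModeElementType> texRef;  // already has a channelDesc member

void bindExample(const float* hostData, int width, int height)
{
    // A *second*, temporary descriptor is created here...
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();

    cudaArray* arr;
    cudaMallocArray(&arr, &desc, width, height);
    cudaMemcpyToArray(arr, 0, 0, hostData,
                      width * height * sizeof(float),
                      cudaMemcpyHostToDevice);

    // ...and passed again to the bind call, even though texRef
    // already carries a cudaChannelFormatDesc of its own.
    cudaBindTextureToArray(texRef, arr, desc);
}
```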
So, based on this, my questions are:
(A) Is it possible to use texture memory through the C API only, perhaps by some undocumented approach?
(B) Is putting a texture<> into the scope of the kernel implementation file the only possibility? How can one run a kernel on multiple GPUs without artificially creating multiple texture objects and associated, otherwise identical kernel versions at compile time?
(C) Is there any point in using the additional struct cudaChannelFormatDesc? Could one also pass a pointer to the member of the textureReference (or in this case texture<>) instead?
I heard there is already a CUDA 3.0 beta out for registered developers; will any of this improve there?