Texturing hardware capabilities: what's not available in CUDA that is exposed in OpenGL?

Hi,

I have seen many comments about various aspects of GPU architecture that are not exposed to a CUDA programmer but are exposed to someone using OpenGL. Not being a graphics person (at least not yet), I am left wondering what exactly is not exposed that could potentially be useful for scientific algorithms.

In particular, I am curious about what the texture hardware can do. Currently, it is possible to bind 1D, 2D, or 3D arrays to textures and read them in CUDA kernels through the texture cache. As well as a possible speedup from data reuse via the cache, this allows one to do linear interpolation on these arrays (as described in the appendix of the programming guide). Can the texturing hardware do anything else? In particular, I am interested in the case whereby my textures are composed of vectors (float2, float3, half4, etc.). Does the texture unit have the ability to perform interpolation within the vector, or to rotate some of the array components (swizzling?) before interpolating? Do the coefficients of the interpolation have to be positive, or are negative values allowed?
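
For concreteness, here is the sort of setup I mean (a minimal sketch using the texture reference API; names are placeholders and error checking is omitted):

```
#include <cuda_runtime.h>

// Bind a 1D float array to a texture and fetch it with hardware
// linear filtering enabled.
texture<float, 1, cudaReadModeElementType> tex;

__global__ void fetch(float *out, float x)
{
    // With filterMode = cudaFilterModeLinear, this returns a linear
    // blend of the two texels straddling coordinate x.
    *out = tex1D(tex, x);
}

void setup(const float *h_data, int n)
{
    cudaArray *arr;
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaMallocArray(&arr, &desc, n);
    cudaMemcpyToArray(arr, 0, 0, h_data, n * sizeof(float),
                      cudaMemcpyHostToDevice);
    tex.filterMode = cudaFilterModeLinear;  // turn on interpolation
    tex.normalized = 0;                     // unnormalized coordinates
    cudaBindTextureToArray(tex, arr, desc);
}
```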

I understand that DirectX 11 requires that textures also be writable, not just read-only. Presumably this ability will be exposed in CUDA 3.x for Fermi GPUs?

On the subject of hardware that isn’t yet exposed to CUDA, what does “rendering to the framebuffer” mean. I understand that this is not exposed in CUDA, but effectively allows one to floating point atomics, and so allows scatter type algorithms (if one writes in OpenGL). Any reason why this ability isn’t exposed to CUDA?
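
For what it's worth, the only software workaround I know of in CUDA is a compare-and-swap loop along these lines, which presumably costs far more than dedicated blending hardware would (a sketch of the well-known CAS emulation pattern):

```
// Emulate a floating-point atomic add with atomicCAS on the same
// 32-bit word, retrying until no other thread intervened.
__device__ float atomicAddFloat(float *addr, float val)
{
    int *iaddr = (int *)addr;
    int old = *iaddr, assumed;
    do {
        assumed = old;
        old = atomicCAS(iaddr, assumed,
                        __float_as_int(val + __int_as_float(assumed)));
    } while (assumed != old);
    return __int_as_float(old);
}
```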

Thanks in advance.

The texture hardware does support interpolation of float2/float4 etc. values. It can't interpolate between the components, if that's what you mean. You don't get to specify the interpolation weights directly; it just does bilinear interpolation based on the floating-point texture coordinates you specify.
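
Roughly speaking, a linearly filtered 1D fetch of a float2 texture computes the following per component (an illustrative sketch of the formula in the programming guide appendix, not the actual hardware path):

```
__device__ float2 filtered_fetch_1d(const float2 *T, float x)
{
    // The weight comes from the fractional part of the coordinate;
    // .x and .y are blended independently with the same weight.
    float xb = x - 0.5f;
    int   i  = (int)floorf(xb);
    float a  = xb - (float)i;
    float2 lo = T[i], hi = T[i + 1];
    return make_float2((1.0f - a) * lo.x + a * hi.x,
                       (1.0f - a) * lo.y + a * hi.y);
}
```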

Texture writes are already possible in OpenCL and will be exposed in CUDA soon.

I'd recommend reading the OpenGL Red Book (or the online specification) to get a better understanding of graphics features.

Thanks for the reply. I did mean interpolation between the components: in one of the problems I am using GPUs for, I have arrays of complex numbers (float2, half2, etc.). I need to average over these fields, where the interpolation weight is itself a complex number (so it would mix the .x and .y components), before doing some computation. I was wondering if I could use the texture hardware to do this for me, leaving more resources free for the rest of the computation (both flops and register space). With the current constraints, I don't see that this is possible, though writing to textures perhaps opens up some possibilities here.
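
Concretely, per sample point I need something like this (a sketch, where a and b are neighbouring field values and w is the complex weight):

```
__device__ float2 cmul(float2 a, float2 b)
{
    // Complex multiply: (a.x + i*a.y) * (b.x + i*b.y).
    return make_float2(a.x * b.x - a.y * b.y,
                       a.x * b.y + a.y * b.x);
}

__device__ float2 complex_lerp(float2 a, float2 b, float2 w)
{
    // (1 - w)*a + w*b with complex arithmetic, which mixes the
    // .x and .y components: exactly what the filtering unit can't do.
    float2 one_minus_w = make_float2(1.0f - w.x, -w.y);
    float2 wa = cmul(one_minus_w, a);
    float2 wb = cmul(w, b);
    return make_float2(wa.x + wb.x, wa.y + wb.y);
}
```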

Do the weights have to be positive?

I shall consult the Red Book as suggested.