Understanding CUDA texture 2D linear interpolation

I am trying to use CUDA textures for 2D interpolation, but I am having trouble with tex2D() function.

I am initializing my cuda texture object with this code:

  // Allocate CUDA array in device memory
cudaChannelFormatDesc channelDesc =
cudaCreateChannelDesc(32, 0, 0, 0, cudaChannelFormatKindFloat);
cudaArray_t cuArray;
cudaMallocArray(&cuArray, &channelDesc, width, height);
// Set pitch of the source (the width in memory in bytes o
const size_t spitch = width*sizeof(float);
// Copy data located at address h_data in host memory to device memory
cudaMemcpy2DToArray(cuArray, 0, 0, h_data, spitch, width * sizeof(float),
height, cudaMemcpyHostToDevice);
// Specify texture
struct cudaResourceDesc resDesc;
memset(&resDesc, 0, sizeof(resDesc));
resDesc.resType = cudaResourceTypeArray;
resDesc.res.array.array = cuArray;
// Specify texture object parameters
struct cudaTextureDesc texDesc;
memset(&texDesc, 0, sizeof(texDesc));
texDesc.addressMode[0] = cudaAddressModeWrap;
texDesc.addressMode[1] = cudaAddressModeWrap;
texDesc.filterMode = cudaFilterModeLinear;
texDesc.readMode = cudaReadModeElementType;
texDesc.normalizedCoords = 1;
// Create texture object
texObj = 0;
cudaCreateTextureObject(&texObj, &resDesc, &texDesc, NULL);

where h_data is a 1D array that has the information of the mathematical function I want to interpolate.

I tried to interpolate f(x,y)=x in a rectangle [0,10]x[0,20], with a grid of 32 x 128 points. Since I use normalized coordinates in the texture, the transformation of coordinates should be
[0,10]x[0,20] → [0,31]x[0,127] → [0,1]x[0,1]

If the grid of points is defined by

for(int i=0; i<32; i++){
   for(int j=0;  j<128; j++){

Then, to use texture interpolation in the interior points of the grid, one should do

__global__ void testTexture(cudaTextureObject_t tex, float* d_output){
   for(int i=1; i<31; i++){
      for(int j=1;  j<127; j++){
        float x=i/31.0;
        float y=j/127.0;

But doing this I get a systematic error of 0.0155 in every single point. What is happening?

I perused this very quickly and therefore may be off the mark, but I think you need x=((i+0.5f)/31.0f) and y=((j+0.5f)/127.0f) here so you hit the center of each texel.

Now, with that change, the error at each point increased to ~0.026. Why, if I want to evaluate the texture at the original points to check if the interpolation is indeed linear, I have to add 0.5f to the coordinates? I saw many posts with those 0.5f and I can’t understand why? Wouldn’t it become a problem when you are close to the upper limit of the coordinates?

EDIT:_ after printing out the solution on Matlab, I’m seeing that the error is not constant. It is constant in the y-direction, but increasing in the x-direction. near x=0 the error is equal to 0.322581 and near x=10 the error is 0.015122.
Evaluating a linear function on the same points of the Look-up table with linear interpolating method should give a linear solution, but instead we get this.

I may have bugs in my code, or I’m not understanding right how the textures work, but that’s why I posted this, because it doesn’t make any sense for me.

If the error increased, you would want to ignore what I said and do you win analysis. I do not have time to look at this in detail and have typically used unnormalized coordinates in my work. In that case one definitely needs to add 0.5 to the coordinates to hit the center of each pixel. As to why, consider the case of one texel covering [0, 1). The extreme values would equate to the edges of the pixel.

Generally, with texture interpolation, you will observe quantization error, as the interpolation uses 1.8 fixed-point arithmetic (the details are described in an appendix of the CUDA Programming Guide), meaning there will be 0.4% error on average. This is a classical tradeoff between performance and accuracy: The GPU’s built-in interpolation is fast, but provides low accuracy. For higher accuracy interpolation you could implement the 2D interpolation formula using fmaf() as much as possible.

This is in fact what applications do that require accurate interpolation, and often the performance impact is relatively small since FMA operations are “too cheap to meter” as application performance is frequently dominated by data movement these days.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.