cudaFilterModeLinear my fault or bug?


This is my scaleDown kernel. It is supposed to reduce an image to width/2 and height/2 using linear interpolation. The image data is stored in the x, y, z components of a float4 CUDA array.


/*
 * Reduces an image of size a x b to a/2 x b/2 using the texture unit's
 * 9-bit fixed-point linear interpolation.
 *
 * The kernel expects a texture reference tex_image to be bound to the input image.
 *
 * width  - width of the resulting image
 * height - height of the resulting image
 * pitch  - pitch of the 2D global memory array from cudaMalloc2D
 */


__global__ void ScaleDownKernel(unsigned int width, unsigned int height, float4 *data, unsigned int pitch)
{
    int px = threadIdx.x + blockDim.x * blockIdx.x;
    int py = threadIdx.y + blockDim.y * blockIdx.y;

    if (px < width && py < height) { // check that (px, py) is within the output bounds
        float4 *row = (float4 *)((char *)data + py * pitch);
        row[px] = tex2D(tex_image, 2.0f * px + 1.0f, 2.0f * py + 1.0f);
        //row[px] = tex2D(tex_image, 2.0f * px + 0.5f, 2.0f * py + 0.5f);
    }
}


The problem is that I only get the correct interpolation with +1.0f. I would have expected that to happen with +0.5f added to the coordinates, since +1.0f should not step in between pixels. It seems as if the coordinates are shifted by 0.5 in cudaFilterModeLinear.

I have tried this out with cudaFilterModePoint and was able to access the correct coordinates.

Any ideas?



We have some code that uses interpolation in both CUDA and OpenGL, and to get the same results from both we have to bias the OpenGL coordinates by (0.5, 0.5).

It might be that CUDA texture coordinates purposefully point to the middle of the pixel instead of the corner. You say you get correct interpolation if you add 1, what happens if you add nothing?

We’ll see about clearing this up in the documentation. What follows is a description of the CUDA texture behavior.

The “top left” corner of the texture has coordinates (0,0). Each texel's “corners” therefore have integer coordinates, while the middle of a texel is offset by 0.5 from its top-left corner. So by providing, say, (0.5f, 0.5f) as coordinates, one samples dead in the middle of a texel, and the returned value has no contribution from the surrounding texels. By providing (1.0f, 1.0f), one samples the “intersection” between 4 texels and gets back the mean of those texels.


With nothing added the reduced image is interpolated, but the whole thing is shifted 1 px to the right.

Yes, this would be nice. Especially for “general-purpose” programmers like me, who do not know much about graphics programming. And general-purpose computation is what CUDA is all about, right? :-)

Now the behavior becomes clear to me.

Thank you for the explanation.

regards to both of you