Problem with downsampling after rotation in texture memory

I want to rotate an image in Fourier space. To get a more precise result, I upsample by a factor of 2 in real space and then apply fft2, which gives a larger FFT image.
I then load this larger FFT image into texture memory and rotate it. Since I want the FFT image at its original size, I downsample afterwards in the texture fetch. But the results are not right. If the rotation angle is 0, pi/2, pi, …, the output is correct, but for other angles it is wrong: in every wrong output, the center of the FFT image is in the wrong place.
My code looks like this:

texture<float2, 2, cudaReadModeElementType> tex;

__global__ void
Rotation_CropKernel(float2* g_odata, int width, int height, float Rotangle)
{
    // calculate normalized texture coordinates
    unsigned int x = blockIdx.x*blockDim.x + threadIdx.x;
    unsigned int y = blockIdx.y*blockDim.y + threadIdx.y;

    float u = x / (float) width;    // normalization
    float v = y / (float) height;

    // transform coordinates
    u -= 0.5f;
    v -= 0.5f;

    float theta = Rotangle;

    float tu =  u*cosf(theta) + v*sinf(theta);    // rotate counter clockwise
    float tv = -u*sinf(theta) + v*cosf(theta);

    tu = 2 * (tu + 0.5f);    // downsampling
    tv = 2 * (tv + 0.5f);

    // read from texture and write to global memory
    if ((y < width/2) && (x < height/2))
        g_odata[(y*width/2) + x] = tex2D(tex, tu, tv);
}

g_odata is the output buffer in device memory; its size is width/2 * height/2.
I have attached two results. The original image size is 64*64.

I am a beginner in CUDA programming and do not know how to solve this problem; maybe it comes from a silly mistake. Could anyone help me? Thank you very much.
pi_3.png
pi_2.png
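
A possible cause, sketched under assumptions: the kernel normalizes x and y by the texture size (width, height), while the output grid is only width/2 by height/2. So u and v span [0, 0.5), and subtracting 0.5 puts the rotation pivot at a corner of the output rather than at its center. That would shift the center of the FFT for general angles; if the texture uses wrap addressing, the result would still look correct for multiples of pi/2. The bounds check also pairs y with width and x with height, which only matters for non-square images. The sketch below normalizes by the output size instead, so the later factor of 2 is no longer needed. It assumes normalized texture coordinates, a spectrum whose center sits at index width/2, and the same legacy texture reference as in the post (texture references were removed in CUDA 12). HD, TexCoord and rotatedSampleCoord are hypothetical helper names, not part of CUDA.

```cpp
#include <math.h>

// Lets the helper compile both under nvcc and with a plain C++ compiler.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

struct TexCoord { float u, v; };   // hypothetical helper type

// Map output pixel (x, y) of an outW x outH grid to normalized texture
// coordinates, rotating about the center of the OUTPUT grid.
HD inline TexCoord rotatedSampleCoord(int x, int y, int outW, int outH, float theta)
{
    // Normalize by the output size so that u and v cover [-0.5, 0.5).
    float u = x / (float)outW - 0.5f;
    float v = y / (float)outH - 0.5f;

    float c = cosf(theta);
    float s = sinf(theta);

    TexCoord t;
    t.u =  u * c + v * s + 0.5f;   // same rotation sense as the original kernel
    t.v = -u * s + v * c + 0.5f;
    return t;
}

#ifdef __CUDACC__
// Legacy texture reference as in the post; assumed to use normalized coordinates.
texture<float2, 2, cudaReadModeElementType> tex;

__global__ void Rotation_CropKernel(float2* g_odata, int width, int height, float Rotangle)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;

    int outW = width / 2;    // output is half the texture size in each dimension
    int outH = height / 2;

    if (x < outW && y < outH) {
        TexCoord t = rotatedSampleCoord(x, y, outW, outH, Rotangle);
        g_odata[y * outW + x] = tex2D(tex, t.u, t.v);
    }
}
#endif
```

With theta = 0 and point filtering this reads every second texel, the same as the original kernel. With linear filtering, adding 0.5f / width and 0.5f / height to the coordinates may also be needed so that the samples land on texel centers.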
