Problem with output of 2D Texture

Hi there,
I have a 16*16 array stored as a 1D array (in row-major order) that I am trying to map to a 2D texture, but I am getting all-zero values as output.

My code is below:

int width=16;
int height=16;
int N = width * height;
float* d_out;
cudaMalloc((void**)&d_out, sizeof(float) * N);
float* h_out = (float*)malloc(sizeof(float) * N);
float* data = (float*)malloc(N * sizeof(float));
for (int i = 0; i < N; i++)
data[i] = i;
cudaChannelFormatDesc cf= cudaCreateChannelDesc(32,0,0,0,cudaChannelFormatKindFloat);
cudaArray* cuArray;
cudaMallocArray(&cuArray, &cf,width,height);
cudaMemcpyToArray(cuArray, 0, 0, data, sizeof(float)*N, cudaMemcpyHostToDevice);
texRef.addressMode[0] = cudaAddressModeClamp;
texRef.addressMode[1] = cudaAddressModeClamp;
texRef.filterMode = cudaFilterModePoint;
texRef.normalized = false;
cudaBindTextureToArray (texRef, cuArray,cf);
dim3 dimBlock(width, height, 1);
dim3 dimGrid(width / dimBlock.x, height / dimBlock.y, 1);
readTexels<<<dimGrid, dimBlock>>>(d_out, width, height);
checkCUDAError("sync thread");
cudaMemcpy(h_out, d_out, sizeof(float)*N, cudaMemcpyDeviceToHost);
for (int i = 0; i < N; i++)
{ printf("%f ",h_out[i]); }

In the kernel file I have the following code:

texture<float, 2, cudaReadModeElementType> texRef;

__global__ void readTexels(float* d_out, int width, int height)
{
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    unsigned int j = blockIdx.y * blockDim.y + threadIdx.y;
    float posi = i / (float) width;
    float posj = j / (float) height;
    d_out[j * width + i] = tex2D(texRef, posi, posj);
}

All the output values are 0 when I use cudaFilterModePoint, and I get strange values when I use cudaFilterModeLinear.

I want the output to be exactly the values I gave as input.

What am I doing wrong here?

Please help.

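Since you have "texRef.normalized = false", the tex2D call expects unnormalized texture coordinates in texel units (0 to width and 0 to height), but you are passing normalized coordinates (0.0 - 1.0). Every lookup therefore lands inside texel (0, 0), whose value is 0, which is why point filtering returns all zeros; linear filtering blends that texel with its neighbors, which gives the strange values. Either set "texRef.normalized = true", or use "tex2D(texRef, i, j)".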
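
For reference, here is a minimal sketch of the second option (unnormalized coordinates). It is not the original poster's code: it assumes the texture object API (cudaCreateTextureObject) instead of the texRef texture reference from the post, because newer CUDA toolkits no longer support texture references, and it uses cudaMemcpy2DToArray in place of cudaMemcpyToArray. The + 0.5f offset is an addition that addresses the texel center, so the same kernel should return the stored values with either cudaFilterModePoint or cudaFilterModeLinear.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// One thread per texel, using unnormalized coordinates (texel units).
// Adding 0.5f samples the texel center, so no neighbor is blended in
// even if the filter mode is later switched to linear.
__global__ void readTexels(cudaTextureObject_t tex, float* d_out,
                           int width, int height)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    if (i < width && j < height)
        d_out[j * width + i] = tex2D<float>(tex, i + 0.5f, j + 0.5f);
}

int main()
{
    const int width = 16, height = 16, N = width * height;

    // Host input in row-major order, as in the question.
    float* data  = (float*)malloc(sizeof(float) * N);
    float* h_out = (float*)malloc(sizeof(float) * N);
    for (int i = 0; i < N; i++) data[i] = (float)i;

    // Copy the input into a 2D CUDA array of single-channel floats.
    cudaChannelFormatDesc cf = cudaCreateChannelDesc<float>();
    cudaArray_t cuArray;
    cudaMallocArray(&cuArray, &cf, width, height);
    cudaMemcpy2DToArray(cuArray, 0, 0, data, width * sizeof(float),
                        width * sizeof(float), height,
                        cudaMemcpyHostToDevice);

    // Describe the resource and the sampling settings from the question.
    cudaResourceDesc resDesc = {};
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = cuArray;

    cudaTextureDesc texDesc = {};
    texDesc.addressMode[0] = cudaAddressModeClamp;
    texDesc.addressMode[1] = cudaAddressModeClamp;
    texDesc.filterMode = cudaFilterModePoint;
    texDesc.readMode = cudaReadModeElementType;
    texDesc.normalizedCoords = 0;  // same as texRef.normalized = false

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &resDesc, &texDesc, NULL);

    float* d_out;
    cudaMalloc((void**)&d_out, sizeof(float) * N);

    dim3 dimBlock(width, height, 1);
    dim3 dimGrid(width / dimBlock.x, height / dimBlock.y, 1);
    readTexels<<<dimGrid, dimBlock>>>(tex, d_out, width, height);

    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaMemcpy(h_out, d_out, sizeof(float) * N, cudaMemcpyDeviceToHost);

    // Compare every output value with the corresponding input value.
    int mismatches = 0;
    for (int i = 0; i < N; i++)
        if (h_out[i] != data[i]) mismatches++;
    printf("mismatches: %d\n", mismatches);

    cudaDestroyTextureObject(tex);
    cudaFreeArray(cuArray);
    cudaFree(d_out);
    free(data);
    free(h_out);
    return mismatches == 0 ? 0 : 1;
}
```

To try the first option instead, set texDesc.normalizedCoords = 1 and pass (i + 0.5f) / width and (j + 0.5f) / height to tex2D.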