double precision problem

Hi,

__global__ void
cudaConvert16s64f(short *cuSrc, double *cuDst, int len)
{
	// 2-D grid of 1-D thread blocks
	// Global thread ID
	unsigned int idx = (blockIdx.y * gridDim.x + blockIdx.x) * blockDim.x +
		threadIdx.x;

	if (idx < len)
	{
		// Convert short to double
		cuDst[idx] = (double)(cuSrc[idx]);
	}
} // End of cudaConvert16s64f

Is there anything wrong with my code?

When I pass in a short array and run the kernel, the double array doesn't seem to contain any values. I copied the array back to the host and printed it, and the values are all 0.
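For reference, a minimal sketch of the host side of that test, with every CUDA call checked so that a failed allocation, copy, or launch is not mistaken for an all-zero result. The helper name convertOnDevice, the block size of 256, and the 1-D grid are assumptions for illustration and are not taken from the code above; the kernel is repeated only so the sketch is self-contained.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Same conversion as the kernel in the post, repeated for self-containment.
__global__ void cudaConvert16s64f(const short *cuSrc, double *cuDst, int len)
{
    unsigned int idx = (blockIdx.y * gridDim.x + blockIdx.x) * blockDim.x + threadIdx.x;
    if (idx < (unsigned int)len)
        cuDst[idx] = (double)cuSrc[idx];
}

// Hypothetical helper (not from the original post): converts `len` host
// shorts to doubles on the device and returns the first CUDA error seen.
cudaError_t convertOnDevice(const short *hSrc, double *hDst, int len)
{
    if (len <= 0) return cudaErrorInvalidValue;

    short *dSrc = NULL;
    double *dDst = NULL;
    cudaError_t err;

    err = cudaMalloc((void **)&dSrc, len * sizeof(short));
    if (err != cudaSuccess) return err;
    err = cudaMalloc((void **)&dDst, len * sizeof(double));
    if (err != cudaSuccess) { cudaFree(dSrc); return err; }

    err = cudaMemcpy(dSrc, hSrc, len * sizeof(short), cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int threads = 256;                          // assumed block size
        int blocks = (len + threads - 1) / threads; // 1-D grid, fine for small arrays
        cudaConvert16s64f<<<blocks, threads>>>(dSrc, dDst, len);
        err = cudaGetLastError();                   // reports launch failures
    }
    if (err == cudaSuccess)                         // blocking copy, also surfaces kernel errors
        err = cudaMemcpy(hDst, dDst, len * sizeof(double), cudaMemcpyDeviceToHost);

    cudaFree(dSrc);
    cudaFree(dDst);
    return err;
}
```

If the returned value is not cudaSuccess, printing cudaGetErrorString(err) should show which step failed before any output values are inspected.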

Did you compile with -arch sm_13? Without it, nvcc targets an architecture that has no double support and demotes double to float.
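To rule out the hardware side as well, here is a rough sketch of a run-time check. It assumes that compute capability 1.3 (the target that sm_13 names) is the minimum for double precision; the helper names capabilitySupportsDouble and deviceSupportsDouble are made up for this example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: true if a (major, minor) compute capability is at
// least 1.3, the assumed minimum for double precision.
bool capabilitySupportsDouble(int major, int minor)
{
    return major > 1 || (major == 1 && minor >= 3);
}

// Hypothetical helper: queries the given device and applies the check above.
// Returns false if the device properties cannot be read.
bool deviceSupportsDouble(int device)
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess)
        return false;
    return capabilitySupportsDouble(prop.major, prop.minor);
}
```

Calling deviceSupportsDouble(0) before launching the kernel tells you whether the first GPU can run double-precision code at all; if it returns false, the compiler flag alone will not help.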