double precision problem


__global__ void
cudaConvert16s64f(short *cuSrc, double *cuDst, int len)
{
	// Using a 2-D grid of 1-D thread blocks
	// Global thread ID
	unsigned int idx = (blockIdx.y * gridDim.x + blockIdx.x) * blockDim.x + threadIdx.x;

	if (idx < len)
	{
		// Convert short to double
		cuDst[idx] = (double)(cuSrc[idx]);
	}
} // End of cudaConvert16s64f

Is there anything wrong with my code?

When I pass in a short array and run the kernel, the double array doesn't seem to contain any values. I copied the array back to the host and printed it, and every value is 0.

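Did you compile with -arch sm_13? Double precision needs compute capability 1.3 or higher. Without that flag, nvcc from toolkits of this era targets an older architecture by default and demotes double to float, so the kernel does not write the 8-byte values the host expects.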
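To narrow this down, it may help to run the kernel from a small host program that checks every CUDA call and compares the result on the host. The following is only a sketch under stated assumptions: `check` and `runConversionCheck` are hypothetical helpers that are not part of the original post, the file name in the build comment is made up, and `cudaDeviceSynchronize` assumes a toolkit that provides it (older ones use `cudaThreadSynchronize`).

```cuda
// Build with the flag suggested above (file name is hypothetical):
//   nvcc -arch sm_13 convert.cu -o convert
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__global__ void cudaConvert16s64f(const short *cuSrc, double *cuDst, int len)
{
    // 2-D grid of 1-D blocks, same indexing as the kernel in the question
    unsigned int idx = (blockIdx.y * gridDim.x + blockIdx.x) * blockDim.x + threadIdx.x;
    if (idx < (unsigned int)len)
        cuDst[idx] = (double)cuSrc[idx];
}

// Hypothetical helper: print the CUDA error string and abort on failure.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Hypothetical self-test: converts len shorts on the device and
// returns the number of elements that differ from the host conversion.
int runConversionCheck(int len)
{
    short  *hSrc = (short *)malloc(len * sizeof(short));
    double *hDst = (double *)malloc(len * sizeof(double));
    for (int i = 0; i < len; ++i)
        hSrc[i] = (short)(i % 2000 - 1000);   // mix of negative and positive values

    short  *dSrc = NULL;
    double *dDst = NULL;
    check(cudaMalloc((void **)&dSrc, len * sizeof(short)),  "cudaMalloc src");
    check(cudaMalloc((void **)&dDst, len * sizeof(double)), "cudaMalloc dst");
    check(cudaMemcpy(dSrc, hSrc, len * sizeof(short), cudaMemcpyHostToDevice), "copy to device");

    const int threads = 256;
    const int blocks  = (len + threads - 1) / threads;  // assumes blocks fits in grid.x
    dim3 grid(blocks, 1);
    cudaConvert16s64f<<<grid, threads>>>(dSrc, dDst, len);
    check(cudaGetLastError(),      "kernel launch");     // bad launch configuration
    check(cudaDeviceSynchronize(), "kernel execution");  // errors raised while running

    check(cudaMemcpy(hDst, dDst, len * sizeof(double), cudaMemcpyDeviceToHost), "copy to host");

    int mismatches = 0;
    for (int i = 0; i < len; ++i)
        if (hDst[i] != (double)hSrc[i])
            ++mismatches;

    cudaFree(dSrc);
    cudaFree(dDst);
    free(hSrc);
    free(hDst);
    return mismatches;
}

int main()
{
    int mismatches = runConversionCheck(1000);
    printf("mismatched elements: %d\n", mismatches);
    return mismatches != 0;
}
```

If every call succeeds but the mismatch count is non-zero, the build flags are the first thing to check. If one of the calls reports an error, the message should point at the failing step instead.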