Hello all,
I am trying to do a simple memcpy with CUDA 1.1 on a Windows XP machine with a GeForce 8600 GTS board. I am compiling under MS Visual Studio 2005, and I used the template application as the starting point for this test.
Here is the code for the host:
/* [2007.12.29 Phil Pratt-Szeliga] : CUDA */
#include <cutil.h>
#include "gpu_kernel.cu"
double tilted_angle[513] = {0.0};
int main(int argc, char *argv[])
{
    CUT_DEVICE_INIT();
    dim3 grid(1, 1, 1);
    dim3 threads(1, 1, 1);
    double *d_tilted_angle;
    CUDA_SAFE_CALL(cudaMalloc((void**) &d_tilted_angle, sizeof(tilted_angle)));
    CUDA_SAFE_CALL(cudaMemcpy(d_tilted_angle, tilted_angle, sizeof(tilted_angle), cudaMemcpyHostToDevice));
    slice_kernel<<<grid, threads>>>(d_tilted_angle);
    CUT_CHECK_ERROR("Kernel execution failed");
    CUDA_SAFE_CALL(cudaThreadSynchronize());
    CUT_CHECK_ERROR("Kernel execution failed");
    // copy the memory back
    CUDA_SAFE_CALL(cudaMemcpy(tilted_angle, d_tilted_angle, sizeof(tilted_angle), cudaMemcpyDeviceToHost));
    // free the allocated memory
    CUDA_SAFE_CALL(cudaFree(d_tilted_angle));
}
And here is the code for the device:
#ifndef _TEMPLATE_KERNEL_H_
#define _TEMPLATE_KERNEL_H_
#include <stdio.h>
__global__ void slice_kernel(double * tilted_angle)
{
    for (int i = 0; i < 513; i++) {
        tilted_angle[i] = 15.5;
    }
    return;
}
#endif // #ifndef _TEMPLATE_KERNEL_H_
Here is my problem: in Debug mode, tilted_angle on the host is all zeros before I call cudaMemcpy (device to host). After the call, tilted_angle holds values like 5.426734841397e-315#DEN.
However, in EmuDebug mode tilted_angle holds values like 15.5 in main.
Does anyone know what is going on?
Thanks
Phil