Problems passing doubles to/from kernel - they become 0!

mgibbons · November 20, 2008, 9:53am

Wonder if anyone can help with this.

The test rig - which I’ve attached - seems to indicate that there is a problem with passing doubles between the host and the device.

Note that this code all works fine when run in emulation, but we see the same (wrong) results when running on either an NVS290 or a C1060.

You can see from the code that the double array is set correctly to 3 on the device and then copied into the integer array, but when the double array is copied back to the host it is all zeros!

Similarly, if I pass a double into the kernel as an argument, it always appears to be zero when used in the kernel code.

When I change the type from double to float it all works!

[font=“Courier New”]

=== Kernel =======================================

#define VARTYPE double

__global__ void Init1(
    VARTYPE *d_array,
    int *d_iArray,
    int ivalue,
    VARTYPE value,
    int size
){
    //Thread index
    const int tid = blockDim.x * blockIdx.x + threadIdx.x;

    if(tid < size)
    {
        d_array[tid] = 3.0f;           // double array shown as zero on host

        d_iArray[tid] = d_array[tid];  // integer array is set correctly using this line

        //d_iArray[tid] = value;       // integer array is set to zero using this line
    }
}

=== Main program ======================================

#define SIZE 1000

#define VARTYPE double

int main(int argc, char **argv){
//‘h_’ prefix - CPU (host) memory space
VARTYPE *h_results;
int *h_Iresults;

//'d_' prefix - GPU (device) memory space
VARTYPE *d_vector;
int    *d_IVector;


double gpuTime;

unsigned int hTimer;
int i;

CUT_DEVICE_INIT(argc, argv);
CUT_SAFE_CALL( cutCreateTimer(&hTimer) );

h_results = (VARTYPE *)malloc(SIZE*sizeof(VARTYPE));
h_Iresults = (int *)malloc(SIZE*sizeof(int));

CUDA_SAFE_CALL( cudaMalloc((void**)&d_vector, SIZE*sizeof(VARTYPE)));
CUDA_SAFE_CALL( cudaMalloc((void**)&d_IVector, SIZE*sizeof(int)));

CUDA_SAFE_CALL( cudaThreadSynchronize() );
CUT_SAFE_CALL( cutResetTimer(hTimer) );
CUT_SAFE_CALL( cutStartTimer(hTimer) );

Init1<<<32, 256>>>(
    d_vector,
    d_IVector,
    2,
    1.0,
    SIZE
);

CUT_CHECK_ERROR("execution failed\n");
CUDA_SAFE_CALL( cudaThreadSynchronize() );
CUT_SAFE_CALL( cutStopTimer(hTimer) );
gpuTime = cutGetTimerValue(hTimer);

printf("Reading back GPU results...\n");
//Read back GPU results to compare them to CPU results
CUDA_SAFE_CALL( cudaMemcpy(h_results, d_vector, SIZE*sizeof(VARTYPE), cudaMemcpyDeviceToHost) );
CUDA_SAFE_CALL( cudaMemcpy(h_Iresults, d_IVector, SIZE*sizeof(int), cudaMemcpyDeviceToHost) );

for(i=0; i < SIZE; i++)
    printf("i=%d vi=%d vd=%.3f; ", i, h_Iresults[i], h_results[i]);
printf("\n");


printf("Shutting down...\n");
printf("...releasing GPU memory.\n");
CUDA_SAFE_CALL( cudaFree(d_vector)  );
CUDA_SAFE_CALL( cudaFree(d_IVector)  );

printf("...releasing CPU memory.\n");
free(h_results);
free(h_Iresults);
CUT_SAFE_CALL( cutDeleteTimer(hTimer) );
printf("Shutdown done.\n");

CUT_EXIT(argc, argv);

}
[/font]

MisterAnderson42 · November 20, 2008, 1:15pm

Compile with -arch sm_13 when using doubles.
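
Some background on why the flag matters, as far as I understand it: native double precision first appeared with compute capability 1.3, and without an explicit `-arch sm_13` the nvcc of that era targeted an older architecture and demoted doubles to floats in device code, so host and device no longer agreed on the size of a `VARTYPE`. The flag cannot help on a card below 1.3 (I believe the NVS290 is one of those, while the C1060 is a 1.3 part). A minimal sketch for checking what your cards report at run time; the file name `check_double.cu` and the helper `supportsDouble` are made up for illustration and are not part of the original test rig:

```cuda
// Build with the flag from this thread (check_double.cu is a hypothetical name):
//   nvcc -arch sm_13 check_double.cu -o check_double
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: true if a device with this compute capability
// has native double-precision support (1.3 or later).
static bool supportsDouble(int major, int minor)
{
    return major > 1 || (major == 1 && minor >= 3);
}

int main()
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        printf("No CUDA device found.\n");
        return 0;
    }

    // Report every device, since a machine can mix capable and
    // non-capable cards.
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess)
            continue;
        printf("Device %d (%s): compute capability %d.%d, doubles %s\n",
               dev, prop.name, prop.major, prop.minor,
               supportsDouble(prop.major, prop.minor) ? "supported"
                                                      : "NOT supported");
    }
    return 0;
}
```

If a device reports less than 1.3, keeping `VARTYPE` as float is the only option on that card, whatever the compiler flags.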

mgibbons · November 20, 2008, 4:06pm

Yep. That was it. Many thanks. Shame that it’s not actually documented anywhere except in the example makefiles!
