cudaMemcpy only works in emulation mode.

I am just starting to learn how to use CUDA. I am trying to run some simple example code:

[codebox]
float *ah, *bh, *ad, *bd;

ah = (float *)malloc(sizeof(float)*4);
bh = (float *)malloc(sizeof(float)*4);
cudaMalloc((void **) &ad, sizeof(float)*4);
cudaMalloc((void **) &bd, sizeof(float)*4);

… initialize ah …

/* copy array on device */
cudaMemcpy(ad,ah,sizeof(float)*N,cudaMemcpyHostToDevice);
cudaMemcpy(bd,ad,sizeof(float)*N,cudaMemcpyDeviceToDevice);
cudaMemcpy(bh,bd,sizeof(float)*N,cudaMemcpyDeviceToHost);
[/codebox]

When I run in emulation mode (nvcc -deviceemu) it runs fine (and actually copies the array).

But when I run it in regular mode, it runs without error but never copies the data. It’s as if the cudaMemcpy lines are just ignored.

Am I doing something wrong?

Thank you very much,

Jason

What is the value of N here? Hopefully 4. :) If so, this should work fine…

In the code you are compiling/running, are you doing any error checking to ensure that cudaMalloc() succeeds? If not, try:

#include <stdio.h>
#include <cuda_runtime.h>

...

cudaError_t ret;

/* dev and size stand for your own device pointer and byte count */
if( (ret = cudaMalloc((void**)&dev, size)) != cudaSuccess ) {
	fprintf( stderr, "cudaMalloc() failed: %s\n", cudaGetErrorString( ret ) );
	return -1;
}

...

You can do the same with cudaMemcpy() to see whether it reports an error.