memcpy problem

Hi,

I have this function:

extern "C" void cuMagnitude(cufftComplex *src, int *dst, int len){
	CUT_DEVICE_INIT();

	cufftComplex *srcDevice;
	int *dstDevice;

	CUDA_SAFE_CALL(cudaMalloc((void**)&srcDevice, len*sizeof(cufftComplex)));
	CUDA_SAFE_CALL(cudaMalloc((void**)&dstDevice, len*sizeof(int)));

	CUDA_SAFE_CALL(cudaMemcpy(srcDevice, src, len*sizeof(cufftComplex), cudaMemcpyHostToDevice));

	for(int i=0;i<len;i++){
		dstDevice[i]=sqrt(powf(srcDevice[i].x,2)+powf(srcDevice[i].y,2))/32768+0.5;
	}

	CUDA_SAFE_CALL(cudaMemcpy(dst, dstDevice, len*sizeof(int), cudaMemcpyDeviceToHost));

	CUDA_SAFE_CALL(cudaFree(srcDevice));
	CUDA_SAFE_CALL(cudaFree(dstDevice));
}

The pointers look allocated, with valid addresses, at the beginning of the function. But when it gets to

CUDA_SAFE_CALL(cudaMemcpy(srcDevice, src, len*sizeof(cufftComplex), cudaMemcpyHostToDevice));

srcDevice becomes invalid and I can’t access it.

What can be the problem?

EDIT: I changed the order of the variables, and now the problem is in dstDevice, but on the same line.

Also, the program continues after this line, but srcDevice is never set correctly.

I don’t understand these CUDA problems :\

You cannot directly read from or write to device memory from the host. Only memory copies to and from the device can be done.

So I have to put all the math code in a __global__ function, is that it?

Is that mentioned in the programming guide or somewhere? I didn’t know that,
and I think it is the source of all my problems :S

Thank you

Yeah, this is pretty fundamental to CUDA. Device memory is on the graphics card, separated from the CPU by the PCI-Express bus, so you have to explicitly copy data to or from the device when you need it. All other operations on device memory happen in __global__ functions (kernels).
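
For illustration, here is a rough sketch of how the function from the first post might look with the loop moved into a kernel. It is only one possible arrangement, not a definitive fix: the kernel name magnitudeKernel, the CHECK_CUDA macro (a hypothetical stand-in for the SDK's CUDA_SAFE_CALL), the const on src, and the block size of 256 are all assumptions that do not come from the original code. The arithmetic is the same formula, done in single precision.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>
#include <cufft.h>   // only needed for the cufftComplex type

// Hypothetical helper (assumption): stands in for the SDK's CUDA_SAFE_CALL.
#define CHECK_CUDA(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s (%s:%d)\n",               \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Runs on the device, one thread per element, so the device
// pointers src and dst may be dereferenced here.
__global__ void magnitudeKernel(const cufftComplex *src, int *dst, int len)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < len) {
        float mag = sqrtf(src[i].x * src[i].x + src[i].y * src[i].y);
        // Scale, add 0.5, then truncate to int, as in the original loop.
        dst[i] = (int)(mag / 32768.0f + 0.5f);
    }
}

// Runs on the host: it only allocates, copies, and launches.
// It never dereferences srcDevice or dstDevice itself.
extern "C" void cuMagnitude(const cufftComplex *src, int *dst, int len)
{
    if (len <= 0) return;

    cufftComplex *srcDevice = NULL;
    int *dstDevice = NULL;

    CHECK_CUDA(cudaMalloc((void **)&srcDevice, len * sizeof(cufftComplex)));
    CHECK_CUDA(cudaMalloc((void **)&dstDevice, len * sizeof(int)));

    CHECK_CUDA(cudaMemcpy(srcDevice, src, len * sizeof(cufftComplex),
                          cudaMemcpyHostToDevice));

    const int threads = 256;                           // arbitrary block size
    const int blocks = (len + threads - 1) / threads;  // round up to cover len
    magnitudeKernel<<<blocks, threads>>>(srcDevice, dstDevice, len);
    CHECK_CUDA(cudaGetLastError());                    // catches launch errors

    // This copy waits for the kernel to finish before reading the results.
    CHECK_CUDA(cudaMemcpy(dst, dstDevice, len * sizeof(int),
                          cudaMemcpyDeviceToHost));

    CHECK_CUDA(cudaFree(srcDevice));
    CHECK_CUDA(cudaFree(dstDevice));
}
```

The host function keeps the same allocate, copy in, compute, copy out, free order as the original; the only structural change is that the for loop is replaced by a kernel launch in which each thread handles one index.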

I understood the need to copy data from host to device from all the samples I saw, but I thought I could put the device code in the same function.

Maybe the people who write the programming guide could put this explanation there?
Or, if it is already there, could someone point me to the right place? I didn’t find it.

Thanks again

This is the first relevant quote I found, in section 4.2.2.4 of the CUDA 2.0 guide (I'm not sure what the section number is in earlier guides):

“Dereferencing a pointer either to global or shared memory in code that is executed
on the host or to host memory in code that is executed on the device results in an
undefined behavior, most often in a segmentation fault and application termination.”