Newbie: Error during device-to-host memcpy

Hello all.

I’m a newbie trying to code something in CUDA as part of my project.
Please see the attached code.

Here I’m trying to do some small-scale image manipulation.

1. The input images (500 files, 64x64 pixels, 16 bits per pixel, raw format) are read into the ‘fpInputFrames’ array.
2. The array is copied to the GPU into fpInputFramesGpu (line 155).
3. An output array, fpOutputFramesGpu, is allocated.
4. The kernel is run. (I used a grid of 64 blocks with 64 threads each, so that each thread handles exactly one pixel. In the uploaded code this is set to 1, which I changed while testing.) The kernel is invoked in a loop (commented out for now).
5. The processed data is copied from device to host.
6. The resources are freed.

The problem shows up in debug mode (not in emulation mode):

The device-to-host memory copy either copies junk or returns cudaErrorLaunchFailure.

What could be the reason? :sad: :crying:

Thanks in advance.

Try a CUT_CHECK_ERROR(" some message "); right after your ComputeFrames invocation. Kernel launches are asynchronous, so if the memcopy reports a launch failure, then most likely the last kernel invocation failed for some reason.
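For readers without the SDK's cutil helpers, the same check can be sketched with the plain CUDA runtime API. This is only an illustration under stated assumptions: CHECK_CUDA is a hypothetical macro written for this example, ZeroBuffer is a placeholder kernel standing in for ComputeFrames, and cudaDeviceSynchronize is the current name of the synchronization call (older toolkits used cudaThreadSynchronize).

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CUDA or cutil): abort with a message
// if a runtime API call returns anything other than cudaSuccess.
#define CHECK_CUDA(call)                                            \
    do {                                                            \
        cudaError_t err_ = (call);                                  \
        if (err_ != cudaSuccess) {                                  \
            fprintf(stderr, "%s:%d: %s failed: %s\n", __FILE__,     \
                    __LINE__, #call, cudaGetErrorString(err_));     \
            exit(EXIT_FAILURE);                                     \
        }                                                           \
    } while (0)

// Placeholder kernel standing in for ComputeFrames.
__global__ void ZeroBuffer(float* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = 0.0f;
}

int main()
{
    const int n = 64 * 64;
    static float host[64 * 64];
    float* dData = NULL;
    CHECK_CUDA(cudaMalloc((void**)&dData, n * sizeof(float)));

    ZeroBuffer<<<64, 64>>>(dData, n);
    CHECK_CUDA(cudaGetLastError());       // errors in the launch itself (bad configuration, etc.)
    CHECK_CUDA(cudaDeviceSynchronize());  // errors raised while the kernel was running

    // If the kernel had failed, the error would have been reported above
    // rather than being blamed on this copy.
    CHECK_CUDA(cudaMemcpy(host, dData, n * sizeof(float), cudaMemcpyDeviceToHost));
    CHECK_CUDA(cudaFree(dData));
    printf("kernel and copy completed without error\n");
    return 0;
}
```

Checking immediately after the launch separates kernel failures from genuine copy failures, which is the distinction the suggestion above is after.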

__global__ static void ComputeFrames( int nSliceProjectionZ, int wframeNum, float*& ppbtFrameData, float*& m_pppVolumetricData )
{
	int nRowIndexY = blockIdx.x;
	int nColumnIndexX = threadIdx.x;

	unsigned int uiIndx = nSliceProjectionZ * VOLUME_DIM_X * VOLUME_DIM_Y + nRowIndexY * VOLUME_DIM_X + nColumnIndexX;

	m_pppVolumetricData[ uiIndx ] = 0;

	/** commented code */
}


What’s the point of the “float*&”? I’m not sure about this, but it looks like you are writing to the address that the first value stored in m_pppVolumetricData points to (and that value is set to zero by your cudaMemset). Again, I’m not sure about this, but I would suggest removing the “&”; it might help.
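A minimal sketch of the suggested change, with the pointers passed by value so the kernel receives the device addresses directly. A plausible reading of the failure is that a reference parameter refers to the pointer variable in host memory, which the GPU cannot dereference, while emulation mode runs everything on the host. Several things here are assumptions made for illustration: NUM_SLICES and the 64x64 dimensions behind VOLUME_DIM_X and VOLUME_DIM_Y, the dropped wframeNum parameter, the shortened parameter names, and the kernel body, which copies the input frame into the selected slice because the original body is commented out.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Assumed sizes: 64x64 frames as described in the post; the slice count is made up.
#define VOLUME_DIM_X 64
#define VOLUME_DIM_Y 64
#define NUM_SLICES   4

// Plain pointers, passed by value: no "&" on the parameters.
__global__ void ComputeFrames(int nSliceProjectionZ, const float* frameData, float* volumetricData)
{
    int nRowIndexY = blockIdx.x;      // one block per row
    int nColumnIndexX = threadIdx.x;  // one thread per pixel in the row
    unsigned int uiPixel = nRowIndexY * VOLUME_DIM_X + nColumnIndexX;
    unsigned int uiIndx = nSliceProjectionZ * VOLUME_DIM_X * VOLUME_DIM_Y + uiPixel;
    volumetricData[uiIndx] = frameData[uiPixel];  // illustrative body only
}

int main()
{
    const int framePixels = VOLUME_DIM_X * VOLUME_DIM_Y;
    const int volumePixels = NUM_SLICES * framePixels;
    static float hostFrame[VOLUME_DIM_X * VOLUME_DIM_Y];
    static float hostVolume[NUM_SLICES * VOLUME_DIM_X * VOLUME_DIM_Y];
    for (int i = 0; i < framePixels; ++i) hostFrame[i] = (float)i;

    float* dFrame = NULL;
    float* dVolume = NULL;
    cudaMalloc((void**)&dFrame, framePixels * sizeof(float));
    cudaMalloc((void**)&dVolume, volumePixels * sizeof(float));
    cudaMemcpy(dFrame, hostFrame, framePixels * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(dVolume, 0, volumePixels * sizeof(float));

    // One launch per slice: 64 blocks of 64 threads, one thread per pixel.
    for (int z = 0; z < NUM_SLICES; ++z)
        ComputeFrames<<<VOLUME_DIM_Y, VOLUME_DIM_X>>>(z, dFrame, dVolume);

    // cudaMemcpy waits for the kernels, so a kernel failure would surface here.
    cudaError_t err = cudaMemcpy(hostVolume, dVolume, volumePixels * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "copy back failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Every slice should now hold a copy of the input frame.
    int mismatches = 0;
    for (int i = 0; i < volumePixels; ++i)
        if (hostVolume[i] != hostFrame[i % framePixels]) ++mismatches;
    printf("mismatches: %d\n", mismatches);

    cudaFree(dFrame);
    cudaFree(dVolume);
    return mismatches == 0 ? 0 : 1;
}
```

The host code passes dFrame and dVolume exactly as returned by cudaMalloc; nothing on the device ever needs to read a host-side variable.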


Thank you Vrahok …

I removed the ‘&’ and it works now. (Though it had already worked in emulation mode :dry: )