cuMemcpy2D and cuMemcpy3D problem

I’m relatively new to CUDA. My question may be naive, but I have spent the whole day trying to solve this issue.

First of all, there is very little information about the Driver API functions and how to use them. Even the programming guide says very little about some of them, and others are not documented at all.

I’m trying to copy a 2D array from host memory to a 2D CUDA array and bind it to a 2D texture. None of the functions returns an error, but when I read the data back through the texture I get only zeros. I have tried everything, but the result is always zero.

In the cu file I have the following code:


texture<float, 2, cudaReadModeElementType> tex; // 2D texture

__global__ void
GetTextureValue(float x, float y, float* sample)
{
    (*sample) = tex2D(tex, x, y);
}

In my CPP file the code is:


CUresult CudaRes;

//Allocate CUDA Array
ArrDesc.Format = CU_AD_FORMAT_FLOAT;
ArrDesc.NumChannels = 1;
ArrDesc.Width = 200;
ArrDesc.Height = 200;
CudaRes = cuArrayCreate(&GPUArr, &ArrDesc);

CUDA_MEMCPY2D copyParam;
memset(&copyParam, 0, sizeof(copyParam));

float* Data = new float[400];
for (int i = 0; i < 400; i++)
{
    // ...
}

memset(&copyParam, 0, sizeof(copyParam));
copyParam.dstMemoryType = CU_MEMORYTYPE_ARRAY;
copyParam.dstArray = GPUArr;
copyParam.srcMemoryType = CU_MEMORYTYPE_HOST;
copyParam.srcHost = (void*)Data;
copyParam.srcPitch = 200 * sizeof(float);
copyParam.WidthInBytes = copyParam.srcPitch;
copyParam.Height = 200;
CudaRes = cuMemcpy2D(&copyParam);

CudaRes = cuModuleGetTexRef(&DataTex, cuModule, "tex");
CudaRes = cuTexRefSetAddressMode(DataTex, 0, CU_TR_ADDRESS_MODE_WRAP);
CudaRes = cuTexRefSetAddressMode(DataTex, 1, CU_TR_ADDRESS_MODE_WRAP);
CudaRes = cuTexRefSetFilterMode(DataTex, CU_TR_FILTER_MODE_LINEAR);
CudaRes = cuTexRefSetFormat(DataTex, CU_AD_FORMAT_FLOAT, 1);
CudaRes = cuTexRefSetArray(DataTex, GPUArr, CU_TRSA_OVERRIDE_FORMAT);

CudaRes = cuMemAlloc(&TextureLookupOutput, sizeof(float));

for (int i = 0; i < 200; i++)
{
    for (int j = 0; j < 200; j++)
    {
        int offset = 0;

        CudaRes = cuParamSetf(GetTextureValue, offset, i);
        offset += sizeof(float);

        CudaRes = cuParamSetf(GetTextureValue, offset, j);
        offset += sizeof(float);

        // ...
        offset += sizeof(TextureLookupOutput);

        // ...
        float GPUValue;
        // ...
    }
}

Does anyone know what the error is, or what else I can do to track it down?

Don’t you have an enormous size mismatch between the host source array Data (which is 400 floats) and what you are passing to the cuMemcpy2D call (which is going to try and copy 40,000 floats)? I don’t know that it will cause the copy to fail, but it will certainly leave you with the device array full of garbage. The texture association calls might not work if the device data is invalid.
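The mismatch described above can be checked before the copy is issued. The following is only a minimal host-side sketch in plain C++ (no CUDA calls), assuming that a pitched 2D copy reads `Height - 1` full rows of `srcPitch` bytes plus one last row of `WidthInBytes` bytes from the source. The helper names `bytesReadBy2DCopy` and `hostBufferCoversCopy` are hypothetical and are not part of the Driver API.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of bytes a pitched 2D copy reads from its
// source, assuming (height - 1) full rows of `pitch` bytes followed by one
// final row of `widthInBytes` bytes.
std::size_t bytesReadBy2DCopy(std::size_t pitch,
                              std::size_t widthInBytes,
                              std::size_t height)
{
    // Precondition: a row cannot be wider than the stride between rows.
    assert(widthInBytes <= pitch);
    if (height == 0) {
        return 0;
    }
    return pitch * (height - 1) + widthInBytes;
}

// Hypothetical helper: true when a host buffer of `bufferBytes` bytes is
// large enough for the copy described by the other three parameters.
bool hostBufferCoversCopy(std::size_t bufferBytes,
                          std::size_t pitch,
                          std::size_t widthInBytes,
                          std::size_t height)
{
    if (widthInBytes > pitch) {
        return false;  // inconsistent description, reject before computing
    }
    return bytesReadBy2DCopy(pitch, widthInBytes, height) <= bufferBytes;
}
```

With the values from the posted code (a buffer of 400 floats, `srcPitch` and `WidthInBytes` of `200 * sizeof(float)`, `Height` of 200), `hostBufferCoversCopy` returns false, because the copy would read 40,000 floats. With 20 * 20 dimensions and the same 400-float buffer it returns true.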

Sorry, that was a mistake I made when writing the code out for the forum. The array definition and the copy are both for 20 * 20 only. What makes it strange is that the result is always zero. I get the same result when I remove the memory copy step, as if the memory copy isn’t working or has no effect.

I hope someone can help.

After some more trials, I think that none of this code actually takes effect (although every call returns SUCCESS). I mean that creating the array and setting up the texture have no effect: if I remove them, I get the same results!

Any hints on what I should do to find the error?