Using 2D array in CUDA

kk1ets · July 13, 2015, 3:01pm

Hi all! I am new to CUDA. Can any of you give me an example of how to access the member elements of a 2D array which I declared using cudaMallocPitch? I have tried using:

cudaMallocPitch(&dev_arr, &pitch_arr, colsize*sizeof(int), rowsize);
cudaMemcpy2D(dev_arr, pitch_arr, host_arr, colsize*sizeof(int), colsize*sizeof(int), rowsize, cudaMemcpyHostToDevice);

for(int i=0;i<rowsize;i++)
{
    int *row = (int *)((char *)dev_arr + pitch_arr*i);
    for(int j=0;j<colsize;j++)
    {
        int sample = row[j];
    }
}

But this does not run; it returns error code 77. Can you please point out the mistake in the above code snippet?

Thanks in advance!

cbuchner1 · July 13, 2015, 3:21pm

CUDA arrays are opaque objects on the GPU, with data being reordered using a proprietary space filling curve.

http://docs.nvidia.com/cuda/cuda-c-programming-guide/#axzz3fmjw7brr

The only way to access the values on the device would be by binding the CUDA array to a surface or texture. Surfaces also allow write access (without cache coherence), as far as I know.

NOTE: only a CUDA array created with the cudaArraySurfaceLoadStore flag can be read and written via a surface object or surface reference.

EDIT: hmm, after looking at your code again I notice you indeed use 2D pitched memory, not cudaArrays. Hence please disregard what I wrote above.

Robert_Crovella · July 13, 2015, 3:36pm

You should provide a short, complete code if you want help.

kk1ets · July 13, 2015, 8:29pm

Thank you all for your views.

Here is some sample code, similar to what I am working on.

#include <cstdio>
#include <cuda_runtime.h>

// Forward declarations so each function is declared before it is used
cudaError_t SOLVE(int **CONNEC, int connrow, int conncol);
__global__ void getSOL(int *dev_CONNEC, size_t pitch_CONNEC, int connrow, int conncol);
__device__ void XY(int *dev_CONNEC, size_t pitch_CONNEC, int connrow, int conncol, int id);

int main()
{
    int connrow = 8, conncol = 7;
    int **CONNEC; // connrow*conncol 2D matrix -- sample values given below
    SOLVE(CONNEC, connrow, conncol);
    return 0;
}

cudaError_t SOLVE(int **CONNEC, int connrow, int conncol)
{
    cudaError_t cudaStatus;
    int *dev_CONNEC;
    size_t pitch_CONNEC;
    cudaMallocPitch(&dev_CONNEC, &pitch_CONNEC, conncol*sizeof(int), connrow);
    cudaMemcpy2D(dev_CONNEC, pitch_CONNEC, CONNEC, conncol*sizeof(int), conncol*sizeof(int), connrow, cudaMemcpyHostToDevice);
    // Usually I find pitch_CONNEC=512
    cudaStatus = cudaGetLastError();
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "1---kernel launch failed: %s\n", cudaGetErrorString(cudaStatus));
        return cudaStatus;
    }
    getSOL<<<1,16>>>(dev_CONNEC, pitch_CONNEC, connrow, conncol); // parallelising according to connrow
    return cudaDeviceSynchronize(); // wait for the kernel so its printf output is flushed
}

// kernel function
__global__ void getSOL(int *dev_CONNEC, size_t pitch_CONNEC, int connrow, int conncol)
{
    int id = blockIdx.x*blockDim.x + threadIdx.x;
    if (id < connrow)
        XY(dev_CONNEC, pitch_CONNEC, connrow, conncol, id);
}

// device function
__device__ void XY(int *dev_CONNEC, size_t pitch_CONNEC, int connrow, int conncol, int id)
{
    int sg = 3;

    int cpt = 0;
    cpt += id;

    for (int i=0;i<sg;i++)
    {
        int *row_CONNEC = (int *)((char*)dev_CONNEC + cpt * pitch_CONNEC)+i;
        int nd = *row_CONNEC; printf ("\nnd = %d",nd);
    }
}

Sample CONNEC matrix:
19 1 11 3 2 5 4
8 27 11 9 12 7 10
28 16 19 15 17 20 18
31 28 27 32 29 30 33
19 11 23 5 13 21 6
19 23 28 21 25 20 22
28 23 27 25 24 29 26
27 23 11 24 13 12 14

I usually get some random values when “nd” is printed. I would like to know if my way of accessing the array “dev_CONNEC” in the global memory is correct or not.

Robert_Crovella · July 13, 2015, 9:00pm

Nobody would have been able to discover your problem based on the original code you posted.

This is still not a complete code, since you haven’t bothered to show how you allocate and initialize the matrix associated with CONNEC. But we can make some headway.

CONNEC is a pointer to a pointer (**)

According to the documentation:

http://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY.html#group__CUDART__MEMORY_1g3a58270f6775efe56c65ac47843e7cee

Does the 3rd parameter of cudaMemcpy2D expect a pointer-to-pointer argument?

Suggestions:

Store your host data using a single-pointer argument. It’s not the only way to do it, but it’s simplest.
Do proper CUDA error checking on all CUDA API calls and all kernel calls. If you’re not sure what proper CUDA error checking is, google “proper cuda error checking”.
Run your code with cuda-memcheck.
Start with a kernel that just prints out the data you transferred to the device. Once you have that working, computations will be easier to tackle.
Review a CUDA sample code that demonstrates proper use of cudaMallocPitch/cudaMemcpy2D, such as the bilateralFilter sample.

kk1ets · July 14, 2015, 6:01pm

Thank you very much for the suggestions. I tried to copy back the copied data from the device to the host (using another host pointer-to-pointer) and I found the data was actually copied successfully (the values matched). But I am facing problems while accessing the 2D array in the device. Is

int *row_CONNEC = (int *)((char*)dev_CONNEC + cpt * pitch_CONNEC)+i;

the correct way of accessing the 2D array (stored in the global memory)?

Please note that previously CONNEC pointer-to-pointer in the host was initialised like this:

int **CONNEC = (int **)malloc(connrow*sizeof(int *));
for(int i=0;i<connrow;i++)
{
   CONNEC[i] = (int *)malloc(conncol*sizeof(int));
}
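
For reference, the byte-stride arithmetic in the row_CONNEC expression quoted above can be checked on the host without a GPU. The following is only a minimal sketch under stated assumptions: an ordinary host buffer stands in for device memory, and pitched_elem and copy_to_pitched are hypothetical helper names, not CUDA API calls.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: address of element (row, col) in a pitched 2D buffer.
// `pitch` is the row stride in BYTES, so the row offset is applied to a
// char pointer first; the column offset is then applied in units of int.
inline int *pitched_elem(int *base, std::size_t pitch, int row, int col)
{
    char *row_start = reinterpret_cast<char *>(base) + static_cast<std::size_t>(row) * pitch;
    return reinterpret_cast<int *>(row_start) + col;
}

// Hypothetical host-side stand-in for a packed-to-pitched 2D copy:
// `src` is tightly packed row-major, `dst` has rows `pitch` bytes apart.
inline void copy_to_pitched(int *dst, std::size_t pitch, const int *src, int rows, int cols)
{
    assert(pitch % sizeof(int) == 0);                               // rows stay int-aligned
    assert(pitch >= static_cast<std::size_t>(cols) * sizeof(int));  // a row fits in one pitch
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            *pitched_elem(dst, pitch, r, c) = src[r * cols + c];
}
```

With a pitch of 512 bytes (the value reported earlier in the thread) and a 4-byte int, row r starts 128 * r ints past the base rather than 7 * r, which is why the row offset has to be computed in bytes before casting back to int *.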

Robert_Crovella · July 15, 2015, 2:52am

You seem to have ignored my warnings about using a double pointer. You’re welcome to continue on your path if you wish. I’ve already pointed out that it won’t work (yes, I realize you claim it does. We can agree to disagree. I believe the documentation I linked to is on my side.) The double pointer allocation for CONNEC will not work with any cudaMemcpy function. There are no cudaMemcpy functions that know how to chase a double pointer. Since you don’t seem to want to provide a complete code that someone else could test, I’ll leave it at that.
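
One way to move from the row-by-row malloc layout to a single pointer is to copy the rows into one contiguous row-major buffer on the host first. This is a minimal sketch, not code from the thread; flatten_rows is a hypothetical helper name, and it assumes every row holds exactly ncols valid ints.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copy a matrix stored as an array of row pointers
// (each row allocated separately, so rows are NOT contiguous in memory)
// into one contiguous row-major buffer.
inline std::vector<int> flatten_rows(const int *const *rows, int nrows, int ncols)
{
    assert(rows != nullptr && nrows >= 0 && ncols >= 0);
    std::vector<int> flat(static_cast<std::size_t>(nrows) * static_cast<std::size_t>(ncols));
    for (int r = 0; r < nrows; ++r)
        for (int c = 0; c < ncols; ++c)
            flat[static_cast<std::size_t>(r) * ncols + c] = rows[r][c];  // element (r, c)
    return flat;
}
```

The resulting flat.data() is a plain int * whose rows are ncols*sizeof(int) bytes apart, which is the kind of source pointer and source pitch that cudaMemcpy2D is documented to take.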

kk1ets · July 21, 2015, 7:00pm

Hi! Thanks a lot for your advice. I converted all host arrays to 1-dimensional pointers and was successfully able to copy them to the device arrays. I am also getting the required final output.
