Static 2D array problem

cudacodemonkey · September 22, 2009, 6:00pm

Hi all,

Put simply, I am trying to do this:

declare a static 2D float array on the host,
transfer it to the device,
access the array on the device just like I would on the host (using [i][j] indexing),
process the array, return it to the host and print it.

I obviously haven't been able to do it, so I am here. My question is whether this is possible at all with CUDA.

If so, what is the problem with my code below:

float st_temp[4][4];
float **dev_temp;

for(int i=0;i<4;i++)
	for(int j=0;j<4;j++)
		st_temp[i][j] = (float)rand()/RAND_MAX;

for(int i=0;i<4;i++)
	for(int j=0;j<4;j++)
		printf("\n THE STATIC ARRAY ELEMENT : %.1f ",st_temp[i][j]);

size_t dev_pitch,st_pitch;

cudaMallocPitch((void**) &dev_temp, &dev_pitch, 4*sizeof(float), 4);

cudaMemcpy2D((void *)dev_temp,dev_pitch,(const void *)st_temp,dev_pitch, 4*sizeof(float *),4, cudaMemcpyHostToDevice);

...........

// gpu kernel
__global__ void kernel(float** a, float** b, float **c)
{
	for(int i=0;i<4;i++)
		for(int j=0;j<4;j++)
			printf("\n THE STATIC ARRAY ELEMENT : %.1f ",c[i][j]);
}

Thanks in advance!!

LSChien · September 23, 2009, 12:45am

The compiler uses a row-major mapping, so st_temp[i][j] is element i*4 + j of the underlying storage, but for that the compiler must know the dimensions of st_temp. That is why you declare

float st_temp[4][4];

However in your kernel,

[codebox]__global__ void kernel(float** a, float** b, float **c)
{
    for(int i=0;i<4;i++)
        for(int j=0;j<4;j++)
            printf("\n THE STATIC ARRAY ELEMENT : %.1f ",c[i][j]);
}[/codebox]

With float **c, the compiler cannot apply that mapping to c[i][j], since no dimension of c is known at compile time.

Question: why not use a 1-D array and do the index transformation manually?
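A minimal sketch of the 1-D approach suggested above, assuming a 4x4 matrix and one thread per element. The names scale_flat and scale_on_device are hypothetical and only serve the illustration; the kernel computes the row-major offset row * cols + col by hand, which is the same mapping the compiler applies to st_temp[i][j] on the host.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: doubles every element of a rows x cols matrix
// that is stored row-major in a flat 1-D buffer.
__global__ void scale_flat(float *m, int rows, int cols)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (row < rows && col < cols)
        m[row * cols + col] *= 2.0f;   // manual equivalent of m[row][col]
}

// Hypothetical host helper: copies a static 4x4 array to the device,
// runs the kernel, and copies the result back into the same array.
cudaError_t scale_on_device(float host[4][4])
{
    const int rows = 4, cols = 4;
    const size_t bytes = rows * cols * sizeof(float);
    float *dev = nullptr;

    cudaError_t err = cudaMalloc((void **)&dev, bytes);
    if (err != cudaSuccess) return err;

    // A static 2D array is contiguous, so one plain cudaMemcpy is enough.
    err = cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        dim3 block(4, 4);              // one thread per element
        dim3 grid(1, 1);
        scale_flat<<<grid, block>>>(dev, rows, cols);
        err = cudaGetLastError();      // reports launch errors
    }
    if (err == cudaSuccess)            // this copy also waits for the kernel
        err = cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);

    cudaFree(dev);
    return err;
}
```

Calling scale_on_device(st_temp) from main would double st_temp in place. Because rows and cols are ordinary kernel arguments, the same kernel works for any matrix size.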

cudacodemonkey · October 7, 2009, 3:43pm

Here is what I did to solve it, in 3 simple steps.

Declare the 2D arrays in main:

float a[N][N],c[N-2][N-2];

Pass the arrays to the kernel:

kernel<<<dimGrid,dimBlock>>>((float(*)[N])a,(float(*)[N-2])c);

Kernel signature:

__global__ void kernel(float a[N][N], float b[N-2][N-2])

Access the arrays in the kernel with regular 2D array syntax.
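The three steps above could be put together roughly as follows. This is only a sketch under assumptions not stated in the post: N is fixed to 4 with a macro, the data is first copied into device memory with cudaMalloc and cudaMemcpy (a kernel needs a pointer it can dereference on the device), and the names add_one and add_one_on_device are made up for the example.

```cuda
#include <cuda_runtime.h>

#define N 4   // assumed compile-time dimension, for illustration only

// The parameter type float a[][N] is adjusted to float (*a)[N], a pointer
// to rows of N floats, so the compiler can resolve a[row][col] itself.
__global__ void add_one(float a[][N])
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (row < N && col < N)
        a[row][col] += 1.0f;           // regular 2D syntax on the device
}

// Hypothetical host helper: round-trips a static N x N array through
// device memory and adds 1 to every element on the GPU.
cudaError_t add_one_on_device(float host[N][N])
{
    float (*dev)[N] = nullptr;         // device pointer to rows of N floats
    const size_t bytes = N * N * sizeof(float);

    cudaError_t err = cudaMalloc((void **)&dev, bytes);
    if (err != cudaSuccess) return err;

    err = cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        dim3 block(N, N);              // one thread per element
        add_one<<<1, block>>>(dev);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);

    cudaFree(dev);
    return err;
}
```

Declaring the device pointer as float (*)[N] avoids the cast at the launch site; the price is that the kernel only works for the one N it was compiled with.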

Note that I was more concerned about the syntax than about the internal implementation and what it means.

Hope it helps those who care about syntax like me. :)

Thanks for your interest.

-peace

LSChien · October 8, 2009, 12:58am

I tried a simple case, a data copy, and it works:

static __global__ void test_copy2D( doublereal odata[N][N], doublereal idata[N][N])
{
	// each thread copies one element; no shared memory is used here
	unsigned int xIndex = blockIdx.x * BLOCK_DIM + threadIdx.x;
	unsigned int yIndex = blockIdx.y * BLOCK_DIM + threadIdx.y;

	if((xIndex < N) && (yIndex < N))
	{
		odata[yIndex][xIndex] = idata[yIndex][xIndex];
	}
}

Could you post your whole code?

Besides, fixing the dimension in the kernel function signature is not good, since the kernel then works for that one size only.

sigismondo · October 8, 2009, 7:28am

Hi,

2D arrays are not double pointers (pointers to pointers, arrays of arrays): they are used in the same way, with [][], but they are different things. Since you are speaking of static arrays, it is enough to replace the kernel's parameter declarations:

__global__ void kernel(float a[][N], float b[][N], float c[][N])

Now the compiler will know how to access the elements, but notice that you will not be able to work with “pitched” arrays: if N is an awkward size, you are likely to run into performance trouble. See the CUDA Programming Guide section on cudaMallocPitch() and the related example for reference.
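For comparison, here is a rough sketch of the pitched variant, assuming a 4x4 float matrix; negate_pitched and negate_on_device are hypothetical names. Two details differ from the code in the first post: the source pitch passed to cudaMemcpy2D is the host row size in bytes (the host array has no padding), and the copy width is 4*sizeof(float), not 4*sizeof(float *). In the kernel, each row is located by adding row * pitch bytes to the base pointer.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: negates each element of a pitched rows x cols matrix.
// pitch is the row stride in BYTES, as returned by cudaMallocPitch.
__global__ void negate_pitched(float *base, size_t pitch, int rows, int cols)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (row < rows && col < cols) {
        // Step to the start of the row in bytes, then index as floats.
        float *row_ptr = (float *)((char *)base + row * pitch);
        row_ptr[col] = -row_ptr[col];
    }
}

// Hypothetical host helper: round-trips a static 4x4 array through
// pitched device memory and negates it on the GPU.
cudaError_t negate_on_device(float host[4][4])
{
    const int rows = 4, cols = 4;
    const size_t row_bytes = cols * sizeof(float);   // unpadded host row
    float *dev = nullptr;
    size_t dev_pitch = 0;

    cudaError_t err = cudaMallocPitch((void **)&dev, &dev_pitch, row_bytes, rows);
    if (err != cudaSuccess) return err;

    // Destination pitch is the device pitch; source pitch is the host row size.
    err = cudaMemcpy2D(dev, dev_pitch, host, row_bytes,
                       row_bytes, rows, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        dim3 block(4, 4);                            // one thread per element
        negate_pitched<<<1, block>>>(dev, dev_pitch, rows, cols);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)
        err = cudaMemcpy2D(host, row_bytes, dev, dev_pitch,
                           row_bytes, rows, cudaMemcpyDeviceToHost);

    cudaFree(dev);
    return err;
}
```

The [i][j] syntax is lost here because the padded row length is only known at run time, which is the trade-off mentioned above.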

Sarnath · October 8, 2009, 9:07am

I am glad I found one more monkey admirer in this forum!
