Hi all,
Put simply, I am trying to do this:
declare a static 2D float array on the host,
transfer it to the device,
access the array on the device just like I would on the host (using [i][j]),
process the array, return it to the host and print it.
I obviously haven't been able to do it, so I am here. My question is whether this is possible at all with CUDA.
If so, what is the problem with my code below?
float st_temp[4][4];
float **dev_temp;
for(int i=0;i<4;i++)
    for( int j=0;j<4;j++)
        st_temp[i][j] = (float)rand()/RAND_MAX;
for(int i=0;i<4;i++)
    for(int j=0;j<4;j++)
        printf("\n THE STATIC ARRAY ELEMENT : %.1f ",st_temp[i][j]);
size_t dev_pitch,st_pitch;
cudaMallocPitch((void**) &dev_temp, &dev_pitch, 4*sizeof(float), 4);
cudaMemcpy2D((void *)dev_temp,dev_pitch,(const void *)st_temp,dev_pitch, 4*sizeof(float *),4, cudaMemcpyHostToDevice);
...........
// gpu kernel
__global__ void kernel(float** a, float** b, float **c)
{
    for(int i=0;i<4;i++)
        for(int j=0;j<4;j++)
            printf("\n THE STATIC ARRAY ELEMENT : %.1f ",c[i][j]);
}
Thanks in advance!!
LSChien
September 23, 2009, 12:45am
The compiler uses a row-major mapping, so st_temp[i][j] is element i*4 + j of the underlying storage, but for that the compiler must
know the dimensions of st_temp. That is why you declare
float st_temp[4][4];
However, in your kernel,
[codebox]__global__ void kernel(float** a, float** b, float **c)
{
    for(int i=0;i<4;i++)
        for(int j=0;j<4;j++)
            printf("\n THE STATIC ARRAY ELEMENT : %.1f ",c[i][j]);
}[/codebox]
the compiler cannot apply that mapping to c[i][j], since no dimension of c is known at compile time.
Question: why not use a 1-D array and do the index transformation manually?
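For anyone who wants to try the 1-D route, here is a minimal sketch of it. It assumes a 4x4 array like the one in the question; the names scale_flat and run_flat_demo are made up for this example, and the doubling is just a stand-in for "process the array". Because a static 2D array is contiguous on the host, a plain cudaMalloc plus cudaMemcpy is enough here.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define ROWS 4
#define COLS 4

// One thread per element; (row, col) is mapped by hand to row * cols + col.
__global__ void scale_flat(float *data, int rows, int cols, float factor)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (row < rows && col < cols)
        data[row * cols + col] *= factor;
}

// Hypothetical helper: copies a static 2D host array to the device,
// doubles every element there and copies it back.
// Returns 0 on success, 1 on a CUDA error or a wrong result.
int run_flat_demo(void)
{
    float st_temp[ROWS][COLS];
    float expected[ROWS][COLS];
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++) {
            st_temp[i][j] = (float)rand() / RAND_MAX;
            expected[i][j] = st_temp[i][j] * 2.0f;
        }

    float *dev_temp = NULL;
    size_t bytes = ROWS * COLS * sizeof(float);
    if (cudaMalloc((void **)&dev_temp, bytes) != cudaSuccess)
        return 1;

    // The static array is contiguous, so one flat copy transfers all rows.
    cudaError_t err = cudaMemcpy(dev_temp, st_temp, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        dim3 block(COLS, ROWS);
        scale_flat<<<1, block>>>(dev_temp, ROWS, COLS, 2.0f);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(st_temp, dev_temp, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev_temp);
    if (err != cudaSuccess)
        return 1;

    // On the host the result can be read back with the usual [i][j] syntax.
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            if (st_temp[i][j] != expected[i][j])
                return 1;
    return 0;
}

int main(void)
{
    int rc = run_flat_demo();
    printf("flat index demo %s\n", rc == 0 ? "passed" : "failed");
    return rc;
}
```

Passing rows and cols as kernel arguments keeps the kernel usable for other sizes, at the cost of writing the index arithmetic by hand.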
Here is what I did to solve it, in 3 simple steps.
Declare the 2D arrays in main:
float a[N][N],c[N-2][N-2];
Pass the arrays to the kernel:
kernel<<<dimGrid,dimBlock>>>((float(*)[N])a,(float(*)[N-2])c);
Kernel signature:
__global__ void kernel(float a[N][N], float b[N-2][N-2])
Access the arrays in the kernel with regular 2D array syntax.
Note that I was more concerned about the syntax than about the internal implementation and what it means.
Hope it helps those who are concerned about syntax like me. :)
Thanks for your interest.
-peace
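A minimal sketch of those three steps as one complete program follows. The post does not show where the arrays are allocated, so this sketch assumes the array handed to the kernel lives in device memory obtained with cudaMalloc and filled with cudaMemcpy; N = 4 and the names add_one and run_array_param_demo are made up for the example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N 4  // assumed compile-time dimension

// As a parameter, float a[][N] means "pointer to rows of N floats",
// so the compiler can resolve a[i][j] on the device.
__global__ void add_one(float a[][N])
{
    int j = threadIdx.x;
    int i = threadIdx.y;
    if (i < N && j < N)
        a[i][j] += 1.0f;
}

// Hypothetical helper: returns 0 on success, 1 on a CUDA error or a wrong result.
int run_array_param_demo(void)
{
    float h_a[N][N];
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            h_a[i][j] = (float)(i * N + j);

    // Device pointer with the same "rows of N floats" type as the kernel parameter.
    float (*d_a)[N] = NULL;
    if (cudaMalloc((void **)&d_a, sizeof(h_a)) != cudaSuccess)
        return 1;

    cudaError_t err = cudaMemcpy(d_a, h_a, sizeof(h_a), cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        dim3 block(N, N);
        add_one<<<1, block>>>(d_a);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(h_a, d_a, sizeof(h_a), cudaMemcpyDeviceToHost);
    cudaFree(d_a);
    if (err != cudaSuccess)
        return 1;

    // Every element should have been incremented exactly once.
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            if (h_a[i][j] != (float)(i * N + j) + 1.0f)
                return 1;
    return 0;
}

int main(void)
{
    int rc = run_array_param_demo();
    printf("array parameter demo %s\n", rc == 0 ? "passed" : "failed");
    return rc;
}
```

Declaring the device pointer as float (*)[N] avoids the cast at the launch site; the trade-off is that N is baked into the kernel at compile time.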
I tried a simple case, a data copy, and it works:
static __global__ void test_copy2D( doublereal odata[N][N], doublereal idata[N][N])
{
    // each thread copies one element from idata to odata
    unsigned int xIndex = blockIdx.x * BLOCK_DIM + threadIdx.x;
    unsigned int yIndex = blockIdx.y * BLOCK_DIM + threadIdx.y;
    if((xIndex < N) && (yIndex < N))
    {
        odata[yIndex][xIndex] = idata[yIndex][xIndex];
    }
}
Could you post your whole code?
Besides, fixing the dimension in the kernel function signature is not good, since the kernel then only works for that one compile-time size.
Hi,
2D arrays are not double pointers (pointers to pointers, arrays of pointers): they are used in the same way, with [][], but they are different things. Since you are speaking of static arrays, it is enough to replace the parameter declarations:
__global__ void kernel(float a[][N], float b[][N], float c[][N])
Now the compiler will know how to access the elements, but notice that you will not be able to work with “pitched” arrays: if N is an awkward size (odd, for example), you are going to encounter performance troubles. See the CUDA programming guide on cudaMallocPitch() and the related example for reference.
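For the pitched case, a minimal sketch along the lines of the programming guide's cudaMallocPitch() example could look like this. The 4x5 size and the names negate_pitched and run_pitched_demo are made up for illustration. The kernel gets the base pointer and the pitch in bytes and computes each row's start itself; on the host side, note that cudaMemcpy2D takes the host pitch (the row size of the static array) separately from the device pitch, and the copy width in bytes.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define ROWS 4
#define COLS 5  // assumed odd width, where padding the rows may help

// Rows are pitch bytes apart on the device, so the start of each row
// is computed through a char* before indexing the column.
__global__ void negate_pitched(float *base, size_t pitch, int rows, int cols)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (row < rows && col < cols) {
        float *row_ptr = (float *)((char *)base + row * pitch);
        row_ptr[col] = -row_ptr[col];
    }
}

// Hypothetical helper: returns 0 on success, 1 on a CUDA error or a wrong result.
int run_pitched_demo(void)
{
    float h[ROWS][COLS];
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            h[i][j] = (float)(i * COLS + j + 1);

    float *d = NULL;
    size_t dev_pitch = 0;                        // chosen by the runtime
    size_t host_pitch = COLS * sizeof(float);    // static array rows are unpadded
    size_t width_bytes = COLS * sizeof(float);   // bytes actually copied per row

    if (cudaMallocPitch((void **)&d, &dev_pitch, width_bytes, ROWS) != cudaSuccess)
        return 1;

    cudaError_t err = cudaMemcpy2D(d, dev_pitch, h, host_pitch,
                                   width_bytes, ROWS, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        dim3 block(COLS, ROWS);
        negate_pitched<<<1, block>>>(d, dev_pitch, ROWS, COLS);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)
        err = cudaMemcpy2D(h, host_pitch, d, dev_pitch,
                           width_bytes, ROWS, cudaMemcpyDeviceToHost);
    cudaFree(d);
    if (err != cudaSuccess)
        return 1;

    // Every element should have had its sign flipped.
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            if (h[i][j] != -(float)(i * COLS + j + 1))
                return 1;
    return 0;
}

int main(void)
{
    int rc = run_pitched_demo();
    printf("pitched demo %s\n", rc == 0 ? "passed" : "failed");
    return rc;
}
```

With a pitched allocation the [i][j] syntax is lost inside the kernel, which is the trade-off mentioned above.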
I am glad I found one more monkey admirer in this forum!