Errors while using shared memory: "expression must have constant value" in lines 16 and 17

ankush · August 18, 2009, 7:24am

Hi,

I am fairly new to CUDA. I have written the following program for matrix multiplication using shared memory. The algorithm is given here, on page 9:

David Kirk, Chapter 4

However, I am getting the following errors:

1. "expression must have constant value", on these lines:

__shared__ float as[tile_width][tile_width];

__shared__ float bs[tile_width][tile_width];

2. "expression must have integral or enum type", on these lines:

as[j][i] = a[Row*width+(m*tile_width+i)];

bs[j][i] = b[(m*tile_width+j)+width*Col];

c[Row*width+Col]=sum;

The complete program is given below. I would be thankful to you for any useful pointers in this regard.

Thanks,

Ankush

__global__ void matrixmul(int *a,int *b, int *c, int width,int tile_width)

{

__shared__ float as[tile_width][tile_width];

__shared__ float bs[tile_width][tile_width];

float k;

int i=threadIdx.x; int j=threadIdx.y;

int p=blockIdx.x; int q=blockIdx.y;

float Row= (q * tile_width) +j;

float Col= (p * tile_width) +i;

float sum=0;

for( float m = 0; m<width/tile_width;++m)

{

as[j][i] = a[Row*width+(m*tile_width+i)];

bs[j][i] = b[(m*tile_width+j)+width*Col];

for(float k=0; k< tile_width; ++k)

{

 sum + = as[j*width+k] * bs[k*width+i];

c[Row*width+Col]=sum;

}

}

}

int main()

{

int i,j,m,n,sum;

int Row,k,Col,tile_width,width;

int *a_h,*b_h,*a_d,*b_d,*c_h,*c_d;

const int N=1000;

size_t size= N*sizeof(int);

printf("enter the tile_width");

scanf("%d",&tile_width);

printf("enter the width");

scanf("%d",&width);

//Memory allocation on host and device, a_h, a_d 

a_h=(int*)malloc(size);

cudaMalloc((void**)&a_d,size);

//Memory allocation on host and device, b_h, b_d 

b_h=(int*)malloc(size);

cudaMalloc((void**)&b_d,size);

//Memory allocation on host and device, c_h, c_d 

c_h=(int*)malloc(size);

cudaMalloc((void**)&c_d,size);

//User inputs (row=columns)  

printf("enter the row & coloum of the 1st matrix m ");

scanf("%d%d",&Row,&k); //row = columns

printf("enter the element of 1st matrix m");

for(i=0;i<(Row*k);i++)

{

scanf("%d",&a_h[i]);

}

for(i=0;i<(Row*k);i++)

printf("\t%d",a_h[i]);

{

printf("\n");

}

//copying data (a_h) from Host to Device in a_d 

cudaMemcpy(a_d,a_h,size,cudaMemcpyHostToDevice);

printf("enter the row & coloum of 2nd matrix n");

scanf("%d%d",&k,&Col); //row = columns

printf("enter the element of 2nd matrix n");

for(j=0;j<(k*Col);j++)

{

scanf("%d",&b_h[j]);

}

for(j=0;j<(k*Col);j++)

printf("\t%d",b_h[j]);

{

printf("\n");

}

//copying data (b_h) from Host to Device in b_d 

cudaMemcpy(b_d,b_h,size,cudaMemcpyHostToDevice);

//Kernel call 

dim3 dimBlock(4,4,1);

dim3 dimGrid(1,1,1);

matrixmul<<<dimGrid, dimBlock>>>(a_d,b_d,c_d,width,tile_width);

//copying data (c_d) from Device to Host in c_h 

cudaMemcpy(c_h,c_d,size,cudaMemcpyDeviceToHost);

//printing the results 

for(i=0;i<(Row*Col);i++)

{

printf("%d",c_h[i]);

printf("\n");

}

getch();

free(a_h);

cudaFree(a_d);

free(b_h);

cudaFree(b_d);

free(c_h);

cudaFree(c_d);

}

SPWorley · August 18, 2009, 7:44am

You can’t dynamically allocate shared memory like this:

__shared__ float as[tile_width][tile_width];

Dynamic allocations happen at kernel invocation, and you’d access the memory like

extern __shared__ float data[];

Look in the programming guide about dynamic shared memory allocation.
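
A minimal sketch of what this reply describes, under some assumptions of mine: the matrices are float rather than int, width is a multiple of tile_width, and the names matrixmul_dyn, tiles and shmem are made up for illustration. The size of the extern __shared__ array comes from the third value inside <<< >>>, and the two tiles are carved out of that one buffer with 1-D indexing.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Tiled matrix multiply with the tile size chosen at run time.
// Assumes width % tile_width == 0 and blockDim == (tile_width, tile_width).
__global__ void matrixmul_dyn(const float *a, const float *b, float *c,
                              int width, int tile_width)
{
    extern __shared__ float tiles[];              // size set at kernel launch
    float *as = tiles;                            // first tile
    float *bs = tiles + tile_width * tile_width;  // second tile

    int tx = threadIdx.x, ty = threadIdx.y;
    int row = blockIdx.y * tile_width + ty;
    int col = blockIdx.x * tile_width + tx;

    float sum = 0.0f;
    for (int m = 0; m < width / tile_width; ++m) {
        as[ty * tile_width + tx] = a[row * width + (m * tile_width + tx)];
        bs[ty * tile_width + tx] = b[(m * tile_width + ty) * width + col];
        __syncthreads();                          // both tiles fully loaded
        for (int k = 0; k < tile_width; ++k)
            sum += as[ty * tile_width + k] * bs[k * tile_width + tx];
        __syncthreads();                          // done reading before next load
    }
    c[row * width + col] = sum;
}

int main()
{
    const int width = 8;
    int tile_width = 4;                           // could equally come from scanf
    const size_t bytes = width * width * sizeof(float);

    float a_h[width * width], b_h[width * width], c_h[width * width];
    for (int i = 0; i < width * width; ++i) { a_h[i] = 1.0f; b_h[i] = 2.0f; }

    float *a_d, *b_d, *c_d;
    cudaMalloc((void**)&a_d, bytes);
    cudaMalloc((void**)&b_d, bytes);
    cudaMalloc((void**)&c_d, bytes);
    cudaMemcpy(a_d, a_h, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(b_d, b_h, bytes, cudaMemcpyHostToDevice);

    dim3 block(tile_width, tile_width);
    dim3 grid(width / tile_width, width / tile_width);
    size_t shmem = 2 * tile_width * tile_width * sizeof(float);  // two tiles
    matrixmul_dyn<<<grid, block, shmem>>>(a_d, b_d, c_d, width, tile_width);

    cudaError_t err = cudaMemcpy(c_h, c_d, bytes, cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // With A all ones and B all twos, every element of C equals 2 * width.
    int ok = 1;
    for (int i = 0; i < width * width; ++i)
        if (c_h[i] != 2.0f * width) ok = 0;
    printf("%s\n", ok ? "ok" : "mismatch");

    cudaFree(a_d); cudaFree(b_d); cudaFree(c_d);
    return 0;
}
```

Only one unsized extern __shared__ array is used here, which is why both tiles share a single buffer and are addressed by offset.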

avidday · August 18, 2009, 7:44am

You can’t have dynamically sized arrays in C90. You also can’t allocate dynamic memory in CUDA kernels. You can read about how to allocate kernel shared memory at runtime in the “Execution Configuration” section of chapter 4 of the programming guide.

ankush · August 18, 2009, 7:55am

Sorry, I have just edited my post.

Ankush

ankush · August 18, 2009, 7:58am

Thanks, avidday, but I am running the programs in emulation mode.

avidday · August 18, 2009, 8:06am

Last time I checked, the syntax of the C language was the same in emulation mode as when running on a GPU, so that is sort of irrelevant to your problem. Emulation mode, as the name suggests, emulates running on a GPU. It has the same restrictions on how resources must be specified and requires the same syntax.

cudacuda2009 · August 20, 2009, 7:53am

You can’t dynamically allocate shared memory like this:
__shared__ float as[tile_width][tile_width];
Dynamic allocations happen at kernel invocation, and you’d access the memory like
extern __shared__ float data[];
Look in the programming guide about dynamic shared memory allocation.

SPWorley and avidday: Please explain how this expression dynamically allocates memory:

__shared__ float as[tile_width][tile_width];

I understand that dynamic memory allocation is done through the calloc and malloc functions in C and the cudaMalloc function in CUDA. In the above expression, the value of tile_width is passed to the array as[tile_width][tile_width], so I would expect it to become static, for example as[16][16].

avidday · August 20, 2009, 9:17am

The value of tile_width is not known to the compiler; it only exists at run time. In C90, you cannot declare an array whose dimensions are not integer constant expressions.
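
To illustrate the compile-time case, here is a rough sketch that assumes the tile size can be fixed when the program is built. TILE_WIDTH and matrixmul_static are hypothetical names, and float matrices are my assumption. Because TILE_WIDTH is a constant expression, the original two-dimensional __shared__ declarations compile as written. The sketch also keeps every index as an int: the second error in the original post ("expression must have integral or enum type") is what a C compiler reports when a float such as Row, Col, m or k is used as an array subscript.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define TILE_WIDTH 4   // known to the compiler, so it can size a static array

// Assumes width % TILE_WIDTH == 0 and blockDim == (TILE_WIDTH, TILE_WIDTH).
__global__ void matrixmul_static(const float *a, const float *b, float *c,
                                 int width)
{
    __shared__ float as[TILE_WIDTH][TILE_WIDTH];
    __shared__ float bs[TILE_WIDTH][TILE_WIDTH];

    int tx = threadIdx.x, ty = threadIdx.y;        // indices are int, not float
    int row = blockIdx.y * TILE_WIDTH + ty;
    int col = blockIdx.x * TILE_WIDTH + tx;

    float sum = 0.0f;
    for (int m = 0; m < width / TILE_WIDTH; ++m) {
        as[ty][tx] = a[row * width + (m * TILE_WIDTH + tx)];
        bs[ty][tx] = b[(m * TILE_WIDTH + ty) * width + col];
        __syncthreads();                           // tiles fully loaded
        for (int k = 0; k < TILE_WIDTH; ++k)
            sum += as[ty][k] * bs[k][tx];
        __syncthreads();                           // done reading this tile
    }
    c[row * width + col] = sum;                    // written once, after the loop
}

int main()
{
    const int width = 8;
    const size_t bytes = width * width * sizeof(float);

    float a_h[width * width], b_h[width * width], c_h[width * width];
    for (int i = 0; i < width * width; ++i) { a_h[i] = 1.0f; b_h[i] = 2.0f; }

    float *a_d, *b_d, *c_d;
    cudaMalloc((void**)&a_d, bytes);
    cudaMalloc((void**)&b_d, bytes);
    cudaMalloc((void**)&c_d, bytes);
    cudaMemcpy(a_d, a_h, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(b_d, b_h, bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE_WIDTH, TILE_WIDTH);
    dim3 grid(width / TILE_WIDTH, width / TILE_WIDTH);
    matrixmul_static<<<grid, block>>>(a_d, b_d, c_d, width);

    cudaError_t err = cudaMemcpy(c_h, c_d, bytes, cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // With A all ones and B all twos, every element of C equals 2 * width.
    int ok = 1;
    for (int i = 0; i < width * width; ++i)
        if (c_h[i] != 2.0f * width) ok = 0;
    printf("%s\n", ok ? "ok" : "mismatch");

    cudaFree(a_d); cudaFree(b_d); cudaFree(c_d);
    return 0;
}
```

The trade-off is that changing the tile size means recompiling, whereas a value read with scanf can only size shared memory through the launch configuration.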
