Several "extern __shared__" declarations in one kernel


I’m stuck on some code where I want to reserve space for two different shared-memory arrays at launch time.

For example, for function “test”, I’m searching for something like…

// Caller function

test<<<blocksPerGrid, threadsPerBlock, sharedMemorySpace>>>();

// Test function

__global__ static void test()
{
    extern __shared__ char v1[]; // Half sharedMemorySpace
    extern __shared__ int v2[]; // The other half sharedMemorySpace
}


Is there a way to do it?


This is from the programming guide:

When declaring a variable in shared memory as an external array such as

extern __shared__ float shared[];

the size of the array is determined at launch time (see Section B.12). All variables declared in this fashion, start at the same address in memory, so that the layout of the variables in the array must be explicitly managed through offsets. For example, if one wants the equivalent of

short array0[128];

float array1[64];

int array2[256];

in dynamically allocated shared memory, one could declare and initialize the arrays the following way:

extern __shared__ char array[];

__device__ void func() // __device__ or __global__ function
{
    short* array0 = (short*)array;
    float* array1 = (float*)&array0[128];
    int* array2 = (int*)&array1[64];
}

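What you have will compile, and it is discussed in the programming guide (Appendix B.2.3 in the CUDA 2.3 version of the guide). The caveat is that v1 and v2 both start at the same address in the same allocated block of memory. So the shared memory request at launch should be the combined size of v1 and v2 in bytes, and the second array needs an offset in its indexing that skips past v1. Since v2 is an int array, that offset has to be counted in ints (the byte length of v1 divided by sizeof(int)), and the byte length of v1 should be a multiple of sizeof(int) so that v2 stays aligned.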
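To make the offset bookkeeping concrete, here is a minimal sketch of one way to do it, not taken from the guide. It uses a single extern __shared__ block and carves v1 and v2 out of it with pointers. The sizes N_CHARS and N_INTS, the output buffer, and the host helper run_test are all made up for illustration; the char region is rounded up to a multiple of sizeof(int) on the assumption that v2 must be int-aligned.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical sizes, chosen only for this example.
constexpr int N_CHARS = 100;  // deliberately not a multiple of sizeof(int)
constexpr int N_INTS  = 64;

// Round the char region up to a multiple of sizeof(int) so v2 stays aligned.
constexpr size_t V1_BYTES =
    (N_CHARS + sizeof(int) - 1) / sizeof(int) * sizeof(int);
// Total dynamic shared memory to request at launch: v1 (padded) plus v2.
constexpr size_t SHARED_BYTES = V1_BYTES + N_INTS * sizeof(int);

__global__ void test(int *out)
{
    // One dynamic shared-memory block. Declared as int[] so its base
    // address is suitably aligned for int accesses.
    extern __shared__ int smem[];

    char *v1 = reinterpret_cast<char *>(smem);   // first V1_BYTES bytes
    int  *v2 = smem + V1_BYTES / sizeof(int);    // starts right after v1

    // Fill both arrays cooperatively.
    for (int i = threadIdx.x; i < N_CHARS; i += blockDim.x) v1[i] = 1;
    for (int i = threadIdx.x; i < N_INTS;  i += blockDim.x) v2[i] = i;
    __syncthreads();

    // Thread 0 sums both arrays to show that they do not overlap.
    if (threadIdx.x == 0) {
        int sum = 0;
        for (int i = 0; i < N_CHARS; ++i) sum += v1[i];
        for (int i = 0; i < N_INTS;  ++i) sum += v2[i];
        out[blockIdx.x] = sum;
    }
}

// Hypothetical host helper: launches the kernel and returns block 0's sum,
// or -1 if a CUDA call fails.
int run_test(int blocks, int threads)
{
    int *d_out = nullptr;
    if (cudaMalloc(&d_out, blocks * sizeof(int)) != cudaSuccess) return -1;

    // The third launch parameter is the combined size of both arrays.
    test<<<blocks, threads, SHARED_BYTES>>>(d_out);

    int h_out = -1;
    cudaError_t err =
        cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err == cudaSuccess ? h_out : -1;
}
```

With these sizes, a call like run_test(2, 32) should return 2116 (100 ones from v1 plus 0 + 1 + ... + 63 from v2) if the two regions are laid out correctly. Putting the int array first and the char array second would avoid the rounding step, at the cost of not matching the v1-then-v2 order in the question.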