I’m stuck on some code where I want to reserve space for two different shared memory arrays at launch time.
For example, for a kernel “test”, I’m looking for something like this:
// Caller function
test<<<blocksPerGrid, threadsPerBlock, sharedMemorySpace>>>();
// Test function
__global__ static void test()
{
    extern __shared__ char v1[]; // Half of sharedMemorySpace
    extern __shared__ int v2[];  // The other half of sharedMemorySpace
}
Is there a way to do it?
This is from the programming guide:
When declaring a variable in shared memory as an external array such as
extern __shared__ float shared[];
the size of the array is determined at launch time (see Section B.12). All variables
declared in this fashion, start at the same address in memory, so that the layout of
the variables in the array must be explicitly managed through offsets. For example, if
one wants the equivalent of
short array0[128];
float array1[64];
int array2[256];
in dynamically allocated shared memory, one could declare and initialize the arrays
the following way:
extern __shared__ char array[];
__device__ void func() // __device__ or __global__ function
{
    short* array0 = (short*)array;
    float* array1 = (float*)&array0[128];
    int* array2 = (int*)&array1[64];
}
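What you have will work, and it is discussed in the programming guide (Appendix B.2.3 in the CUDA 2.3 version of the guide). The only caveat is that both arrays start at the same address in a single allocated block of shared memory, so the shared memory request should be the combined size of v1 and v2, and the second array has to be addressed at an offset equal to the size of v1 in bytes (either by offsetting its index or by deriving its pointer from the end of v1). That offset also has to keep the second array aligned for its element type.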
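To make the offset handling concrete, here is a minimal sketch applied to the two arrays from the question. It is only one way to lay things out, under a few assumptions of mine: both arrays hold n elements, the block is declared as int so that its base is aligned for v2, and v2 is placed first so the char array can follow without any alignment concern. The names smem, n, out and run_split_demo are made up for this example.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// A single dynamic shared memory block; its size in bytes is the third
// launch parameter. Declared as int so the base is aligned for v2.
extern __shared__ int smem[];

// Hypothetical version of the kernel from the question:
// v2 holds n ints, v1 holds n chars, both carved out of smem.
__global__ void test(int n, int *out)
{
    int  *v2 = smem;             // first n * sizeof(int) bytes
    char *v1 = (char *)&v2[n];   // starts right after the last int of v2

    int i = threadIdx.x;
    if (i < n) {
        v1[i] = (char)i;
        v2[i] = 2 * i;
    }
    __syncthreads();

    // Read the neighbouring thread's entries to show that the two
    // arrays do not overlap: out[i] = 3 * ((i + 1) % n).
    if (i < n) {
        int j = (i + 1) % n;
        out[i] = v1[j] + v2[j];
    }
}

// Hypothetical host helper: launches one block of n threads and returns
// the sum of the results, or -1 if a CUDA call fails.
int run_split_demo(int n)
{
    assert(n > 0 && n <= 128);   // keeps the values stored in v1 within char range

    int *d_out = nullptr;
    if (cudaMalloc(&d_out, n * sizeof(int)) != cudaSuccess) return -1;

    // Request the combined size of both arrays.
    size_t sharedMemorySpace = n * sizeof(int) + n * sizeof(char);
    test<<<1, n, sharedMemorySpace>>>(n, d_out);

    int h_out[128];
    cudaError_t err = cudaMemcpy(h_out, d_out, n * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1;

    int sum = 0;
    for (int i = 0; i < n; ++i) sum += h_out[i];
    return sum;
}
```

Calling run_split_demo(64) from the host should return 6048, i.e. 3 * (0 + 1 + ... + 63). If you would rather keep the char array first, as in the question, round its byte size up to a multiple of sizeof(int) before placing v2 after it, and include that padding in sharedMemorySpace.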