Shared Memory Address Space

Hi,

Can anyone provide some sample code that places multiple types in a single shared memory variable?

The manual suggests creating a number of pointers offset at appropriate intervals, i.e. each pointing to the memory immediately following the region covered by the previously defined pointer.

I seem to be having a few problems grasping what appears to be a simple subject.

Here’s what I’ve got at the moment within my kernel:

extern __shared__ char sharedData[];

	char *sharedSequenceString = sharedData; //character array of length stringLength
	char *sharedResidue = &sharedSequenceString[stringLength]; //single character
	int *sharedPosition = (int*)&sharedResidue[1]; //single integer
	int *sharedIndexedSequence = &sharedPosition[1]; //integer array

Thank you in advance for your help

That can’t work, for two reasons. sharedPosition is one byte into sharedResidue, thus (probably) not aligned, so you cannot use it as an int pointer.

But even if you fixed that (e.g. by first casting to int * and then adding 1), it still would not work in general, because sharedResidue itself might not be sufficiently aligned.

A simple and portable way to avoid this is to declare things as the largest type (more precisely, the type that requires the largest alignment) first and only then cast to the smaller types.

Note that none of this is specific to GPU code; it applies to any portable C code as well. It even applies to normal x86 CPUs: getting it wrong will not cause crashes there, but it will still result in slower code.
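Applied to your snippet, a minimal sketch of that layout could look like this (indexedLength is a hypothetical name for the length of your integer array; it is not in your code):

extern __shared__ int sharedData[]; // declared as int, the type with the largest alignment here

	int *sharedPosition = sharedData; //single integer, aligned because sharedData is
	int *sharedIndexedSequence = &sharedPosition[1]; //integer array of length indexedLength
	char *sharedSequenceString = (char *)&sharedIndexedSequence[indexedLength]; //character array of length stringLength
	char *sharedResidue = &sharedSequenceString[stringLength]; //single character

Since all the int pointers are carved out before any char pointers, each of them stays at a multiple of sizeof(int) from the (aligned) start of sharedData.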

EDIT: to possibly save you some reading, here are a few hints:

if

char *a;
int b;

then

&a[b] == &b[a] == a + b

and

((int *)a)+b == (int *)(a + b * sizeof(int))
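Applied to the code in question, the second hint is the crucial one. A small sketch (the byte offsets assume a 4-byte int, which holds on current GPUs):

	char *a = sharedResidue; // suppose this sits at byte offset N within sharedData
	int *p = (int *)&a[1];   // byte offset N + 1: what the original code computes, usually misaligned
	int *q = ((int *)a) + 1; // byte offset N + sizeof(int), i.e. N + 4: aligned only if N is

So even the second form only helps when the starting offset N is itself a multiple of sizeof(int), which is exactly why the type with the largest alignment should come first.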

Cheers for your help. All the examples I found using shared memory only contained one type of variable, so alignment was never a concern. The explanation in the manual wasn’t particularly helpful. I knew something was out of alignment :D

Isn’t it also possible to use the align attribute?

Yes, but that is not portable (it only works with nvcc). Maybe that is not relevant to CUDA developers, but when it is the same effort I prefer a solution that works everywhere.
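For reference, the nvcc-specific variant would look something like the sketch below. Note that __align__ only affects the base address of the declaration itself; offsets carved out inside the array still have to be kept at multiples of sizeof(int) by hand, so it does not replace the ordering trick above.

	// nvcc-only: force the dynamic shared memory array to int alignment
	extern __shared__ __align__(4) char sharedData[]; // 4 == sizeof(int) on current GPUs

On other compilers you would need the compiler-specific equivalent (e.g. __attribute__((aligned(4))) for gcc), which is the portability problem mentioned above.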