Help on Error: Unaligned Memory access not supported

In my kernel, I have a code segment as follows:

extern __shared__ char array[];
//__global__ void mvBD(ValueType** A, ValueType* x, ValueType* y, IndexType matBlkSize, IndexType matNumBlk) //#2
__global__ void mvBD(ValueType** A, ValueType* x, ValueType* y, IndexType matNumBlk) //#3
const IndexType matBlkSize=3;
unsigned int matBlkIdx=blockDim.x*blockIdx.x+threadIdx.x;
unsigned int offset=matBlkIdx*matBlkSize, offset2=blockDim.x*blockIdx.x*matBlkSize;
//the following two lines are for allocating shared memory dynamically.
ValueType** As=(ValueType**)array;
ValueType* ys=(ValueType*)(array+matBlkSize*matBlkSize*sizeof(ValueType

ValueType xs[3];


ys[i]+=As[i*matBlkSize+j][matBlkIdx]*xs[j]; //#1

Line #1 gives me a compile error: Unaligned Memory Access not supported. But if I make matBlkSize a function parameter and pass its value in from outside (signature #2), the kernel runs correctly. I do not understand why these two approaches behave differently. Can anyone explain it to me? Thank you.

You’re on a 64-bit platform and IndexType is a typedef to a 32-bit type, right?

array is of type char, so there is no guarantee it will be aligned for larger types, including pointers. I believe data in shared memory should always be aligned, although I can’t find a clear confirmation of this in the Programming Guide.

Also, kernel parameters are passed through shared memory (this is the case on devices of compute capability 1.x).

So my guess is that adding one extra 32-bit parameter reestablishes alignment of shared memory to 64-bit, so the code works by accident.
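To make the arithmetic behind that guess concrete, here is a rough sketch. It assumes 8-byte pointers, a 4-byte IndexType, tightly packed parameters, and that the dynamic shared array starts right after the parameter block. None of that layout is confirmed by the Programming Guide, and isAligned and paramBlockBytes are hypothetical helpers made up for illustration.

```cuda
#include <cstddef>

// Hypothetical helper: true when `offset` is a multiple of `alignment`.
__host__ __device__ inline bool isAligned(size_t offset, size_t alignment)
{
    return offset % alignment == 0;
}

// Hypothetical helper: size of a tightly packed parameter block holding
// `numPointers` 8-byte pointers followed by `numIndex` 4-byte indices.
// ASSUMPTION: no padding; the real layout may differ.
__host__ __device__ inline size_t paramBlockBytes(size_t numPointers, size_t numIndex)
{
    return numPointers * 8 + numIndex * 4;
}

// Signature #3: A, x, y, matNumBlk             -> 3*8 + 1*4 = 28 bytes, 28 % 8 == 4
// Signature #2: A, x, y, matBlkSize, matNumBlk -> 3*8 + 2*4 = 32 bytes, 32 % 8 == 0
```

Under these assumptions, whatever follows the parameters of #3 sits 4 bytes off an 8-byte boundary, while #2 happens to land back on one.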

Solutions would be to either align the array pointer yourself with pointer arithmetic, or better to declare it as an array of some 64-bit/pointer type, then make sure you only cast it to a smaller or equal type.
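Here is a minimal sketch of both suggestions, not a drop-in replacement for the original kernel. The typedefs, the names alignUp, sharedPtrs, sharedBytesFor and mvBDAligned, the per-thread layout of ys, and the layout of x and y are all assumptions, since the post does not show them.

```cuda
#include <cstddef>
#include <cstdint>
#include <cuda_runtime.h>

typedef float ValueType;          // ASSUMPTION: the real typedef is not shown in the post
typedef unsigned int IndexType;   // ASSUMPTION: a 32-bit index type

const IndexType kMatBlkSize = 3;  // matBlkSize from the post

// Option 1 helper: round an address up to a multiple of `alignment`
// (`alignment` must be a power of two). With a char array it would be used as
//   char* base = (char*)alignUp((uintptr_t)array, sizeof(ValueType*));
// and the launch must then reserve up to alignment-1 extra bytes.
__host__ __device__ inline uintptr_t alignUp(uintptr_t addr, uintptr_t alignment)
{
    return (addr + alignment - 1) & ~(alignment - 1);
}

// Option 2: declare the dynamic shared buffer with the widest type stored in it,
// then only cast to types with smaller or equal alignment.
extern __shared__ ValueType* sharedPtrs[];

// Bytes of dynamic shared memory the kernel below needs per block:
// 9 pointers plus 3 values per thread.
inline size_t sharedBytesFor(unsigned int threadsPerBlock)
{
    return kMatBlkSize * kMatBlkSize * sizeof(ValueType*)
         + (size_t)threadsPerBlock * kMatBlkSize * sizeof(ValueType);
}

// Computes y_b = A_b * x_b for each 3x3 block b, where A[i*3+j][b] holds
// entry (i,j) of block b (layout inferred from the indexing on line #1).
__global__ void mvBDAligned(ValueType** A, const ValueType* x, ValueType* y,
                            IndexType matNumBlk)
{
    ValueType** As = sharedPtrs;  // pointer table at the aligned base
    ValueType*  ys = (ValueType*)(sharedPtrs + kMatBlkSize * kMatBlkSize);

    // Cooperatively copy the 9 pointers into shared memory.
    for (unsigned int k = threadIdx.x; k < kMatBlkSize * kMatBlkSize; k += blockDim.x)
        As[k] = A[k];
    __syncthreads();

    unsigned int matBlkIdx = blockDim.x * blockIdx.x + threadIdx.x;
    if (matBlkIdx >= matNumBlk) return;

    ValueType* myY = ys + threadIdx.x * kMatBlkSize;  // per-thread slice of ys
    for (IndexType i = 0; i < kMatBlkSize; ++i) {
        myY[i] = 0;
        for (IndexType j = 0; j < kMatBlkSize; ++j)
            myY[i] += As[i * kMatBlkSize + j][matBlkIdx] * x[matBlkIdx * kMatBlkSize + j];
        y[matBlkIdx * kMatBlkSize + i] = myY[i];
    }
}
```

A launch would look like mvBDAligned<<<blocks, threads, sharedBytesFor(threads)>>>(dA, dx, dy, matNumBlk), where dA, dx and dy are device pointers you have already allocated (hypothetical names). Because the pointer table comes first and ValueType is no wider than a pointer, every cast here goes from a stricter alignment to a looser one.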