Structure alignment bug related to calling built-in functions

I am seeing very strange behaviour in my kernel that seems to be triggered by calling a built-in function (in this case 'sin') from my code. I have narrowed it down to a very simple test case (just a hacked-up version of the VectorAdd sample). The kernel creates a local structure, writes a couple of values to it, then reads them back into the output data buffer:

typedef struct _MyStruct {
    float   a;
    float   b;
    float   c;
    float   d;
    float4  vecVal;
    float4  anotherVecVal;
} MyStruct;

__kernel void VectorAdd(__global const float* a, __global const float* b, __global float* c, int iNumElements)
{
    MyStruct structData;

    structData.vecVal = (0.0f);
    structData.d      = 0.5f;

    c[0] = structData.vecVal.x;
    c[1] = structData.vecVal.y;
    c[2] = structData.vecVal.z;
    c[3] = structData.vecVal.w;
}

This code works fine: the values written into the buffer are all zero. I can print the addresses of the various local variables (by returning them in the output buffer and inspecting them on the CPU) and they look fine (the float4 is aligned on a 16-byte boundary, which I'm assuming is important).

&structData.d=12
&structData.vecVal=16
&structData.anotherVecVal=32
&structData=0
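The expected layout can be reproduced on the host as a sanity check. What follows is only a sketch in plain C11, not the kernel code: float4_host and MyStructHost are hypothetical stand-ins that mimic OpenCL's 16-byte-aligned float4, and print_layout is an illustrative helper.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdio.h>

/* Hypothetical host-side stand-in for OpenCL's float4:
   four floats with a 16-byte alignment requirement. */
typedef struct {
    _Alignas(16) float s[4];
} float4_host;

/* Host-side mirror of MyStruct from the kernel. */
typedef struct {
    float a;
    float b;
    float c;
    float d;
    float4_host vecVal;
    float4_host anotherVecVal;
} MyStructHost;

/* Compile-time check: both vector members start on a 16-byte
   boundary relative to the start of the struct. */
static_assert(offsetof(MyStructHost, vecVal) % 16 == 0,
              "vecVal is not 16-byte aligned within the struct");
static_assert(offsetof(MyStructHost, anotherVecVal) % 16 == 0,
              "anotherVecVal is not 16-byte aligned within the struct");

/* Prints each member's offset from the struct base. */
void print_layout(void) {
    printf("d=%zu\n", offsetof(MyStructHost, d));
    printf("vecVal=%zu\n", offsetof(MyStructHost, vecVal));
    printf("anotherVecVal=%zu\n", offsetof(MyStructHost, anotherVecVal));
}
```

On a host with 4-byte floats the offsets should come out as 12, 16 and 32, which matches the addresses reported for the working kernel when the struct base is 0.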

However, if I add a simple sin call to my kernel, bad things happen:

    MyStruct structData;

    structData.vecVal = (0.0f);
    structData.d      = 0.5f;

    float sinResult = sin(3.141592653589f);

Random garbage ends up in structData.vecVal, and if I look at the addresses now, everything has been shifted by 56 bytes and the float4s are no longer 16-byte aligned:

&structData.d=68
&structData.vecVal=72
&structData.anotherVecVal=88
&structData=56
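The arithmetic behind "no longer aligned" can be checked directly: 56 is not a multiple of 16, so every member that sat on a 16-byte boundary relative to the struct base now sits 8 bytes past one. A small host-side sketch in plain C (is_aligned and misalignment are hypothetical helper names, not part of any OpenCL API):

```c
#include <assert.h>
#include <stdint.h>

/* Returns nonzero when addr is a multiple of alignment.
   alignment must be a power of two. */
int is_aligned(uintptr_t addr, uintptr_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr & (alignment - 1)) == 0;
}

/* Returns how many bytes addr lies past the previous
   alignment boundary (0 means addr is aligned). */
uintptr_t misalignment(uintptr_t addr, uintptr_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return addr & (alignment - 1);
}
```

With the reported values, misalignment(56, 16), misalignment(72, 16) and misalignment(88, 16) are all 8, whereas the addresses from the working kernel (0, 16, 32) all give 0.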

I'm guessing that using the built-ins adds some constants and such to local memory, and because these are not 16-byte aligned they cause trouble for other data in local memory? Or maybe something more complicated is happening that I'm not grasping. I am using the latest (non-beta) SDK and drivers on 32-bit Windows XP, with a Quadro Plex 5800.

Any ideas? Is there a bug database I can submit this kind of thing to? I've attached my test case (you should be able to unzip it into OpenCL\src\oclVectorAdd and run it).

Thanks

Gareth Morgan
AlignmentError.zip (4.13 KB)

Incidentally, I verified that this only happens with structures. If I remove MyStruct and replace it with six separate local variables, they are all correctly aligned.

And one more thing: adding the aligned attribute to the struct has no effect. Even if the struct is defined like this:

typedef struct _MyStruct
{
    float   a;
    float   b;
    float   c;
    float   d;
    float4  vecVal;
    float4  anotherVecVal;
} __attribute__ ((aligned(16))) MyStruct;

Then &structData is still 56.
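For comparison, this is what the attribute is expected to do on a host compiler. It is only a sketch, assuming GCC or Clang on a platform where float has 4-byte alignment. PlainFloats, AlignedFloats and align_up are hypothetical names, and nothing here shows what the OpenCL compiler actually does with the attribute.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Four floats with their natural alignment. */
typedef struct {
    float a, b, c, d;
} PlainFloats;

/* Same members, with the aligned attribute placed as in the post
   (GCC/Clang extension, applies to the struct type). */
typedef struct {
    float a, b, c, d;
} __attribute__((aligned(16))) AlignedFloats;

/* On a host compiler the attribute raises the type's alignment to 16. */
static_assert(_Alignof(AlignedFloats) == 16,
              "aligned(16) should raise the struct alignment to 16");

/* Rounds addr up to the next multiple of alignment.
   alignment must be a power of two. */
uintptr_t align_up(uintptr_t addr, uintptr_t alignment) {
    return (addr + alignment - 1) & ~(alignment - 1);
}
```

If the attribute were honoured for the struct base, a base that would otherwise fall at 56 would be expected to move to align_up(56, 16), which is 64.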