Structure of mixed types, Coalescing not possible?

Hi,

I’m having trouble getting coalesced memory access in my programs when I use structures of mixed types. For example:

[codebox]typedef struct __align__(8)
{
    float i;
    float w;
} NonMixedStruct;[/codebox]

gives me coalesced memory access while:

[codebox]typedef struct __align__(8)
{
    int i;
    float w;
} MixedStruct;[/codebox]

won’t.

Is this a known problem? I could not find any other posts about this issue when searching the forum. There is of course an easy workaround using __int_as_float and __float_as_int, but this isn’t really a beautiful solution.
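A minimal sketch of the __int_as_float / __float_as_int workaround mentioned above, assuming the pair is stored as a float2 and the integer is kept in the x component. The names pack_pair, unpack_index, bump_index and bump_index_roundtrip are made up for illustration; only float2, make_float2 and the two intrinsics come from CUDA. It is not a claim about what the original kernel looked like.

```cuda
#include <cuda_runtime.h>
#include <cstring>
#include <vector>

// All function names here are hypothetical; only float2, make_float2,
// __int_as_float and __float_as_int come from CUDA itself.

// Pack an (int, float) pair into a float2 without changing any bits.
__device__ inline float2 pack_pair(int i, float w)
{
    return make_float2(__int_as_float(i), w);
}

// Recover the integer half: the bits are reinterpreted, not converted.
__device__ inline int unpack_index(float2 p)
{
    return __float_as_int(p.x);
}

// One 8-byte element per thread; consecutive threads use consecutive elements.
__global__ void bump_index(const float2 *in, float2 *out, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n) {
        float2 p = in[tid];
        out[tid] = pack_pair(unpack_index(p) + 1, p.y);
    }
}

// Host-side check (assumes n > 0): true if every index came back incremented
// and every float came back unchanged.
bool bump_index_roundtrip(int n)
{
    std::vector<float2> h_in(n), h_out(n);
    for (int k = 0; k < n; ++k) {
        std::memcpy(&h_in[k].x, &k, sizeof(int));  // bit copy on the host
        h_in[k].y = 0.5f * k;
    }
    size_t bytes = static_cast<size_t>(n) * sizeof(float2);
    float2 *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_in, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) { cudaFree(d_in); return false; }
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);
    bump_index<<<(n + 127) / 128, 128>>>(d_in, d_out, n);
    cudaError_t err = cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return false;
    for (int k = 0; k < n; ++k) {
        int got;
        std::memcpy(&got, &h_out[k].x, sizeof(int));
        if (got != k + 1 || h_out[k].y != h_in[k].y) return false;
    }
    return true;
}
```

Calling bump_index_roundtrip(1000) from main should return true on a working device. The sketch only shows that the bits survive the round trip; whether the accesses are actually coalesced still has to be checked in the profiler.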

Anybody got a clue what’s going on?

Best regards,

Erik

Hi again,

I kind of solved the problem using an anonymous union:

[codebox]typedef struct __align__(8)
{
    union
    {
        float f;
        int i;
    };
    float w;
} UnionStruct;[/codebox]

Still not the beautiful solution I was looking for, and I’m still puzzled why this isn’t documented.

Cheers!

/Erik

I can’t imagine why it would make a difference: 4 bytes are 4 bytes, and how you interpret them should have no influence on the transfer (as you showed with the union). Maybe there’s something wrong with how you read the data in your kernel? Care to post the code?

Are you always accessing both components? Sometimes the compiler will optimize out one of the loads if you’re not using the result.

You can sometimes force the whole structure to be loaded using the “volatile” keyword.
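To illustrate the point about using both components, here is a rough sketch, assuming the MixedStruct from the first post with the __align__(8) qualifier. The kernel name sum_fields and the host helper sum_fields_ok are hypothetical. The kernel copies one whole element into a local variable and then consumes both fields, so the compiler has no unused half to drop.

```cuda
#include <cuda_runtime.h>
#include <vector>

typedef struct __align__(8)
{
    int   i;
    float w;
} MixedStruct;

// The layout this thread relies on: 8 bytes, 8-byte aligned.
static_assert(sizeof(MixedStruct) == 8, "MixedStruct should be 8 bytes");
static_assert(alignof(MixedStruct) == 8, "MixedStruct should be 8-byte aligned");

// Hypothetical kernel: reads one struct per thread and uses both members.
__global__ void sum_fields(const MixedStruct *in, float *out, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n) {
        MixedStruct m = in[tid];                   // copy the whole element first
        out[tid] = m.w + static_cast<float>(m.i);  // both fields are consumed
    }
}

// Host-side check (assumes n > 0 and small enough that the sums are exact).
bool sum_fields_ok(int n)
{
    std::vector<MixedStruct> h_in(n);
    std::vector<float> h_out(n);
    for (int k = 0; k < n; ++k) {
        h_in[k].i = k;
        h_in[k].w = 0.25f * k;
    }
    MixedStruct *d_in = nullptr;
    float *d_out = nullptr;
    if (cudaMalloc(&d_in, n * sizeof(MixedStruct)) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, n * sizeof(float)) != cudaSuccess) { cudaFree(d_in); return false; }
    cudaMemcpy(d_in, h_in.data(), n * sizeof(MixedStruct), cudaMemcpyHostToDevice);
    sum_fields<<<(n + 127) / 128, 128>>>(d_in, d_out, n);
    cudaError_t err = cudaMemcpy(h_out.data(), d_out, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return false;
    for (int k = 0; k < n; ++k) {
        if (h_out[k] != h_in[k].w + static_cast<float>(h_in[k].i)) return false;
    }
    return true;
}
```

If a kernel only ever touched m.w, the compiler would be free to load just those 4 bytes, which changes the access pattern compared to the version above.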

Hi again,

Sorry about the incredibly late reply. As I wrote earlier, I kind of solved the problem using my union workaround, and then I ditched the problem altogether in favor of another solution. Well, now I’m revisiting my old ideas and I still have this problem, but it manifests itself in a funny way.

This time I used the float2 datatype instead of my own union type. If I write ordinary floats to my float2, my memory access is coalesced.

In this case I can achieve coalesced global memory access:
[codebox]float2 det;
det.x = 0.0f;
det.y = 0.0f;
global_det_array[begin + tx] = det;[/codebox]

This also yields coalesced global memory access:
[codebox]float2 det;
det.x = 0.0f;
det.y = __int_as_float(12);
global_det_array[begin + tx] = det;[/codebox]

However, if I use an integer variable instead of ‘12’ in my __int_as_float(), my memory access is no longer coalesced!
[codebox]int i;
float2 det;
for (i = 0; i < 100; i++)
{
    det.x = 0.0f;
    det.y = __int_as_float(i);
    global_det_array[begin + tx] = det;
}[/codebox]

I hope you can understand my rough dummy code above. Simon and Big_Mac, I’m actually just using the structure to get coalesced writes to memory. In my current kernel I never read any results from this array. Basically I’m just storing data from this kernel so that another kernel can access it later.
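For anyone who wants to reproduce this, here is a small self-contained version of the loop case. It is only a sketch: the kernel name store_pairs, the helper store_pairs_ok and the row layout (each loop iteration writes to its own row of blockDim.x elements) are assumptions made so the result can be checked, and are not taken from the real kernel.

```cuda
#include <cuda_runtime.h>
#include <cstring>
#include <vector>

// Hypothetical repro: each block owns `iterations` rows of blockDim.x elements,
// so every store in the loop lands on a different, thread-contiguous row.
__global__ void store_pairs(float2 *global_det_array, int iterations)
{
    int tx = threadIdx.x;
    int begin = blockIdx.x * blockDim.x * iterations;
    for (int i = 0; i < iterations; ++i) {
        float2 det = make_float2(0.0f, __int_as_float(i));
        global_det_array[begin + i * blockDim.x + tx] = det;  // one float2 store in the source
    }
}

// Host-side check: the y component of every element must hold the bits of its
// loop counter, and the x component must be 0.0f.
bool store_pairs_ok(int blocks, int threads, int iterations)
{
    size_t count = static_cast<size_t>(blocks) * threads * iterations;
    std::vector<float2> h(count);
    float2 *d = nullptr;
    if (cudaMalloc(&d, count * sizeof(float2)) != cudaSuccess) return false;
    store_pairs<<<blocks, threads>>>(d, iterations);
    cudaError_t err = cudaMemcpy(h.data(), d, count * sizeof(float2),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d);
    if (err != cudaSuccess) return false;
    for (size_t idx = 0; idx < count; ++idx) {
        int expected = static_cast<int>((idx / threads) % iterations);
        int got;
        std::memcpy(&got, &h[idx].y, sizeof(int));
        if (got != expected || h[idx].x != 0.0f) return false;
    }
    return true;
}
```

Compiling this with nvcc -ptx and looking at the store inside the loop shows whether the compiler emitted a single st.global.v2.f32 or two separate st.global.f32 instructions. What it emits may depend on the toolkit version, so treat this as a way to investigate and not as an explanation.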

Any ideas?

/Erik