Question about coalesced float3 reads. Say I have a global array of float3s,
and in device code I access them component by component using:
float temp_x = g_array[tid].x;
float temp_y = g_array[tid].y;
float temp_z = g_array[tid].z;
Is this a coalesced read?
First, g_array[tid].x conceptually means that the complete float3 is loaded from global memory and then its x member is taken. In theory this would result in three reads of the whole float3 in your case, but the compiler will probably load it just once and reuse it.
Second, float3 loads do not coalesce, because the struct is 12 bytes long and so does not meet the size and alignment requirements, which expect each thread to read a 4-, 8-, or 16-byte word. (float2 at 8 bytes and float4 at 16 bytes do work.)
To fix this, either pad the data type to 16 bytes (for example by storing float4 and ignoring w) or split the components into separate arrays, one per component.
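A minimal sketch of both fixes might look like the following. The kernel names (scale_padded, scale_split), the scaling operation, and the parameters n and s are made up for illustration and are not from the original question; only g_array and tid are carried over.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Option 1 (padding): store each element as a float4 and leave w unused.
// Each thread then reads and writes one aligned 16-byte word.
__global__ void scale_padded(float4* g_array, int n, float s)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n) {
        float4 v = g_array[tid];   // single 16-byte load
        v.x *= s;
        v.y *= s;
        v.z *= s;                  // v.w is padding and is left untouched
        g_array[tid] = v;          // single 16-byte store
    }
}

// Option 2 (split arrays): one float array per component, so consecutive
// threads touch consecutive 4-byte words in each array.
__global__ void scale_split(float* g_x, float* g_y, float* g_z, int n, float s)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n) {
        g_x[tid] *= s;
        g_y[tid] *= s;
        g_z[tid] *= s;
    }
}

int main()
{
    const int n = 1 << 20;          // hypothetical element count
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    float4* d_padded = nullptr;
    float *d_x = nullptr, *d_y = nullptr, *d_z = nullptr;
    cudaMalloc(&d_padded, n * sizeof(float4));
    cudaMalloc(&d_x, n * sizeof(float));
    cudaMalloc(&d_y, n * sizeof(float));
    cudaMalloc(&d_z, n * sizeof(float));

    // Zero-fill so the kernels operate on defined data.
    cudaMemset(d_padded, 0, n * sizeof(float4));
    cudaMemset(d_x, 0, n * sizeof(float));
    cudaMemset(d_y, 0, n * sizeof(float));
    cudaMemset(d_z, 0, n * sizeof(float));

    scale_padded<<<blocks, threads>>>(d_padded, n, 2.0f);
    scale_split<<<blocks, threads>>>(d_x, d_y, d_z, n, 2.0f);
    cudaDeviceSynchronize();

    printf("status: %s\n", cudaGetErrorString(cudaGetLastError()));

    cudaFree(d_padded);
    cudaFree(d_x);
    cudaFree(d_y);
    cudaFree(d_z);
    return 0;
}
```

Padding costs a third more memory and bandwidth for the unused w component but keeps each element in one place; splitting wastes nothing but means managing three pointers. Which one is faster for a given kernel is worth measuring rather than assuming.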
Thanks! Really useful info.