I have a struct like this:
typedef struct __align__(16) {
    double x;
    double y;
    double z;
    double u;
    double v;
    double w;
    double h;
} vec_space;
Each thread accesses the whole struct like this:
unsigned int indx = blockIdx.x*blockDim.x+threadIdx.x; // 1 register
Y[indx].x = ((vec_space*)gpu_space)[indx].x;
Y[indx].y = ((vec_space*)gpu_space)[indx].y;
Y[indx].z = ((vec_space*)gpu_space)[indx].z;
Y[indx].u = ((vec_space*)gpu_space)[indx].u;
Y[indx].v = ((vec_space*)gpu_space)[indx].v;
Y[indx].w = ((vec_space*)gpu_space)[indx].w;
double h = ((vec_space*)gpu_space)[indx].h;
where Y is an array of structs in shared memory, declared like this:
typedef struct __align__(16) {
    double x;
    double y;
    double z;
    double u;
    double v;
    double w;
} shspace;
__shared__ shspace Y[8*6*Thread_Block_size];
No two threads access the same element.
Are these reads coalesced? (The shared-memory accesses are doubles, so I am aware of the 2-way bank conflicts, but I have 100+ flops per thread, so they won't matter much.) Or can anyone help me come up with a better way to do this?
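To make the question concrete, here is a small host-side sketch (plain C++, which should also compile as host code under nvcc) that counts how many memory segments one warp touches when every thread reads a single double field. It is only an illustration under assumptions: vec_space_host and segments_touched are names I made up, alignas(16) is assumed to pad the struct the same way __align__(16) does (7 doubles = 56 bytes, rounded up to 64), and the 128-byte segment size I plug in below is my guess at the transaction granularity, not something I have measured.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Host-side stand-in for vec_space. Assumption: alignas(16) pads the
// 7 doubles (56 bytes) up to 64 bytes, as __align__(16) would on the device.
struct alignas(16) vec_space_host {
    double x, y, z, u, v, w, h;
};
static_assert(sizeof(vec_space_host) == 64, "expected 56 bytes padded to 64");

// Hypothetical helper: number of distinct segments of segment_bytes touched
// when `threads` consecutive threads each read one double located at
// base_offset + t * stride_bytes (t = 0 .. threads-1).
std::size_t segments_touched(std::size_t threads, std::size_t stride_bytes,
                             std::size_t base_offset, std::size_t segment_bytes) {
    assert(segment_bytes > 0);
    std::set<std::size_t> segments;
    for (std::size_t t = 0; t < threads; ++t) {
        std::size_t first = base_offset + t * stride_bytes;  // first byte read
        std::size_t last = first + sizeof(double) - 1;       // last byte read
        segments.insert(first / segment_bytes);
        segments.insert(last / segment_bytes);
    }
    return segments.size();
}
```

With 32 threads and 128-byte segments, segments_touched(32, sizeof(vec_space_host), 0, 128) gives 16 for the array-of-structs layout above, while the same call with a stride of sizeof(double), i.e. one separate array per field, gives 2. That gap is what makes me suspect the per-field reads are not coalesced.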
Thanks for all the help :)