Hi,
I have two questions:
- Does coalesced global memory access require a __syncthreads() call right before the access?
- Consider a kernel that takes two arrays as input: float *a, float *b. If the kernel simply does
a[threadIdx.x] = a[threadIdx.x] > b[threadIdx.x] ? 1.0f : 0.0f;
will the reads of a and b and the write to a be coalesced?
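
For concreteness, here is a minimal, self-contained sketch of the case I have in mind. The kernel name compare, the launch with a single block of 256 threads, and the host-side setup are assumptions I made up for illustration; only the one-line body comes from the question above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel name. Each thread handles exactly one element,
// so thread k of the block reads a[k] and b[k] and writes a[k].
__global__ void compare(float *a, const float *b)
{
    int i = threadIdx.x;  // consecutive threads -> consecutive 4-byte floats
    a[i] = a[i] > b[i] ? 1.0f : 0.0f;
}

int main()
{
    const int n = 256;  // assumed size: one block of n threads
    float ha[n], hb[n];
    for (int i = 0; i < n; ++i) {
        ha[i] = (float)i;   // 0, 1, 2, ...
        hb[i] = 128.0f;     // constant threshold
    }

    float *da = NULL, *db = NULL;
    cudaMalloc(&da, n * sizeof(float));
    cudaMalloc(&db, n * sizeof(float));
    cudaMemcpy(da, ha, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, n * sizeof(float), cudaMemcpyHostToDevice);

    // Note: no __syncthreads() anywhere in the kernel.
    compare<<<1, n>>>(da, db);

    // The blocking copy back also waits for the kernel to finish.
    cudaMemcpy(ha, da, n * sizeof(float), cudaMemcpyDeviceToHost);
    printf("a[0] = %.1f, a[%d] = %.1f\n", ha[0], n - 1, ha[n - 1]);

    cudaFree(da);
    cudaFree(db);
    return 0;
}
```

Both pointers come straight from cudaMalloc and are indexed only by threadIdx.x, so I would expect the base addresses to be aligned and neighbouring threads to touch neighbouring floats, but I am not sure whether that is sufficient.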