Are these access patterns coalesced?

742820157 · March 6, 2020, 5:00pm

I read this note: “When a warp executes an instruction that accesses global memory, it coalesces the memory accesses of the threads within the warp into one or more of these memory transactions.”
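To make the quoted rule concrete, here is a minimal host-side sketch that counts how many memory segments one warp-wide load of doubles would touch. It is plain C++ (it also compiles as CUDA host code) and only models the address arithmetic. The 32-thread warp, the 32-byte segment size, the segment-aligned array, and the helper name `sectors_touched` are all assumptions made for illustration; real hardware granularity depends on the architecture and cache configuration.

```cpp
#include <cstddef>
#include <set>

// Assumptions for illustration only (not taken from the quoted note):
constexpr int kWarpSize = 32;                        // threads per warp
constexpr std::size_t kSectorBytes = 32;             // assumed transaction granularity
constexpr std::size_t kElemBytes = sizeof(double);   // 8-byte elements

// Hypothetical helper: counts the distinct segments touched when lane t of a
// warp loads element (stride * t + offset) of a segment-aligned double array.
int sectors_touched(std::size_t stride, std::size_t offset) {
    std::set<std::size_t> sectors;
    for (int lane = 0; lane < kWarpSize; ++lane) {
        std::size_t index = stride * static_cast<std::size_t>(lane) + offset;
        std::size_t first_byte = index * kElemBytes;
        std::size_t last_byte = first_byte + kElemBytes - 1;
        // An element can straddle two segments, so record both ends.
        sectors.insert(first_byte / kSectorBytes);
        sectors.insert(last_byte / kSectorBytes);
    }
    return static_cast<int>(sectors.size());
}
```

Under these assumptions, `sectors_touched(1, 0)` returns 8, the minimum for 32 doubles. `sectors_touched(1, 10)` returns 9, because a constant offset only shifts the start off a segment boundary. `sectors_touched(3, 0)` returns 24, because each lane skips two elements and the warp spans three times as much memory.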

But I have some questions.
1.

__global__ void add(double *a, double *b) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    i = 3 * i;  // each thread reads three consecutive elements
    b[i] = a[i] + a[i + 1] + a[i + 2];
}

Can the three accesses (a[i], a[i + 1], a[i + 2]) be executed with only one instruction? (I mean, is this coalesced access?)
Or does coalescing exist only across the different threads of a warp (transversely), and not within a single thread?

2.

__global__ void add(double *a, double *b) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    b[i] = a[i] + a[i + 10] + a[i + 12];  // assuming no out-of-bounds index
}

This may be non-coalesced access,
so I changed the code to:

__global__ void add(double *a, double *b) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    __shared__ double shareM[3 * BLOCK_SIZE];  // BLOCK_SIZE: compile-time constant equal to the block size
    int s = 3 * threadIdx.x;  // three private slots per thread, so threads do not overwrite each other
    shareM[s] = a[i];
    shareM[s + 1] = a[i + 10];
    shareM[s + 2] = a[i + 12];
    b[i] = shareM[s] + shareM[s + 1] + shareM[s + 2];
}

I write the data from global memory to shared memory, then read it back out. Can this avoid the non-coalesced access and improve performance?

Thank you very much.
