Coalescing of local arrays


tnipen, June 10, 2009, 7:55pm

The CUDA programming guide (2.1) states:
“Local memory accesses are always coalesced though since they are per-thread by definition.”

Is this true even when a local variable is an array (i.e., one array per thread)? What if the index into this local array is a runtime variable, so that the compiler cannot know a priori how to interleave the array to ensure coalescing?

Thanks in advance for any clarification.
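
To make the scenario concrete, here is a minimal host-side sketch in plain C++ (no GPU required). It rests on an assumption that is not stated in the quoted 2.1 guide: that per-thread local arrays are interleaved so that word `idx` of thread `tid` sits at word offset `idx * num_threads + tid`. The names `local_word_offset` and `segments_touched`, and the 32-word segment size used in the usage note, are hypothetical and chosen only for illustration. The sketch counts how many aligned segments a warp would touch under that assumed layout; it is a way to reason about the question, not a statement of what the hardware or compiler actually does.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// ASSUMED layout (hypothetical model, not taken from the 2.1 guide):
// word `idx` of thread `tid`'s private array is stored at this word offset,
// i.e. the per-thread arrays are interleaved across threads.
std::size_t local_word_offset(std::size_t idx, std::size_t tid,
                              std::size_t num_threads) {
    return idx * num_threads + tid;
}

// Count the distinct aligned segments (each `segment_words` words long)
// touched when thread t reads element indices[t] of its own array.
// One segment models a fully coalesced access; more segments model
// additional memory transactions.
std::size_t segments_touched(const std::vector<std::size_t>& indices,
                             std::size_t num_threads,
                             std::size_t segment_words) {
    std::set<std::size_t> segments;
    for (std::size_t tid = 0; tid < indices.size(); ++tid) {
        std::size_t offset = local_word_offset(indices[tid], tid, num_threads);
        segments.insert(offset / segment_words);
    }
    return segments.size();
}
```

Under this model, with 32 threads and 32-word segments, a warp in which every thread uses the same index touches a single segment, while a warp in which thread `t` uses index `t` touches 32 different segments. If the assumed layout holds, coalescing would therefore depend on the threads of a warp agreeing on the index at run time, not on the compiler knowing the index in advance.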
