256B aligned address in global memory?

gepolv · April 19, 2011, 4:23pm

Hi, all,
From the NVIDIA CUDA Programming Guide 3.0, section 5.3.2.1.1:

“Any address of a variable residing in global memory or returned by one of the memory allocation routines from the driver or runtime API is always aligned to at least 256 bytes.”

It basically says that addresses in global memory are aligned to 256B, while the memory coalescing rules say “The size of a memory transaction can be: 32B, 64B or 128B”. So if a memory access is coalesced into a 64B transaction, how can it access a 256B-aligned address?

For example:
in global memory:

64B | 64B | 64B | 64B ------------->256B aligned address
Now suppose I have a coalesced memory access to the 2nd 64B segment above. Since the 256B-aligned address in global memory sits at the 1st 64B segment, it seems there is no way to access the 2nd 64B segment.

Would anyone like to clear my doubts?

Thanks
Deryk

avidday · April 19, 2011, 4:32pm

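The first statement means that the whole allocation (i.e. its starting address) is aligned to a 256-byte boundary. That doesn’t contradict the coalescing rules at all: 256 is a multiple of 32, 64 and 128, so every 32B, 64B or 128B segment inside the allocation still starts on a boundary of its own size.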
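
To make the arithmetic concrete, here is a minimal host-side sketch in plain C++. It does not call the CUDA API. The names `kAllocAlign`, `is_aligned` and `segment_address` are hypothetical helpers invented for this illustration, and the only assumption taken from the guide is that an allocation's base address is a multiple of 256.

```cpp
#include <cassert>
#include <cstdint>

// Alignment of an allocation's starting address, per the quoted guide text.
constexpr std::uint64_t kAllocAlign = 256;

// True when addr is a multiple of size (size must be nonzero).
constexpr bool is_aligned(std::uint64_t addr, std::uint64_t size) {
    return addr % size == 0;
}

// Start address of the k-th segment of seg_bytes bytes inside an allocation
// whose base address is 256B aligned (k counts from 0).
inline std::uint64_t segment_address(std::uint64_t base,
                                     std::uint64_t seg_bytes,
                                     std::uint64_t k) {
    assert(is_aligned(base, kAllocAlign));  // precondition assumed from the guide
    return base + k * seg_bytes;
}
```

With a made-up base address of 0x1000, `segment_address(0x1000, 64, 1)` gives 0x1040. That address is 64B aligned but not 256B aligned, which is all a 64B transaction needs: only the start of the allocation has to sit on a 256B boundary.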
