CUDA Fortran - Align attribute for allocatable arrays


I am looking for a feature in CUDA Fortran, similar to the one in CUDA C, for aligning data structures so that accesses comply with GPU global memory coalescing. As an example, quoted from the CUDA C Programming Guide, we have:

struct __align__(16) {
    float x;
    float y;
    float z;
};

As per the documentation, any access to data residing in global memory compiles to a single global memory instruction if and only if the size of the data type is 1, 2, 4, 8, or 16 bytes and the data is naturally aligned (i.e. its address is a multiple of that size).

I have a 3D data structure similar to the one above. How do I align it in CUDA Fortran to satisfy these criteria? Does the align attribute of the "allocate" API serve this purpose?

Hi Pradeep Rao,

Fortran user-defined types and C structs will naturally align on 16 bytes when their size is 16 bytes or larger. So the solution here is to add a fourth float (or REAL, on the Fortran side) for padding.
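
To make the padding idea concrete, here is a rough host-side sketch in plain C11. It is only an illustration: the names Vec3, Vec3Padded, single_access_ok and demo are made up for this post, and alignas(16) is used as a standard-C stand-in for CUDA's __align__(16). It checks each layout against the size/alignment rule quoted from the programming guide. The Fortran analogue would be a derived type with a fourth REAL component.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical names, for illustration only. */

/* Three floats: 12 bytes, which is not one of 1, 2, 4, 8, 16. */
typedef struct {
    float x, y, z;
} Vec3;

/* Same data plus an unused fourth float, forced to 16-byte alignment.
   alignas(16) is the C11 stand-in for CUDA's __align__(16). */
typedef struct {
    alignas(16) float x;
    float y, z;
    float pad;              /* padding only, never read */
} Vec3Padded;

/* Rule quoted from the guide: size must be 1, 2, 4, 8 or 16 bytes and
   the type must be naturally aligned (alignment a multiple of size). */
static int single_access_ok(size_t size, size_t align) {
    int size_ok = (size == 1 || size == 2 || size == 4 ||
                   size == 8 || size == 16);
    return size_ok && (align % size == 0);
}

/* Prints both layouts and checks that the padded one meets the rule. */
void demo(void) {
    printf("Vec3:       size=%zu align=%zu ok=%d\n",
           sizeof(Vec3), alignof(Vec3),
           single_access_ok(sizeof(Vec3), alignof(Vec3)));
    printf("Vec3Padded: size=%zu align=%zu ok=%d\n",
           sizeof(Vec3Padded), alignof(Vec3Padded),
           single_access_ok(sizeof(Vec3Padded), alignof(Vec3Padded)));
    assert(single_access_ok(sizeof(Vec3Padded), alignof(Vec3Padded)));
}
```

Calling demo() from a main() prints the two layouts side by side. The cost of the extra float is 4 unused bytes per element, which is the usual trade-off for getting each element into a single 16-byte access.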

  • Mat