After reading the Driver API section (3.3) of the Programming Interface chapter of the CUDA C Programming Guide, kernel execution is still not very clear to me. I have two questions, both about the alignment requirements.
In the sample code:
#define ALIGN_UP(offset, alignment) \
    (offset) = ((offset) + (alignment) - 1) & ~((alignment) - 1)
I really don’t understand what this macro definition is intended to do. I tried a few pairs of offset and alignment, and the macro seemed to leave the offset unchanged. In my opinion, it should be enough to use “offset += sizeof(*)” after specifying each parameter.
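To make the question concrete, here is a minimal host-side C sketch of the kind of check I mean. The function name align_up is my own (it is not from the guide); it applies the same expression as the macro and assumes the alignment is a power of two. As far as I can tell, the value only changes when the offset is not already a multiple of the alignment, which may be why my test pairs showed no effect.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the guide): same expression as ALIGN_UP,
 * but returning the result instead of assigning it back to offset.
 * Assumes alignment is a nonzero power of two. */
static size_t align_up(size_t offset, size_t alignment)
{
    /* A power of two has exactly one bit set, so x & (x - 1) is 0. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    /* Add alignment - 1, then clear the low bits to round up
     * to the next multiple of alignment. */
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

For example, align_up(4, 8) gives 8 and align_up(1, 4) gives 4, while align_up(8, 8) stays at 8 and align_up(4, 4) stays at 4.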
In Table B-1, the alignments of char3, int3 and float3 are the same as those of char, int and float. That seems strange, because for the other vector types the alignment of intX is X times that of int. Is the alignment always the same as that of the base type when X is odd? And can anyone tell me how intX is structured?