What is the intended use of the built-in vector types, such as int2, float4, etc.? Why are they useful for memory coalescing, and when are they preferable to a standard C array?
This is an interesting question that I hope gets a full airing and explanation. float2 is often used for cuFloatComplex, but does it really gain the user anything?
MMB
The built-in types are treated like any other user-defined struct (AFAIK). They have a defined alignment, which might improve load/store performance because 64-bit or 128-bit load/store instructions can be used. For example, float2 is declared like this:
struct __builtin_align__(8) float2
{
    float x, y;
};
When you say they have a defined alignment, what exactly does that mean? Does it mean the compiler places them so that, say, a 32-bit float never straddles two 32-bit words in memory and ends up requiring a 64-bit read? And is this enforced by the compiler no matter what? If that’s what it means, then I can see why it would be useful.
Exactly that - a float2 will always start at a memory address that is divisible by 8 (in the code sample given). You have to do similar things for variables you wish to crunch with SSE. Aligned layouts like this are helpful for memory controllers, since an aligned element can be fetched in a single access instead of being split across two.
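To make the "divisible by 8" point concrete, here is a minimal host-side C++ sketch. It is not the CUDA header itself: Float2Aligned, Float2Plain, is_aligned and all_elements_aligned are made-up names for illustration, and alignas(8) is used as a stand-in for __builtin_align__(8). It also assumes float is 4 bytes, which holds on the usual platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for CUDA's float2: alignas(8) plays the role
// of __builtin_align__(8) in the declaration quoted above.
struct alignas(8) Float2Aligned {
    float x, y;
};

// Same members without the alignment request: only the natural
// alignment of float applies.
struct Float2Plain {
    float x, y;
};

// True when the address p is a multiple of `alignment` bytes.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Every element of an array of aligned pairs starts on an 8-byte
// boundary, because the size (8) is a multiple of the alignment (8).
inline bool all_elements_aligned() {
    Float2Aligned data[4] = {};
    for (const Float2Aligned& v : data) {
        if (!is_aligned(&v, 8)) return false;
    }
    return true;
}

// Compile-time checks: the compiler enforces these, not the programmer.
static_assert(alignof(Float2Aligned) == 8, "pair must be 8-byte aligned");
static_assert(sizeof(Float2Aligned) == 8, "assumes 4-byte float");
```

The static_assert lines are the "no matter what" part of the answer: if the compiler could not guarantee the alignment, the code would not build. Float2Plain is there only for comparison; on typical platforms alignof(Float2Plain) is 4, so an array of them may start at an address that is not a multiple of 8.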