Register usage for vector types

Hello,

Do I understand the register consumption correctly for the following declarations inside kernels:

ushort2 var1;            consumes ONE  32-bit register?

unsigned short var2[2];  consumes TWO  32-bit registers?

If the above definitions were in shared/global memory, either of them would consume a total of 32 bits of memory…

Thank you

You might be looking at a compiler optimization in the first case. If you don't use all the elements of a vector type, the compiler can and will optimize the unused elements away. This can lead to surprising side effects, including confusing register counts and uncoalesced memory access.
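If you want to check this yourself rather than guess, here is a minimal sketch. The kernel names (sum_vec, sum_arr, first_only) and their parameters are made up for illustration and are not from the original post. The static_asserts confirm the memory footprint mentioned in the question; the register counts have to be read from the compiler output, since they depend on the toolkit version and target architecture.

```cuda
#include <cuda_runtime.h>

// Both forms occupy 4 bytes (32 bits) when stored in shared or global memory.
static_assert(sizeof(ushort2) == 4, "ushort2 holds two 16-bit values");
static_assert(sizeof(unsigned short[2]) == 4, "array holds two 16-bit values");

// Hypothetical kernel: uses both components of a ushort2.
__global__ void sum_vec(const ushort2 *in, unsigned int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        ushort2 v = in[i];  // 4-byte aligned, so this may become one 32-bit load
        out[i] = (unsigned int)v.x + v.y;
    }
}

// Hypothetical kernel: same computation with a local two-element array.
__global__ void sum_arr(const unsigned short *in, unsigned int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        unsigned short a[2];
        a[0] = in[2 * i];
        a[1] = in[2 * i + 1];
        out[i] = (unsigned int)a[0] + a[1];
    }
}

// Hypothetical kernel: only .x is used, so the compiler is free to drop .y.
__global__ void first_only(const ushort2 *in, unsigned int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = in[i].x;
    }
}
```

Saving this as, say, regs.cu and compiling with `nvcc -c -Xptxas -v regs.cu` makes ptxas print the register count for each kernel, so the three cases can be compared directly. I am deliberately not quoting numbers here, because they vary between compiler versions and GPU architectures.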