You cannot use indexing with registers, nor padding. Turn off the padding and don’t use an array of floats. As an alternative you can keep the array and place your struct in shared memory, i.e. __shared__ my_vec vec0;.
Whatever you do, don’t use padding at this memory level; this is not SSE code.
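A rough sketch of the shared-memory variant, in case it helps. The layout of my_vec (three floats), the kernel name, the THREADS constant and the host helper are all assumptions made up for illustration, since the original struct was not posted. It also uses one struct per thread rather than a single vec0, so that threads do not overwrite each other.

```cuda
#include <cuda_runtime.h>

// Hypothetical layout: the real my_vec was not posted, so a plain
// array of three floats is assumed here.
struct my_vec {
    float v[3];
};

#define THREADS 64  // assumed block size for this sketch

__global__ void scale_kernel(const float *in, float *out, int idx, float s)
{
    // One struct per thread, kept in shared memory. Unlike registers,
    // shared memory is addressable, so run-time indexing is possible.
    __shared__ my_vec vec0[THREADS];

    my_vec &mine = vec0[threadIdx.x];
    for (int i = 0; i < 3; ++i)
        mine.v[i] = in[threadIdx.x * 3 + i];

    // idx is only known at run time; each thread reads its own slot,
    // so no __syncthreads() is needed in this particular sketch.
    out[threadIdx.x] = s * mine.v[idx];
}

// Host helper (hypothetical): fills in[i] = i, runs one block and
// returns the result of thread 1, i.e. 2 * in[3 + idx].
float run_shared_demo(int idx)
{
    float h_in[THREADS * 3], h_out[THREADS];
    for (int i = 0; i < THREADS * 3; ++i)
        h_in[i] = (float)i;

    float *d_in = 0, *d_out = 0;
    cudaMalloc((void **)&d_in, sizeof(h_in));
    cudaMalloc((void **)&d_out, sizeof(h_out));
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);

    scale_kernel<<<1, THREADS>>>(d_in, d_out, idx, 2.0f);

    // The blocking copy also waits for the kernel to finish.
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out[1];
}
```

Shared memory is a limited per-block resource, so this only makes sense for small structs and moderate block sizes.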
Hold on, what?!?! Does this mean that I practically can’t use short arrays at all? How on earth is one supposed to program on this??? Does everything have to be typed out manually every time? Shouldn’t this kind of thing be something the compiler has to worry about? And shouldn’t this be one of the reasons the loop-unrolling pragma exists?
I mean, this seems to be exactly the kind of thing that programming languages were created for, no? To abstract away these stupidities. I guess in this sense the real culprit is C being too low-level, but I think this particular problem could be handled by the compiler…
Hmm, the Programming Guide (v 1.1) says the following:
"An automatic variable declared in device code without any of these qualifiers generally resides in a register. However in some cases the compiler might choose to place it in local memory. This is often the case for large structures or arrays that would consume too much register space, and arrays for which the compiler cannot determine that they are indexed with constant quantities."
This would seem to suggest that one can use arrays, as long as the compiler can work out that every index is a constant, so I guess only the disassembly will tell the final truth… :)
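Here is a small sketch of what that reading of the guide would look like in practice. The kernel name, the array size of 4 and the host helper are made up for illustration. Whether the array really ends up in registers depends on the compiler version, so treat this as something to test, not as a guarantee.

```cuda
#include <cuda_runtime.h>

__global__ void sum4_kernel(const float *in, float *out)
{
    // Automatic array with no qualifier: a candidate for registers.
    float a[4];

    // Fixed trip count, so once the loop is unrolled every a[i]
    // is indexed with a compile-time constant.
    #pragma unroll
    for (int i = 0; i < 4; ++i)
        a[i] = in[threadIdx.x * 4 + i];

    float s = 0.0f;
    #pragma unroll
    for (int i = 0; i < 4; ++i)
        s += a[i];

    out[threadIdx.x] = s;
}

// Host helper (hypothetical): fills in[i] = i + 1, runs one block of
// 32 threads and returns the sum computed by thread 0 (1 + 2 + 3 + 4).
float run_sum4_demo()
{
    const int threads = 32;
    float h_in[threads * 4], h_out[threads];
    for (int i = 0; i < threads * 4; ++i)
        h_in[i] = (float)(i + 1);

    float *d_in = 0, *d_out = 0;
    cudaMalloc((void **)&d_in, sizeof(h_in));
    cudaMalloc((void **)&d_out, sizeof(h_out));
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);

    sum4_kernel<<<1, threads>>>(d_in, d_out);

    // The blocking copy also waits for the kernel to finish.
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out[0];
}
```

To check what the compiler actually did, one option is to build with nvcc -ptx and look for .local in the generated PTX; if there is none for this kernel, the array did not go to local memory.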