After going through a bunch of material on CUDA memory types and their usage, I became even more confused.
Where are local variables stored? I’m wondering whether this is correct:
if registers are available
    store in registers
else if L1 cache is available
    store in L1 cache
else
    store in global memory
And how should I decide how to manage local variables? Is shared memory the first choice?
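
To make the question concrete, here is a minimal CUDA sketch of the three cases being asked about. The kernel name `local_var_demo`, the helper `run_local_var_demo`, the block size `kBlock`, and the array sizes are all hypothetical and chosen only for illustration. The comments describe what the compiler typically does, not a guarantee: scalar locals usually live in registers, per-thread arrays indexed at run time (or spilled registers) usually go to "local memory", which resides in device memory and is cached, and shared memory is used only when a variable is explicitly declared `__shared__`.

```cuda
#include <vector>
#include <cuda_runtime.h>

constexpr int kBlock = 128;  // hypothetical block size for this sketch

// Hypothetical kernel; must be launched with blockDim.x == kBlock.
__global__ void local_var_demo(const float* in, float* out, int n)
{
    // Shared memory: one copy per block, and only because of the explicit
    // __shared__ qualifier. Ordinary locals are never placed here.
    __shared__ float tile[kBlock];

    int i = blockIdx.x * blockDim.x + threadIdx.x;

    // Scalar local: normally kept in a register.
    float x = (i < n) ? in[i] : 0.0f;

    // Per-thread array read with a run-time index: the compiler will often
    // place it in local memory, since registers cannot be indexed dynamically.
    float scratch[8];
    for (int k = 0; k < 8; ++k) scratch[k] = x * k;
    float y = scratch[i & 7];

    // Exchange values between threads of the same block via shared memory.
    tile[threadIdx.x] = y;
    __syncthreads();  // every thread reaches this barrier (no early return)

    int neighbour = (threadIdx.x + 1) % blockDim.x;
    if (i < n) out[i] = y + tile[neighbour];
}

// Host-side driver: returns true if the kernel ran and produced the values
// expected for an all-ones input.
bool run_local_var_demo()
{
    const int n = 2 * kBlock;
    const size_t bytes = n * sizeof(float);
    std::vector<float> h_in(n, 1.0f), h_out(n, 0.0f);

    float *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_in, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) { cudaFree(d_in); return false; }

    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);
    local_var_demo<<<n / kBlock, kBlock>>>(d_in, d_out, n);
    // This blocking copy also surfaces any launch or execution error.
    cudaError_t err = cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return false;

    // With x == 1, scratch[k] == k, so each thread contributes (i & 7)
    // and reads its neighbour's ((i + 1) & 7).
    for (int i = 0; i < n; ++i) {
        float expected = float((i & 7) + ((i + 1) & 7));
        if (h_out[i] != expected) return false;
    }
    return true;
}
```

One way to check where the variables actually ended up is to call `run_local_var_demo()` from a `main` and compile with `nvcc -Xptxas -v`, which reports the register count, shared memory size, and any local memory (stack frame or spill) bytes per kernel. The exact placement depends on the compiler version, optimization level, and target architecture.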