How does the compiler decide where to put local variables in a kernel or a device function?

After going through a bunch of material on CUDA memory types and their usage, I became even more confused.

Where are local variables stored? I’m wondering whether this is correct:

if registers are available
    store in a register
else if L1 cache is available
    store in L1 cache
else
    store in global memory
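
For reference, here is a minimal sketch of how a guess like this might be checked empirically, rather than a statement of what the compiler actually does. It assumes the CUDA runtime API and uses two hypothetical kernels of my own (scalarLocals and arrayLocals are made-up names): one with a few scalar locals, and one with a per-thread array indexed by a runtime value. It then asks cudaFuncGetAttributes how many registers and how many bytes of per-thread local memory each kernel was compiled with. The numbers depend on the GPU architecture, compiler version, and optimization flags, so I am not quoting any expected output.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: a handful of scalar locals.
__global__ void scalarLocals(float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float a = 2.0f * i;
        float b = a + 1.0f;
        out[i] = a * b;
    }
}

// Hypothetical kernel: a per-thread array read with an index that is
// only known at run time.
__global__ void arrayLocals(float *out, const int *idx, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float scratch[256];
        for (int k = 0; k < 256; ++k) {
            scratch[k] = k * 0.5f + i;
        }
        out[i] = scratch[idx[i] & 255];
    }
}

// Print the per-thread resource usage the compiler settled on for a kernel.
static void report(const char *name, const void *func) {
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, func);
    if (err != cudaSuccess) {
        printf("%s: %s\n", name, cudaGetErrorString(err));
        return;
    }
    printf("%s: registers/thread=%d, local bytes/thread=%zu, static shared bytes=%zu\n",
           name, attr.numRegs, attr.localSizeBytes, attr.sharedSizeBytes);
}

int main() {
    report("scalarLocals", (const void *)scalarLocals);
    report("arrayLocals", (const void *)arrayLocals);
    return 0;
}
```

Compiling with nvcc -Xptxas -v should print similar per-kernel information (registers used, plus any spill stores and loads) at build time, without running anything.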

And how should I decide where to keep local variables myself? Should shared memory be my first choice?