When I pass a value, such as "int lengthArray", into a kernel call, is that value stored in global memory or in one of the thread's registers? Example:
[codebox]__global__ void kernel(int lengthArray)
{
...
}
int main()
{
...
kernel<<<1,1>>>(16);
...
}[/codebox]
How is lengthArray stored? I ask because I am trying to keep the number of registers used by each thread low and avoid register spilling.
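One way I could measure this instead of guessing is to ask the runtime how many registers and how much local (spill) memory a compiled kernel uses. The sketch below assumes a hypothetical kernel named probeKernel with an extra output pointer, and a hypothetical helper queryKernelRegisters; neither comes from my real code. Compiling with `nvcc --ptxas-options=-v` should print similar per-kernel numbers at build time.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical kernel used only for illustration; it takes an int by value
// just like the kernel in the question.
__global__ void probeKernel(int lengthArray, int *out)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < lengthArray) {
        out[i] = i;
    }
}

// Hypothetical helper: returns the number of registers per thread that the
// compiled probeKernel uses, or -1 if the query fails. The per-thread local
// memory size in bytes is written to *localBytes; a non-zero value there
// usually indicates spilling or thread-local arrays.
int queryKernelRegisters(size_t *localBytes)
{
    cudaFuncAttributes attr;
    if (cudaFuncGetAttributes(&attr, probeKernel) != cudaSuccess) {
        return -1;
    }
    *localBytes = attr.localSizeBytes;
    return attr.numRegs;
}
```

Calling queryKernelRegisters from main and printing both values before and after adding a parameter would show whether the extra argument changes the register count at all.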
Also, I have another question that maybe someone can help me with. Inside a kernel, if I declare __shared__ int smem[2], does that allocate a shared memory array of 2 integers for EACH thread in a block, or one array for the entire block? I know shared memory has the scope of the entire block, but then why would every thread redundantly execute this declaration if it only needs to be executed by one thread in the block? The SDK examples make it seem that even when every thread in a block reaches __shared__ int smem[2], only 2 integers of shared memory are allocated for the whole block. Is this true or false?
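A small experiment I could run to check this: every thread in one block increments the same element of a __shared__ array, and thread 0 reports the final value. This is only a sketch under my own assumptions; the names countThreads and runCountThreads are hypothetical, and atomicAdd on shared memory needs a device that supports shared-memory atomics. If the array were private to each thread the result would be 1; if it is a single allocation for the block, the result should equal the number of threads in the block.

```cuda
#include <cuda_runtime.h>

// Hypothetical test kernel: all threads of a block add 1 to smem[0].
__global__ void countThreads(int *out)
{
    __shared__ int smem[2];          // the declaration being asked about

    if (threadIdx.x == 0) {
        smem[0] = 0;                 // one thread initialises the counter
    }
    __syncthreads();                 // make the initial value visible to all

    atomicAdd(&smem[0], 1);          // every thread increments the same slot
    __syncthreads();                 // wait until all increments are done

    if (threadIdx.x == 0) {
        out[blockIdx.x] = smem[0];   // report what this block counted
    }
}

// Hypothetical host helper: launches one block and returns the count,
// or -1 if a CUDA call fails.
int runCountThreads(int threadsPerBlock)
{
    int *dOut = nullptr;
    int hOut = -1;
    if (cudaMalloc(&dOut, sizeof(int)) != cudaSuccess) {
        return -1;
    }
    countThreads<<<1, threadsPerBlock>>>(dOut);
    cudaError_t err = cudaMemcpy(&hOut, dOut, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    return (err == cudaSuccess) ? hOut : -1;
}
```

Calling runCountThreads(64) from main and printing the return value would be enough to tell the two cases apart.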