Does passing shared memory data to a device function create a copy in registers or local memory?

I want to reduce the global memory traffic in my application by using shared memory: I first read the data from global memory into shared memory and then use it within the kernel. However, I need to pass some of the shared memory data to a device function for further processing.
See the sample code below:

__device__ int3 Dev_Func( float3 posi , float dx )
{
	int3 ind;
	ind.x = posi.x / dx;
	ind.y = posi.y / dx;
	ind.z = posi.z / dx;
	return ind;
}

__global__ void Kern( float3 *pos , float3 *vel )
{
	extern __shared__ float3 sh_pos[];
	unsigned int idx = blockIdx.x * blockDim.x + threadIdx.x;

	sh_pos[threadIdx.x] = pos[idx];

	// .... some code here	//

	// calling device function
	int3 prtc_ind = Dev_Func( sh_pos[threadIdx.x] , (float)0.01 );
	// .... some code here	//
}

I use sh_pos to stage the global data in shared memory and then pass sh_pos[threadIdx.x] to Dev_Func. Does the compiler create a new copy of sh_pos[threadIdx.x] in registers or local memory when executing Dev_Func? If so, what is the most flexible way to avoid that? This kernel has 4 different device function calls that use shared memory data in the same way as the code above.

Pass the pointer instead?

int3 prtc_ind = Dev_Func( sh_pos+threadIdx.x , (float)0.01 );

This changes the device function slightly as well: it must take a float3 pointer and dereference it (posi->x instead of posi.x).
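A minimal sketch of what the pointer version could look like, assuming the device function only reads the position. The output array cell, the bound n, the dx kernel parameter and the host helper run_kern are hypothetical additions for illustration and are not part of the original code. Error checking is left out to keep it short.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Pointer variant of Dev_Func: reads the position through the pointer,
// so the caller passes an address instead of a float3 by value.
__device__ int3 Dev_Func(const float3 *posi, float dx)
{
    int3 ind;
    ind.x = static_cast<int>(posi->x / dx);
    ind.y = static_cast<int>(posi->y / dx);
    ind.z = static_cast<int>(posi->z / dx);
    return ind;
}

// Hypothetical kernel: 'cell', 'n' and 'dx' are illustrative additions.
__global__ void Kern(const float3 *pos, int3 *cell, unsigned int n, float dx)
{
    extern __shared__ float3 sh_pos[];
    unsigned int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= n) return;  // no __syncthreads() below, so an early exit is safe

    sh_pos[threadIdx.x] = pos[idx];                  // stage global data in shared memory
    cell[idx] = Dev_Func(sh_pos + threadIdx.x, dx);  // pass the shared memory address
}

// Hypothetical host helper: copies positions to the device, launches Kern,
// and returns the computed indices. CUDA error checks are omitted.
std::vector<int3> run_kern(const std::vector<float3> &h_pos, float dx)
{
    const unsigned int n = static_cast<unsigned int>(h_pos.size());
    std::vector<int3> h_cell(n);
    if (n == 0) return h_cell;

    float3 *d_pos = nullptr;
    int3 *d_cell = nullptr;
    cudaMalloc(&d_pos, n * sizeof(float3));
    cudaMalloc(&d_cell, n * sizeof(int3));
    cudaMemcpy(d_pos, h_pos.data(), n * sizeof(float3), cudaMemcpyHostToDevice);

    const unsigned int threads = 128;
    const unsigned int blocks = (n + threads - 1) / threads;
    // Third launch parameter sizes the extern __shared__ array: one float3 per thread.
    Kern<<<blocks, threads, threads * sizeof(float3)>>>(d_pos, d_cell, n, dx);

    cudaMemcpy(h_cell.data(), d_cell, n * sizeof(int3), cudaMemcpyDeviceToHost);
    cudaFree(d_pos);
    cudaFree(d_cell);
    return h_cell;
}
```

Whether this actually saves anything depends on what the compiler does with the by-value version. Small device functions are usually inlined, in which case the float3 argument typically just ends up in registers, and local memory is only involved if registers spill. One way to check is to compile both variants with nvcc -Xptxas -v and compare the reported register count and spill loads/stores.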