Registers as Arguments to __device__ Functions


In my kernel I use about four registers and a shared memory array to shift data between global and local (shared and register) memory. However, the transfer can get complex (it depends on macro variables), and I use it in a lot of places in my kernel, so I want to write a device function for such transfers.

Can I use registers as variables in my device function? Is it possible to use references to registers? If I cannot use references to registers, can I still transfer data to those registers from the device function?

I would like something like this:

__device__ void Transfer_to_Registers(float &reg_1, float &reg_2, float &reg_3, float* Global_Mem_Array)
{
    // Transfer Global_Mem_Array to reg_1, reg_2 and reg_3, distributing the data according to thread indexes
}


Is this legal? If it is legal, does it have overhead that makes it inadvisable?

A quick update:

The Transfer_to_Registers function works just fine, without any noticeable time delay (I declared the function __device__ __forceinline__, but I don’t know if that makes a difference).

I hope I’m not missing anything.
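For anyone who finds this later, here is a minimal sketch of the pattern I ended up with. It is an illustration under assumptions, not my real kernel: the extra stride parameter, the strided layout (thread t reads elements t, t + stride and t + 2 * stride), Sum_Kernel and the host helper Run_Transfer_Demo are all made up for the example. As far as I understand, passing locals by reference to a __device__ function is legal, and once the call is inlined the compiler can usually keep those locals in registers; if the function is not inlined, taking a reference may push them to local memory instead, which is why I kept __forceinline__.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical layout (assumption): thread t loads elements
// t, t + stride and t + 2 * stride of the global array.
__device__ __forceinline__ void Transfer_to_Registers(float &reg_1, float &reg_2, float &reg_3,
                                                      const float *Global_Mem_Array, int stride)
{
    const int t = blockIdx.x * blockDim.x + threadIdx.x;
    reg_1 = Global_Mem_Array[t];
    reg_2 = Global_Mem_Array[t + stride];
    reg_3 = Global_Mem_Array[t + 2 * stride];
}

// Illustrative kernel: each thread fills three locals through the
// reference arguments and writes their sum back to global memory.
__global__ void Sum_Kernel(const float *in, float *out, int stride)
{
    const int t = blockIdx.x * blockDim.x + threadIdx.x;
    if (t >= stride) return;  // skip threads past the end of the data

    float reg_1, reg_2, reg_3;
    Transfer_to_Registers(reg_1, reg_2, reg_3, in, stride);
    out[t] = reg_1 + reg_2 + reg_3;
}

// Host helper (hypothetical): returns the number of mismatching
// results, or -1 if a CUDA call fails.
int Run_Transfer_Demo()
{
    const int n = 256;
    std::vector<float> h_in(3 * n), h_out(n, 0.0f);
    for (int i = 0; i < 3 * n; ++i) h_in[i] = static_cast<float>(i);

    float *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_in, 3 * n * sizeof(float)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_out, n * sizeof(float)) != cudaSuccess) { cudaFree(d_in); return -1; }

    cudaError_t err = cudaMemcpy(d_in, h_in.data(), 3 * n * sizeof(float), cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        Sum_Kernel<<<(n + 127) / 128, 128>>>(d_in, d_out, n);
        // This blocking copy also waits for the kernel to finish.
        err = cudaMemcpy(h_out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    }
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1;

    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        // in[i] + in[i + n] + in[i + 2n] with in[k] = k
        const float expected = static_cast<float>(3 * i + 3 * n);
        if (h_out[i] != expected) ++mismatches;
    }
    return mismatches;
}
```

Calling Run_Transfer_Demo() from main should return 0 on a working device. To check whether the references cost anything in a real kernel, compiling with nvcc -Xptxas -v and looking at the reported register and local memory usage is one way to see if the locals stayed in registers.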