How to manage Register Memory in Fortran?

I need to manage register memory in Fortran, but I don't know how to do it.

I have started programming with CUDA and I want to use this memory to hold local variables for each thread. I know it is possible in C, but I do not know how to do it in Fortran.

Below I show why I need the register variables. If there is another way to do it, I would appreciate hearing about it.

attributes(global) subroutine trotter(N,p,a)

  integer :: i,tx,idev
  integer, value :: N
  complex(8),device :: p(N),a(N,3)

  complex(8) :: aux1,aux2 ! --> I need these variables to be private to each thread

  tx=threadidx%x
  idev=(blockidx%x-1)*512*10

  do i=1,9,2 ! --> I'm trying to run this do loop in each thread

    aux1=p(idev+i)
    aux2=p(idev+i+1)

    p(idev+i)  =a(idev+i,1)*aux1+a(idev+i,2)*aux2
    p(idev+i+1)=a(idev+i,1)*aux2+a(idev+i,2)*aux1

  end do
  ! I cut the rest of the program, but that doesn't change anything.

end subroutine

I'm a newbie at programming, so the program may not be efficient at all, and I don't know whether it is working properly.

Thanks very much.

I don't fully understand what your code is trying to do, so I can't comment much on it. As far as register usage is concerned: the CUDA compiler will usually try to place each of your kernel's local variables in registers, so you do not have to request that explicitly. You should always check how many registers the kernel uses after compilation, because the block size has to be adjusted to that number: if the kernel uses too many registers per thread, you have to decrease the number of threads per block, otherwise the kernel will fail to launch. In that case, and in general, you should try to use shared memory too. It is close to register speed when there are no bank conflicts, it can reduce your kernel's register usage, and it sometimes makes it easier to coalesce global memory accesses than reading data from global memory directly into registers.
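To make the point about local variables concrete, here is a minimal sketch, not a drop-in fix for your program. The names `trotter_sketch`, `trotter_step` and `per_thread` are made up for illustration, and the index mapping (each thread owns 10 consecutive elements) is only my assumption about what you intend, since the rest of your kernel was cut. Note that `aux1` and `aux2` carry no special attribute: scalars declared inside a kernel are already private to each thread.

```fortran
module trotter_sketch
  use cudafor
  implicit none
  ! Assumption: each thread updates this many consecutive elements of p.
  integer, parameter :: per_thread = 10
contains
  attributes(global) subroutine trotter_step(n, p, a)
    integer, value :: n
    complex(8), device :: p(n), a(n,3)
    ! Kernel-local scalars: one private copy per thread, which the
    ! compiler will normally keep in registers.
    integer :: i, k, first
    complex(8) :: aux1, aux2

    ! Offset of the chunk owned by this thread (hypothetical mapping).
    first = ((blockidx%x - 1)*blockdim%x + (threadidx%x - 1))*per_thread

    do i = 1, per_thread - 1, 2
      k = first + i
      if (k + 1 <= n) then          ! guard against running past the array
        aux1 = p(k)                 ! save both values before overwriting
        aux2 = p(k+1)
        p(k)   = a(k,1)*aux1 + a(k,2)*aux2
        p(k+1) = a(k,1)*aux2 + a(k,2)*aux1
      end if
    end do
  end subroutine trotter_step
end module trotter_sketch

program main
  use cudafor
  use trotter_sketch
  implicit none
  integer, parameter :: n = 5120, threads = 128
  complex(8), allocatable :: p_h(:), a_h(:,:)
  complex(8), device, allocatable :: p_d(:), a_d(:,:)
  integer :: nthreads, blocks, istat

  allocate(p_h(n), a_h(n,3), p_d(n), a_d(n,3))
  p_h = (1.0d0, 0.0d0)
  a_h = (0.5d0, 0.0d0)
  p_d = p_h                         ! host-to-device copies by assignment
  a_d = a_h

  ! Enough threads to cover n elements, rounded up to whole blocks.
  nthreads = (n + per_thread - 1)/per_thread
  blocks   = (nthreads + threads - 1)/threads
  call trotter_step<<<blocks, threads>>>(n, p_d, a_d)

  istat = cudaDeviceSynchronize()
  if (istat /= cudaSuccess) print *, cudaGetErrorString(istat)

  p_h = p_d                         ! device-to-host copy
  print *, 'p(1:2) after one step:', p_h(1:2)
end program main
```

To see how many registers the kernel actually uses, the compiler can print the ptxas summary. As far as I know the option is `-gpu=ptxinfo` with nvfortran (for example `nvfortran -cuda -gpu=ptxinfo file.cuf`) and `-Mcuda=ptxinfo` with the older PGI compilers, but please check the documentation for your compiler version. In this sketch neighbouring threads read elements that are 10 apart, so the global memory accesses are not coalesced; staging the data through a `shared` array would be the next thing to try if that turns out to be slow.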

Thanks very much for your response. It was helpful; now I have to try it.