I’m a complete CUDA novice, and I am trying to run a small test code to figure out how CUDA works. I have this subroutine:
attributes(global) subroutine time_step(a,v,dt,dx,ntimes,simlength)
use cudafor
real*8, intent(inout) :: a(simlength)
real*8, intent(in) :: v, dt, dx
integer :: ntimes
integer, value :: simlength
integer :: i, n
real*8 :: aux(250)
real*8 :: decl
decl=v*dt/dx
aux=0.
i = blockDim%x*(blockIdx%x-1)+threadIdx%x
do n = 1,ntimes
if (i>1 .and. i<=simlength) aux(i)=a(i)-decl*(a(i)-a(i-1))
! this is really what’s going on:
! if (i>1 .and. i<=simlength) aux(i)=a(i-1)
a(i)=aux(i)
enddo
end subroutine time_step
Three problems:

1) Most importantly, this code should just move the contents of array “a” one step to the right on each iteration of the loop. It does this for the most part, but it drops a couple of numbers. For example, after 40 iterations, where the data should look like 17, 18, 19, 20 … it looks like 17, 19, 20 … (By the way, I have run this code as you see it here, and also with the simple “this is really what’s going on” line replacing the line above it; the results are the same.)

2) I want “aux” to be an automatic array. I tried making it shared, but then the code won’t run. It dies with “unspecified failure to launch”.

3) I really don’t want to define a separate “aux” array. I want “a” to be something like a(:,2). Then I would update a(:,2) in every iteration and copy it to a(:,1). Doing this in a naive way produces code that compiles but does not run.

Does anyone know what to do about these things? Much obliged.
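To make 1) and 3) concrete, here is a minimal serial sketch (plain Fortran on the CPU, no CUDA) of the result I expect, written with the a(:,2) layout I have in mind. The program name, simlength = 8, ntimes = 3 and decl = 1 are assumptions chosen for illustration, not the values from my actual run. The left boundary is set to zero because aux(1) stays zero in the kernel above.

```fortran
! Serial CPU reference for the expected shift; not the GPU kernel itself.
! Assumed, illustrative values: simlength = 8, ntimes = 3, decl = v*dt/dx = 1.
program shift_reference
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: simlength = 8, ntimes = 3
  real(dp) :: a(simlength, 2)   ! column 1: current state, column 2: scratch for the update
  real(dp) :: decl
  integer :: i, n

  decl = 1.0_dp                 ! stands in for v*dt/dx
  do i = 1, simlength
    a(i, 1) = real(i, dp)       ! initial data 1, 2, ..., simlength
  end do

  do n = 1, ntimes
    a(1, 2) = 0.0_dp            ! mirrors aux(1), which is never updated in the kernel
    do i = 2, simlength
      ! every update reads only column 1, i.e. values from the previous step
      a(i, 2) = a(i, 1) - decl*(a(i, 1) - a(i-1, 1))
    end do
    a(:, 1) = a(:, 2)           ! copy back only after all elements are updated
  end do

  print '(8f6.1)', a(:, 1)
end program shift_reference
```

With these values the data moves three places to the right and the program prints 0.0 0.0 0.0 1.0 2.0 3.0 4.0 5.0, with no value skipped. This is the behaviour I am trying to get from the GPU version.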