Hi everyone,

In the following code, the value of “rhs_h(1)” is not changed by the kernel.

module mod_cuda_test

use cudafor

implicit none

integer,parameter::kind_int=4

integer,parameter::kind_real=4

contains

attributes(global) subroutine calc_I3(rhs_d)

implicit none

real(kind_real),intent(inout)::rhs_d(:)

rhs_d(1)=3.0

end subroutine calc_I3

end module mod_cuda_test

program test_kernel

use cudafor

use mod_cuda_test

implicit none

real(kind_real),allocatable::rhs_h(:)

real(kind_real),allocatable,device::rhs_d(:)

allocate(rhs_d(-1:10))

allocate(rhs_h(-1:10))

rhs_h=2.0

rhs_d=rhs_h

call calc_I3<<<1>>>(rhs_d)

rhs_h=rhs_d

write(*,*),"rhs_d(1) is",rhs_h(1)

end program

When I run the code on a GeForce 9400 (both the driver and runtime versions are 3.2) with PGI 11.8, rhs_h(1) is still 2.0.

But if I change

allocate(rhs_d(-1:10))

allocate(rhs_h(-1:10))

to

allocate(rhs_d(1:10))

allocate(rhs_h(1:10))

then rhs_h(1) is changed to 3.0.

Does anybody know whether CUDA Fortran does not support arrays with negative indices?

Hi Steve,

Does anybody know whether CUDA Fortran does not support arrays with negative indices?

Negative indices are supported; however, this looks like a compiler error when passing an array with a negative lower bound as an argument. I sent a report to our engineers (TPR#18545) for further investigation.

The workaround is to declare “rhs_d” as a module variable.

```
% cat neg2.cuf
module mod_cuda_test
use cudafor
implicit none
integer,parameter::kind_int=4
integer,parameter::kind_real=4
real(kind_real),allocatable,device::rhs_d(:)
contains
attributes(global) subroutine calc_I3()
implicit none
rhs_d(1)=3.0
end subroutine calc_I3
end module mod_cuda_test
program test_kernel
use cudafor
use mod_cuda_test
implicit none
real(kind_real),allocatable::rhs_h(:)
allocate(rhs_d(-1:10))
allocate(rhs_h(-1:10))
rhs_h=2.0
rhs_d=rhs_h
call calc_I3<<<1>>>()
rhs_h=rhs_d
write(*,*),"rhs_d(1) is",rhs_h(1)
end program
% pgf90 neg2.cuf -V12.3 ; a.out
rhs_d(1) is 3.000000
```

Thanks,

Mat

Thanks, Mat.

But I still cannot get the right result when rhs_d is changed to a module device variable. Fortunately, if the index range of rhs_d is declared explicitly in the kernel's dummy-argument declaration, the assignment takes effect. Below is my code:

module mod_cuda_test

use cudafor

implicit none

integer,parameter::kind_int=4

integer,parameter::kind_real=4

contains

attributes(global) subroutine calc_I3(rhs_d,n1,n2)

implicit none

integer(kind_int),value::n1,n2

real(kind_real),intent(inout)::rhs_d(n1:n2)   ! bounds declared explicitly

rhs_d(1)=3.0

end subroutine calc_I3

end module mod_cuda_test

program test_kernel

use cudafor

use mod_cuda_test

implicit none

real(kind_real),allocatable::rhs_h(:)

real(kind_real),allocatable,device::rhs_d(:)

integer(kind_int)::n1,n2

n1=-1

n2=10

allocate(rhs_d(n1:n2))

allocate(rhs_h(n1:n2))

rhs_h=2.0

rhs_d=rhs_h

call calc_I3<<<1>>>(rhs_d,n1,n2)

rhs_h=rhs_d

write(*,*),"after call calc_I3"

write(*,*),"rhs_d(1) is",rhs_h(1)

end program

Hi Steve,

FYI, TPR#18545 has been closed. The engineer investigating the issue determined that the original program was in error. He notes:

Assumed-shape dummy arrays do not assume the lower bound of the actual argument. The assumed-shape array gets the extent (the shape) from the actual argument, but the lower bound is defined by the declaration. The correct program would declare the argument as “real(kind_real),intent(inout)::rhs_d(-1:)”

I quote from “Fortran 95/2003 Explained”, page 100, section 6.3, “Assumed-shape arrays”: “When the shape is declared by the dimension clause, each dimension has the form: [lower-bound]: where lower-bound is an integer expression that may depend on module data or the other arguments (…). If lower-bound is omitted, the default value is 1. Note that it is the shape that is passed, and not the upper and lower bounds.”
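The rule quoted above can be checked on the host with plain Fortran, no GPU required. The following is only a minimal sketch; the names bounds_demo, set_default, set_declared and x are made up for illustration and do not appear in the original programs. It suggests why rhs_h(1) stayed at 2.0 in the first post: with a dummy declared as (:), index 1 inside the routine refers to the first element of the actual argument, which is element -1 in the caller.

```fortran
module bounds_demo
  implicit none
contains
  ! Assumed-shape dummy without a declared lower bound: bounds are 1:size(a).
  subroutine set_default(a)
    real, intent(inout) :: a(:)
    print *, "lower bound inside set_default: ", lbound(a, 1)
    a(1) = 3.0   ! first element of the actual argument, i.e. x(-1) in the caller
  end subroutine set_default

  ! Lower bound declared to match the caller's allocation.
  subroutine set_declared(a)
    real, intent(inout) :: a(-1:)
    print *, "lower bound inside set_declared:", lbound(a, 1)
    a(1) = 3.0   ! the element the caller also calls x(1)
  end subroutine set_declared
end module bounds_demo

program show_bounds
  use bounds_demo
  implicit none
  real, allocatable :: x(:)

  allocate(x(-1:10))

  x = 2.0
  call set_default(x)
  print *, "x(-1) =", x(-1), " x(1) =", x(1)   ! x(-1) is modified, x(1) is not

  x = 2.0
  call set_declared(x)
  print *, "x(-1) =", x(-1), " x(1) =", x(1)   ! x(1) is modified
end program show_bounds
```

Any standard Fortran compiler should accept this; the first call reports a lower bound of 1 and modifies x(-1), while the second reports -1 and modifies x(1).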

So your solution of passing both the lower and upper bounds is correct.
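For completeness, here is a sketch of the original program with only the change the engineer suggests: the dummy argument is declared as rhs_d(-1:) instead of rhs_d(:). It has not been run against a specific compiler release here. The launch configuration is written as <<<1,1>>> (one block of one thread), which is a choice made for this sketch rather than something taken from the posts above.

```fortran
module mod_cuda_test
use cudafor
implicit none
integer,parameter::kind_real=4
contains
attributes(global) subroutine calc_I3(rhs_d)
implicit none
! Lower bound declared to match the host-side allocation (-1:10);
! the extent is still taken from the actual argument.
real(kind_real),intent(inout)::rhs_d(-1:)
rhs_d(1)=3.0
end subroutine calc_I3
end module mod_cuda_test

program test_kernel
use cudafor
use mod_cuda_test
implicit none
real(kind_real),allocatable::rhs_h(:)
real(kind_real),allocatable,device::rhs_d(:)
allocate(rhs_d(-1:10))
allocate(rhs_h(-1:10))
rhs_h=2.0
rhs_d=rhs_h                    ! host-to-device copy
call calc_I3<<<1,1>>>(rhs_d)   ! one block, one thread
rhs_h=rhs_d                    ! device-to-host copy
write(*,*) "rhs_h(1) is",rhs_h(1)
end program test_kernel
```

Compared with passing n1 and n2, this keeps the kernel interface unchanged, but the lower bound is hard-coded in the kernel, so it must be kept in sync with the allocation by hand.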