error in accelerator region

this is my code

program exper
implicit none
real ::a(1,3),area(1,20),lg,wi,br,s

integer i
a(1,1)=5.2
a(1,2)=8.1
a(1,3)=5.4
!$acc region
do i=1,20
lg=a(1)
wi=a(2)
br=a(3) 
s=(lg+wi+br)/3;
area(i)=(s*lg*wi*br);
a=a+0.4
end do
!$acc end region
print*,area
end program exper

When I run it without “acc region” it runs correctly, but when I use “acc region”
every element of the array “area” is printed as 0.0000.

One issue is you have ‘a’ and ‘area’ as 2D arrays and yet:

lg=a(1) 
wi=a(2) 
br=a(3)
...
area(i)

I’m not sure what the GPU compiler will do then, as I’m not sure that behavior is defined. If nothing else, the compiler generates warnings. If you change to:

real ::a(3),area(20),lg,wi,br,s

then the code works on the CPU or with -ta=nvidia.

Frankly, I’m a bit surprised the CPU code works. Maybe the fact that a 1xN array only occupies N places in memory saves you?
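
For reference, a minimal sketch of the whole program with the rank-1 declarations suggested above. Only the declaration line comes from the reply; the layout and comments are illustrative. Note that each iteration updates a, so the iterations are not independent of one another.

```fortran
program exper
  implicit none
  ! Rank-1 arrays, so a(1), a(2), a(3) and area(i) are valid references
  real :: a(3), area(20), lg, wi, br, s
  integer :: i

  a(1) = 5.2
  a(2) = 8.1
  a(3) = 5.4

!$acc region
  do i = 1, 20
    lg = a(1)
    wi = a(2)
    br = a(3)
    s = (lg + wi + br) / 3
    area(i) = s * lg * wi * br
    a = a + 0.4   ! each iteration depends on the previous one through a
  end do
!$acc end region

  print *, area
end program exper
```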

ok…thanks for suggestion

One more question: can I call a subroutine from within a CUDA kernel subroutine?

No, you cannot call subroutines from within compute kernels. Subroutines need to be inlined, either manually or automatically by the compiler via the flags “-Minline” or “-Mipa=inline”.

  • Mat
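
As an illustration of the inlining advice above, here is a hedged sketch of an accelerator region that calls a small function the compiler is expected to inline. The module name geom, the function box_measure, and the compile line in the comment are assumptions for illustration, not taken from the thread; whether the call is actually inlined depends on the compiler version and flags.

```fortran
! Possible compile line (driver name is an assumption; flags are from the reply):
!   pgfortran -ta=nvidia -Minline exper_inline.f90
module geom
  implicit none
contains
  ! Hypothetical helper: the same arithmetic as the loop body in the first program
  real function box_measure(lg, wi, br)
    real, intent(in) :: lg, wi, br
    real :: s
    s = (lg + wi + br) / 3
    box_measure = s * lg * wi * br
  end function box_measure
end module geom

program exper_inline
  use geom
  implicit none
  real :: area(20)
  integer :: i

!$acc region
  do i = 1, 20
    ! Inputs are computed from i, so iterations do not depend on each other
    area(i) = box_measure(5.2 + 0.4*(i-1), 8.1 + 0.4*(i-1), 5.4 + 0.4*(i-1))
  end do
!$acc end region

  print *, area
end program exper_inline
```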

ok…this is my code

attributes (global)subroutine multi(b,s)
integer b(20),j,i,s(20),a(10)
 p = threadidx%x-1
 q = blockidx%x-1
 
 s(p+q*4+1)=b(p+q*4+1)*2

 call syncthreads() 
 end subroutine multi
program exper
use cudafor
implicit none
integer b(20),s(20),x(2,2),y(2,2),z(2,2),i,j
integer, device,allocatable,dimension(:) :: bdev,sdev
type(dim3) :: dimBlock,dimGrid
 z=matmul(x,y)
open(unit=10,file="values.dat")
read(10,*)b(1:20)
close(10)

do i=1,2
do j=1,2
x(i,j)=1
y(i,j)=1
end do
end do

allocate (bdev(20),sdev(20))
bdev=b
 
dimGrid = dim3(5,1,1)
dimBlock = dim3(4,1,1)
call multi <<<dimGrid>>> (bdev,sdev)

s=sdev
deallocate (bdev(20),sdev(20))

end program exper

I also want to use z=matmul(x,y) in subroutine “multi”.
I added “-Minline” in the project properties, but it is still not working.

Sorry, I assumed you were asking about calling subroutines within a PGI Accelerator Model compute region since this is what your first code used.

The main problem with your CUDA Fortran program is that it’s missing an interface for your global kernel routine. Please add an explicit interface block, or put “multi” in a module, where the interface is made available automatically to any program unit that uses the module. There are other programming issues, but I’ll let you work those out.
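
A minimal sketch of the module approach, under some assumptions: the module name kernels, the extra argument n, and the placeholder input data (standing in for values.dat) are illustrative and not from the original post. The sketch also passes both the grid and the block size in the launch configuration and computes the index from blockdim%x, which is one way, not the only way, to address the remaining issues.

```fortran
module kernels
  use cudafor
  implicit none
contains
  ! Kernel lives in a module, so its interface is visible to the caller
  attributes(global) subroutine multi(b, s, n)
    integer, value :: n
    integer :: b(n), s(n)
    integer :: idx
    ! Global 1-based index of this thread
    idx = (blockidx%x - 1) * blockdim%x + threadidx%x
    if (idx <= n) s(idx) = b(idx) * 2
  end subroutine multi
end module kernels

program exper
  use cudafor
  use kernels
  implicit none
  integer, parameter :: n = 20
  integer :: b(n), s(n), i
  integer, device, allocatable :: bdev(:), sdev(:)
  type(dim3) :: dimGrid, dimBlock

  ! Placeholder input (assumption) instead of reading values.dat
  do i = 1, n
    b(i) = i
  end do

  allocate(bdev(n), sdev(n))
  bdev = b                        ! host-to-device copy

  dimGrid  = dim3(5, 1, 1)        ! 5 blocks
  dimBlock = dim3(4, 1, 1)        ! 4 threads per block, 20 threads total
  call multi<<<dimGrid, dimBlock>>>(bdev, sdev, n)

  s = sdev                        ! device-to-host copy
  deallocate(bdev, sdev)

  print *, s
end program exper
```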

i want to use z=matmul(x,y) in subroutine “multi”…

Unfortunately, not all intrinsics, including matmul, have been ported for use within device code. You would need to create a “device” routine in the same module as your “global” routine that performs the matrix multiply.

  • Mat
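
The suggestion above could look roughly like the following sketch: a “device” routine that multiplies two 2x2 integer matrices, placed in the same module as the “global” routine that calls it. The names small_matmul, matmul2x2 and multi_mm are hypothetical, and the single-thread launch is only meant to keep the example short.

```fortran
module small_matmul
  use cudafor
  implicit none
contains
  ! Hand-written replacement for matmul, callable from device code only
  attributes(device) subroutine matmul2x2(x, y, z)
    integer :: x(2,2), y(2,2), z(2,2)
    integer :: i, j, k
    do j = 1, 2
      do i = 1, 2
        z(i,j) = 0
        do k = 1, 2
          z(i,j) = z(i,j) + x(i,k) * y(k,j)
        end do
      end do
    end do
  end subroutine matmul2x2

  ! Kernel in the same module; launched with one thread in this sketch
  attributes(global) subroutine multi_mm(x, y, z)
    integer :: x(2,2), y(2,2), z(2,2)
    call matmul2x2(x, y, z)
  end subroutine multi_mm
end module small_matmul

program exper
  use cudafor
  use small_matmul
  implicit none
  integer :: x(2,2), y(2,2), z(2,2)
  integer, device :: xdev(2,2), ydev(2,2), zdev(2,2)

  x = 1
  y = 1
  xdev = x                         ! host-to-device copies
  ydev = y

  call multi_mm<<<1, 1>>>(xdev, ydev, zdev)

  z = zdev                         ! device-to-host copy
  print *, z
end program exper
```

With x and y filled with ones, every element of z should come back as 2 if the kernel ran.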