Hi,
I am trying to use OpenACC to port a Fortran 90 code to the GPU. The code uses very complex data structures. A typical do-loop looks like this:
!$acc data copy(tmp(:,:,:)) copyin(var(1:ni,1:nj,1:nk), el%metric%detJ(1:ni,1:nj,1:nk))
!$ACC PARALLEL LOOP
do k=1,nk
do j=1,nj
do i=1,ni
tmp(i,j,k) = var(i,j,k)*el%metric%detJ(i,j,k)
enddo
enddo
enddo
!$acc end data
where the variables “tmp”, “var”, and el%metric%detJ are declared like this:
real(kind=dp), allocatable, dimension(:,:,:) :: tmp, var
real(kind=dp), allocatable, dimension(:,:,:) :: detJ
When I compile the code with PGI v18.10.0, I get messages like these:
1546, Generating copyin(el%metric%detj(1:ni,1:nj,1:nk),var(:ni,:nj,:nk))
Generating copy(tmp(1:ni,1:nj,1:nk))
1547, Accelerator kernel generated
Generating Tesla code
1548, !$acc loop gang ! blockidx%x
1549, !$acc loop seq
1550, !$acc loop vector(128) ! threadidx%x
1547, Generating implicit copy(el)
1549, Loop is parallelizable
1550, Loop is parallelizable
and it crashes when I run it on a P100 node, with the following error:
1546: data region reached 1 time
1546: data copyin transfers: 5
device time(us): total=57 max=19 min=4 avg=11
1547: data region reached 1 time
1547: data copyin transfers: 2
device time(us): total=11 max=7 min=4 avg=5
1547: compute region reached 1 time
1547: kernel launched 1 time
grid: [1] block: [128]
device time(us): total=0 max=0 min=0 avg=0
call to cuMemFreeHost returned error 700: Illegal address during kernel execution
Is the error due to the data type of “el”? Why does the compiler report “Generating implicit copy(el)”? For reference, “el” is declared as:
…
type(compElement),intent(inout),target :: el
…
Thanks for your help!