In our application, the bounds of the OpenACC loops are not fixed. To avoid copying the loop bounds from the CPU to the GPU, I tried computing them on the GPU inside an ‘acc serial’ region. That is the background for the demo code below.
On the ‘!$acc parallel loop’ line, I found that the code produces correct results without ‘collapse’. With ‘collapse(3)’, all elements of the array ‘a’ remain zero. Is this a bug or a ‘feature’ of OpenACC? Could you explain what is going on?
program main
  call sub1
contains
  subroutine sub1()
    real, dimension(3, 3, 3):: a
    !$acc declare create(a)
    integer:: imax = 0, jmax = 0, kmax = 0
    !$acc declare create(imax, jmax, kmax)
    integer:: i, j, k
    !----------------------------------
    a = 0
    !$acc update device(a)
    !$acc serial present(imax)
    imax = 2
    jmax = 2
    kmax = 2
    !$acc end serial
    ! The following loop produces expected results without 'collapse'
    !$acc parallel loop collapse(3) present(a, imax, jmax, kmax)
    do i = 1, imax
      do j = 1, jmax
        do k = 1, kmax
          a(i,j,k) = -2
        end do
      end do
    end do
    !$acc update host(a)
    write(*,*)'sub1 a = ',a
  end subroutine sub1
end program main