What is the type of the loop index after a collapse

JudicaA_l_Grasset · May 20, 2019, 1:06pm

Hello,

I recently ran into a problem when collapsing loops with OpenACC. The code looks like this:

integer, intent(in) :: imax,jmax,kmax,lmax
integer :: i,j,k,l
!$acc parallel loop collapse(4)
do i=1,imax
  do j=1,jmax
    do k=1,kmax
      do l=1,lmax
        ! ... loop body ...
      end do
    end do
  end do
end do

The code works fine with small values of imax/jmax/kmax/lmax, but once they grow large enough that imax*jmax*kmax*lmax > 2**31, the code gives wrong results (but doesn’t crash).
If I change the type of i,j,k,l to 64-bit integers, then the code works.
But I think that’s surprising. The max values are still 32-bit integers, so I would have thought that multiplying them together would overflow in the same way as the loop index did. Instead, it seems that the new max has the same type as the new index.
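
To make the arithmetic concrete, here is a minimal sketch that computes the total trip count of the four loops in 64-bit arithmetic and compares it with the largest default 32-bit integer. The bounds of 256 are hypothetical values chosen only so that the product (2**32) is easy to check by hand; they are not the values from the real code.

```fortran
program collapse_trip_count
  use, intrinsic :: iso_fortran_env, only: int32, int64
  implicit none
  ! Hypothetical bounds: 256**4 = 2**32 = 4294967296
  integer(int32) :: imax, jmax, kmax, lmax
  integer(int64) :: total

  imax = 256; jmax = 256; kmax = 256; lmax = 256

  ! Convert each bound to 64-bit before multiplying, so the product
  ! itself is computed without overflowing
  total = int(imax, int64) * int(jmax, int64) &
        * int(kmax, int64) * int(lmax, int64)

  print *, 'total iterations     :', total
  print *, 'largest 32-bit value :', huge(imax)

  if (total > int(huge(imax), int64)) then
    print *, 'the trip count does not fit in a 32-bit index'
  else
    print *, 'the trip count fits in a 32-bit index'
  end if
end program collapse_trip_count
```

A check like this could be run on the host before the parallel region to detect problem sizes that need 64-bit indices.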

So I would like to know if there is documentation somewhere on how the compiler decides the type of the new index and the new max of the collapsed loop.

MatColgrove · May 20, 2019, 4:14pm

Hi Judicaël Grasset,

It’s an interesting issue. Since the compiler can’t know at compile time whether the product of the loop bounds will overflow a 32-bit integer, it would have to assume that it might, and thus use 64-bit indexing whenever a “collapse” clause is used. However, this would have a negative impact on performance, given that indices would now use 2 registers instead of 1. Using more registers reduces occupancy and thus the resources available for compute.
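
As a rough illustration of why the collapsed index must be able to hold the product of the bounds, the sketch below flattens four loops into one by hand and recovers the original indices from the single index. This is only a conceptual model, not the code the compiler actually generates, and the small bounds ni, nj, nk, nl are hypothetical.

```fortran
program flatten_demo
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  ! Small hypothetical bounds so the check runs instantly
  integer(int64), parameter :: ni = 2, nj = 3, nk = 4, nl = 5
  integer(int64) :: n, total, rest, i, j, k, l, linear
  logical :: ok

  total = ni * nj * nk * nl   ! trip count of the flattened loop
  ok = .true.

  do n = 0, total - 1
    ! Recover the four original indices from the single flattened index
    l = mod(n, nl) + 1
    rest = n / nl
    k = mod(rest, nk) + 1
    rest = rest / nk
    j = mod(rest, nj) + 1
    i = rest / nj + 1

    ! Recompute the flattened position; it must match n
    linear = (((i - 1) * nj + (j - 1)) * nk + (k - 1)) * nl + (l - 1)
    if (linear /= n) ok = .false.
  end do

  print *, 'trip count:', total, ' indices consistent:', ok
end program flatten_demo
```

The single index n runs up to total - 1, so whatever type holds it must be wide enough for the full product of the bounds.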

I’ve added a request for enhancement (TPR#27179) to see what can be done. Hopefully our engineers can come up with a solution that helps this case without degrading performance.

So I would like to know if there is documentation somewhere on how the compiler decides the type of the new index and the new max of the collapsed loop.

No, this is more of an implementation detail that’s not defined by the standard.

-Mat

JudicaA_l_Grasset · May 21, 2019, 9:04am

Hi mkcolg,

So if I understood correctly, that means there is no standard way of doing it at the moment, and while my code happens to work when I set i,j,k,l to 64-bit this time, it’s possible that it will not work with a future release of the compiler?

MatColgrove · May 21, 2019, 2:49pm

my code happens to work when I set i,j,k,l to 64-bit this time,

It works because the compiler will use the same data type for the collapsed index as the original index variable. So if you use a 64-bit index, then the compiler will use a 64-bit index for the collapsed loop.
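
A minimal sketch of that workaround, assuming the behavior described above: only the loop index declarations change to a 64-bit kind, while the bounds stay default integers. The subroutine name, the array a, and the loop body are hypothetical and only there to make the fragment complete.

```fortran
! Sketch only: the array "a" and the loop body are hypothetical.
subroutine scale_field(a, imax, jmax, kmax, lmax)
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  integer, intent(in) :: imax, jmax, kmax, lmax
  real, intent(inout) :: a(lmax, kmax, jmax, imax)
  ! 64-bit loop indices; per the explanation above, the collapsed
  ! index then takes the same 64-bit type
  integer(int64) :: i, j, k, l

  !$acc parallel loop collapse(4)
  do i = 1, imax
    do j = 1, jmax
      do k = 1, kmax
        do l = 1, lmax
          a(l, k, j, i) = 2.0 * a(l, k, j, i)
        end do
      end do
    end do
  end do
end subroutine scale_field
```

Using the int64 kind from iso_fortran_env rather than a bare integer(8) keeps the declaration portable across compilers.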

it’s possible that it will not work with a future release of the compiler

The RFE is to have the compiler implicitly promote a 32-bit index (as defined by the program) to a 64-bit index for the collapsed loop. So if you use a 64-bit index now, the program behavior would not change in the future.

Note that the “-i8” flag implicitly promotes the default kind of “integer” from “integer(4)” to “integer(8)”, so you may consider using this flag instead of altering the program. “-i8” does not promote integers whose kind is given explicitly, so if you do want kind=4 somewhere, variables declared “integer(4)” will stay 32-bit.
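
One way to see what the flag does is a small test program that prints the width of a default integer next to one with an explicit kind. This is just a sketch with hypothetical variable names; going by the description above, the first line should report 64 bits when compiled with “-i8” and 32 bits without it, while the second should report 32 bits either way.

```fortran
program default_kind
  implicit none
  integer    :: d   ! default kind: the one "-i8" is described as promoting
  integer(4) :: e   ! explicit kind: described as left unchanged by "-i8"

  ! bit_size returns the number of bits in the integer's representation
  print *, 'default integer bits :', bit_size(d)
  print *, 'integer(4) bits      :', bit_size(e)
end program default_kind
```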

Hope this helps,
Mat

JudicaA_l_Grasset · May 22, 2019, 8:14am

Ok, thanks for the answer and thanks for the RFE.