nvfortran OpenACC compilation error related to Fortran derived type

Hi,

I used nvfortran-20.9 to compile the following code (nvfortran -acc *.f90 ) and it produces the error:
Module variables used in acc routine need to be in !$acc declare create() - ._dtInit0058

It seems the error is related to the initialization of FaceFluxTest%DiffBb. If I remove the initialization or remove a few variables from the derived type, then there will be no error message.

Did I misuse OpenACC features? Could you explain what is wrong? Thanks!

module ModAdvance                                                                                                           
  implicit none                                                                                                             
                                                                                                                            
  public                                                                                                                    
                                                                                                                            
  integer, parameter:: MinI = -1, MaxI = 4, MinJ = 1, MaxJ = 1, MinK = 1, MaxK = 1, MaxBlock = 2, nVar=2                    
  type, public :: FaceFluxTest                                                                                              
     integer :: ii                                                                                                          
     integer :: iLeft,  jLeft, kLeft                                                                                        
     integer :: iRight, jRight, kRight                                                                                      
     integer :: iBlockFace                                                                                                  
     real :: r1, r2, r3, r4, r5, r6, r7, r8       ! OK if remove this line                                                                          
     real :: CmaxDt, c1, c2, c3, c4, c5, c6, c7                                                                             
     real :: Area2, AreaX, AreaY, AreaZ, Area                                                                               
     real :: DeltaBnL, DeltaBnR                                                                                             
     real :: DiffBb = 0.0         ! OK if no initialization.                                                                                              
  end type FaceFluxTest                                                                                                     
                                                                                                                            
end module ModAdvance                                                                                                       
!===================================================================================                                        
                                                                                                                            
module ModFace                                                                                                              
  use ModAdvance                                                                                                            
  implicit none                                                                                                             
  !-----------------------------------------------------------------                                                        
  public                                                                                                                    
  contains                                                                                                                  
                                                                                                                            
    subroutine  calc_face_value(iBlock)                                                                                     
      !$acc routine vector                                                                                                  
      integer, intent(in):: iBlock                                                                                          
    !----------------------------------------------------------------                                                       
      type(FaceFluxTest) :: FFT                                                                                             
      integer:: i,j,k,iVar                                                                                                  
      !$acc loop vector collapse(3) private(FFT)                                                                            
      do k = MinK, MaxK; do j = MinJ, MaxJ; do i = MinI, MaxI                                                               
         FFT%ii = k                                                                                                         
      end do; end do; end do                                                                                                
  end subroutine calc_face_value                                                                                            
                                                                                                                            
end module ModFace                                                                                                          
!===================================================================================                                        
                                                                                                                            
program main                                                                                                                
  use ModAdvance                                                                                                            
  use ModFace, ONLY: calc_face_value                                                                                        
                                                                                                                            
  implicit none                                                                                                             
  integer:: iBlock                                                                                                          
  !----------------------------------------------------------------                                                         
                                                                                                                            
  !$acc parallel loop gang                                                                                                  
  do iBlock = 1, MaxBlock                                                                                                   
     call calc_face_value(iBlock)                                                                                           
  end do                                                                                                                    
end program main  

The problem here is that the derived type is statically initialized. The initialization value is stored in a static data segment on the host, which is not accessible on the device. You’ll need to remove the static initialization of DiffBb in order to get this to work.
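
A minimal sketch of that change, assuming the default value is still wanted: drop the "= 0.0" initializer from the type and assign the component explicitly at run time inside the loop. The trimmed-down type, the module name ModInitSketch and the array Diff_GB are illustrative and not part of the original code. The loop structure follows the pattern of a parallel loop with an inner vector loop and private(FFT), not a vector routine.

```fortran
module ModInitSketch
  implicit none

  ! Trimmed-down type: the "= 0.0" default initializer has been removed.
  type :: FaceFluxTest
     integer :: iBlockFace
     real    :: DiffBb
  end type FaceFluxTest

  integer, parameter :: MinI = -1, MaxI = 4, MaxBlock = 2
  ! Hypothetical output array, used only to make the result visible.
  real :: Diff_GB(MinI:MaxI, MaxBlock)
  !$acc declare create(Diff_GB)

contains

  subroutine sub_main
    type(FaceFluxTest) :: FFT
    integer :: i, iBlock

    !$acc parallel loop gang present(Diff_GB)
    do iBlock = 1, MaxBlock
       !$acc loop vector private(FFT)
       do i = MinI, MaxI
          ! Explicit run-time assignment replaces the static initializer.
          FFT%DiffBb = 0.0
          FFT%iBlockFace = iBlock
          Diff_GB(i, iBlock) = FFT%DiffBb + FFT%iBlockFace
       end do
    end do

    !$acc update host(Diff_GB)
    write(*,*) 'Diff_GB = ', Diff_GB
  end subroutine sub_main

end module ModInitSketch

program main
  use ModInitSketch
  implicit none

  call sub_main
end program main
```

Every entry of Diff_GB should end up equal to its block index, since DiffBb is set to zero before it is used. The cost of this approach is that each private copy has to be initialized by hand wherever the type is used on the device.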

-Mat

Hi Mat,

Thanks very much for your explanation!

Actually, this was also my first guess, until I found that it also compiles if I remove a few scalars (for example, the line ‘real :: r1, r2, r3, r4, r5, r6, r7, r8’) from the derived type. I think it would be useful if the compilation error message were more consistent and informative.

It still fails in the same way for me even if I comment out one or all of the “r” variables, so I’m unclear why removing them would allow it to compile for you; it doesn’t make much sense.

We can try to determine why, but since the main problem is the static data initialization, I’m not sure it’s worth the effort.

Hi Mat,

Following your suggestion to remove the static data initialization, the code compiles now. But I still cannot run it.

To show the problem more clearly, I simplified the code further. The following version compiles and produces correct results:

module ModTest                                                                                                   
  implicit none                                                                                                  
                                                                                                                 
  type:: FaceFluxTest                                                                                            
     integer :: iBlockFace                                                                                       
  end type FaceFluxTest                                                                                          
                                                                                                                 
  integer, parameter:: MinI = -1, MaxI = 4, MinJ = 1, MaxJ = 2, MinK = 1, MaxK = 2, MaxBlock = 2, nVar=2         
  real:: State_VGB(nVar,MinI:MaxI,MinJ:MaxJ,MinK:MaxK,MaxBlock)                                                  
  !$acc declare create(State_VGB)                                                                                
                                                                                                                 
contains                                                                                                         
                                                                                                                 
  subroutine sub_main                                                                                            
  integer:: iBlock                                                                                               
                                                                                                                 
  type(FaceFluxTest) :: FFT                                                                                      
  integer:: i,j,k,iVar                                                                                           
                                                                                                                 
  !----------------------------------------------------------------                                              
                                                                                                                 
  !$acc parallel loop gang present(state_vgb)                                                                    
  do iBlock = 1, MaxBlock                                                                                        
     !$acc loop vector collapse(3) private(FFT)                                                                  
     do k = MinK, MaxK; do j = MinJ, MaxJ; do i = MinI, MaxI                                                     
        FFT%iBlockFace = iBlock                                                                                  
                                                                                                                 
        State_VGB(1,i,j,k,iBlock) = FFT%iBlockFace                                                               
        ! More calculation here                                                                                  
      end do; end do; end do                                                                                     
  end do                                                                                                         
                                                                                                                 
  !$acc update host(state_vgb)                                                                                   
                                                                                                                 
  write(*,*)'state_vgb = ', State_VGB                                                                            
                                                                                                                 
  end subroutine sub_main                                                                                        
                                                                                                                 
end module ModTest                                                                                               
!=========================================================                                                       
                                                                                                                 
program main                                                                                                     
  use ModTest                                                                                                    
                                                                                                                 
  call sub_main                                                                                                  
                                                                                                                 
end program main   

However, if I move the inner loop into a separate subroutine:

module ModTest                                                                                                   
  implicit none                                                                                                  
                                                                                                                 
  type:: FaceFluxTest                                                                                            
     integer :: iBlockFace                                                                                       
  end type FaceFluxTest                                                                                          
                                                                                                                 
  integer, parameter:: MinI = -1, MaxI = 4, MinJ = 1, MaxJ = 2, MinK = 1, MaxK = 2, MaxBlock = 2, nVar=2         
  real:: State_VGB(nVar,MinI:MaxI,MinJ:MaxJ,MinK:MaxK,MaxBlock)                                                  
  !$acc declare create(State_VGB)                                                                                
                                                                                                                 
contains                                                                                                         
                                                                                                                 
  subroutine sub_inner(iBlock)                                                                                   
    !$acc routine vector                                                                                         
    integer, intent(in)::iBlock                                                                                  
    type(FaceFluxTest) :: FFT                                                                                    
    integer:: i,j,k,iVar                                                                                         
                                                                                                                 
    !$acc loop vector collapse(3) private(FFT)                                                                   
    do k = MinK, MaxK; do j = MinJ, MaxJ; do i = MinI, MaxI                                                      
       FFT%iBlockFace = iBlock                                                                                   
                                                                                                                 
       State_VGB(1,i,j,k,iBlock) = FFT%iBlockFace                                                                
       ! More calculation here                                                                                   
    end do; end do; end do                                                                                       
                                                                                                                 
                                                                                                                 
  end subroutine sub_inner                                                                                       
  !----------------------------------------------------------------                                              
                                                                                                                 
  subroutine sub_main                                                                                            
    integer:: iBlock                                                                                             
    !----------------------------------------------------------------                                            
                                                                                                                 
    !$acc parallel loop gang present(state_vgb)                                                                  
    do iBlock = 1, MaxBlock                                                                                      
       call sub_inner(iBlock)                                                                                    
    end do                                                                                                       
                                                                                                                 
    !$acc update host(state_vgb)                                                                                 
                                                                                                                 
    write(*,*)'state_vgb = ', State_VGB                                                                          
                                                                                                                 
  end subroutine sub_main                                                                                        
                                                                                                                 
end module ModTest                                                                                               
!=========================================================                                                       
                                                                                                                 
program main                                                                                                     
  use ModTest                                                                                                    
                                                                                                                 
  call sub_main                                                                                                  
                                                                                                                 
end program main  

I compiled the code above with

nvfortran -acc -g -O0 -Mcuda=debug -Minfo=all file.f90

and ran it under cuda-gdb, which produces the following error:

CUDA Exception: Warp Illegal Address
The exception was triggered at PC 0x154c1d8 (bug_report_fail.f90:22)

Thread 1 "a.out" received signal CUDA_EXCEPTION_14, Warp Illegal Address.
[Switching focus to CUDA kernel 0, grid 1, block (0,0,0), thread (0,0,0), device 0, sm 0, warp 0, lane 0]
0x000000000154c5a8 in modtest_sub_inner_ () at bug_report_fail.f90:24
24	       State_VGB(1,i,j,k,iBlock) = FFT%iBlockFace

Why does it not work?

Thanks!

Looks like a code gen issue so I’ve added a problem report, TPR #29601, and sent it to engineering for further review.

Note that the compiler does appear to be implicitly privatizing FFT, so the workaround is to remove the “private(FFT)” clause.

Hi Mat,

I did some experiments after removing ‘private(FFT)’. Here is the code I used:

module ModTest                                                                                             
  implicit none                                                                                            
                                                                                                           
  type:: FaceFluxTest                                                                                      
     integer :: iBlockFace                                                                                 
  end type FaceFluxTest                                                                                    
                                                                                                           
  integer, parameter:: MinI = -1, MaxI = 4, MinJ = 1, MaxJ = 2, MinK = 1, MaxK = 2, MaxBlock = 2, nVar=2   
  real:: State_VGB(nVar,MinI:MaxI,MinJ:MaxJ,MinK:MaxK,MaxBlock)                                            
  !$acc declare create(State_VGB)                                                                          
                                                                                                           
contains                                                                                                   
                                                                                                           
  subroutine sub_inner(iBlock)                                                                             
    !$acc routine vector                                                                                   
    integer, intent(in)::iBlock                                                                            
    type(FaceFluxTest) :: FFT                                                                              
    integer:: i,j,k,iVar                                                                                   
                                                                                                           
    !$acc loop vector collapse(3)                                                                          
    do k = MinK, MaxK; do j = MinJ, MaxJ; do i = MinI, MaxI                                                
       FFT%iBlockFace = i                                                                                  
       State_VGB(1,i,j,k,iBlock) =  FFT%iBlockFace                                                                                               
    end do; end do; end do                                                                                 
                                                                                                           
                                                                                                           
  end subroutine sub_inner                                                                                 
  !----------------------------------------------------------------                                        
                                                                                                           
  subroutine sub_main                                                                                      
    integer:: iBlock                                                                                       
    !----------------------------------------------------------------                                      
                                                                                                           
    !$acc parallel loop gang present(state_vgb)                                                            
    do iBlock = 1, MaxBlock                                                                                
       call sub_inner(iBlock)                                                                              
    end do                                                                                                 
                                                                                                           
    !$acc update host(state_vgb)                                                                           
                                                                                                           
    write(*,*)'state_vgb = ', State_VGB                                                                    
                                                                                                           
  end subroutine sub_main                                                                                  
                                                                                                           
end module ModTest                                                                                         
!=========================================================                                                 
                                                                                                           
program main                                                                                               
  use ModTest                                                                                              
                                                                                                           
  call sub_main                                                                                            
                                                                                                           
end program main 

The results are wrong. It seems FFT is not private by default.

What results are you seeing? When I compare the output with and without OpenACC, the results are the same:

% nvfortran test1.f90 -acc -V21.1 ; a.out
 state_vgb =    -1.000000        0.000000        0.000000        0.000000
    1.000000        0.000000        2.000000        0.000000
    3.000000        0.000000        4.000000        0.000000
   -1.000000        0.000000        0.000000        0.000000
    1.000000        0.000000        2.000000        0.000000
    3.000000        0.000000        4.000000        0.000000
   -1.000000        0.000000        0.000000        0.000000
    1.000000        0.000000        2.000000        0.000000
    3.000000        0.000000        4.000000        0.000000
   -1.000000        0.000000        0.000000        0.000000
    1.000000        0.000000        2.000000        0.000000
    3.000000        0.000000        4.000000        0.000000
   -1.000000        0.000000        0.000000        0.000000
    1.000000        0.000000        2.000000        0.000000
    3.000000        0.000000        4.000000        0.000000
   -1.000000        0.000000        0.000000        0.000000
    1.000000        0.000000        2.000000        0.000000
    3.000000        0.000000        4.000000        0.000000
   -1.000000        0.000000        0.000000        0.000000
    1.000000        0.000000        2.000000        0.000000
    3.000000        0.000000        4.000000        0.000000
   -1.000000        0.000000        0.000000        0.000000
    1.000000        0.000000        2.000000        0.000000
    3.000000        0.000000        4.000000        0.000000
 % nvfortran test1.f90 -V21.1 ; a.out
 state_vgb =    -1.000000        0.000000        0.000000        0.000000
    1.000000        0.000000        2.000000        0.000000
    3.000000        0.000000        4.000000        0.000000
   -1.000000        0.000000        0.000000        0.000000
    1.000000        0.000000        2.000000        0.000000
    3.000000        0.000000        4.000000        0.000000
   -1.000000        0.000000        0.000000        0.000000
    1.000000        0.000000        2.000000        0.000000
    3.000000        0.000000        4.000000        0.000000
   -1.000000        0.000000        0.000000        0.000000
    1.000000        0.000000        2.000000        0.000000
    3.000000        0.000000        4.000000        0.000000
   -1.000000        0.000000        0.000000        0.000000
    1.000000        0.000000        2.000000        0.000000
    3.000000        0.000000        4.000000        0.000000
   -1.000000        0.000000        0.000000        0.000000
    1.000000        0.000000        2.000000        0.000000
    3.000000        0.000000        4.000000        0.000000
   -1.000000        0.000000        0.000000        0.000000
    1.000000        0.000000        2.000000        0.000000
    3.000000        0.000000        4.000000        0.000000
   -1.000000        0.000000        0.000000        0.000000
    1.000000        0.000000        2.000000        0.000000
    3.000000        0.000000        4.000000        0.000000

If it is compiled with ‘nvfortran -acc -g -O0 test.f90’, the output is:

 state_vgb =     4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000     
    4.000000        0.000000        4.000000        0.000000 

If I remove ‘-g’, it produces correct results. The nvfortran version is 20.9.

This is just a sample code. In general, I guess there is no guarantee that the local derived type FFT will be private without the ‘private(FFT)’ clause, right? Is there a simple way to check whether the compiler implicitly privatizes a variable? Thanks!

Unfortunately, no. I inspected the generated code and from that concluded that each thread is getting its own private copy of FFT. I’m not sure what’s going wrong when “-g” is applied.

Adding the private clause is the correct way to go, but as noted, there does appear to be a compiler issue when adding UDTs to a private clause in a vector routine.
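
Since there is no simple way to ask the compiler whether it privatized FFT, one option is to check the results instead. The sketch below reuses the sample code without the private clause and adds a hypothetical helper, count_mismatches (not part of the original code), that compares the first variable of State_VGB against the value a serial loop would write, namely the i index. It only tests the outcome for this particular build and flags; it does not prove how the compiler handled the variable.

```fortran
module ModCheckSketch
  implicit none

  type :: FaceFluxTest
     integer :: iBlockFace
  end type FaceFluxTest

  integer, parameter :: MinI = -1, MaxI = 4, MinJ = 1, MaxJ = 2, &
       MinK = 1, MaxK = 2, MaxBlock = 2, nVar = 2
  real :: State_VGB(nVar,MinI:MaxI,MinJ:MaxJ,MinK:MaxK,MaxBlock)
  !$acc declare create(State_VGB)

contains

  subroutine sub_inner(iBlock)
    !$acc routine vector
    integer, intent(in) :: iBlock
    type(FaceFluxTest) :: FFT
    integer :: i, j, k

    !$acc loop vector collapse(3)
    do k = MinK, MaxK; do j = MinJ, MaxJ; do i = MinI, MaxI
       FFT%iBlockFace = i
       State_VGB(1,i,j,k,iBlock) = FFT%iBlockFace
    end do; end do; end do
  end subroutine sub_inner

  ! Hypothetical helper: count entries of the first variable that differ
  ! from the value a serial loop would have written (the i index).
  integer function count_mismatches() result(nBad)
    integer :: i, j, k, iBlock

    nBad = 0
    do iBlock = 1, MaxBlock; do k = MinK, MaxK
       do j = MinJ, MaxJ; do i = MinI, MaxI
          if (State_VGB(1,i,j,k,iBlock) /= real(i)) nBad = nBad + 1
       end do; end do
    end do; end do
  end function count_mismatches

end module ModCheckSketch

program main
  use ModCheckSketch
  implicit none
  integer :: iBlock

  !$acc parallel loop gang present(State_VGB)
  do iBlock = 1, MaxBlock
     call sub_inner(iBlock)
  end do

  !$acc update host(State_VGB)

  ! Runs on the host after the update, so it sees the device results.
  write(*,*) 'mismatches = ', count_mismatches()
end program main
```

A count of zero means the results match the serial expectation for that build. A nonzero count, as the all-4.0 output above would give, suggests that threads shared a single copy of FFT. Running the check with each set of flags in use (for example with and without ‘-g’) shows whether the behavior depends on them.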