My group is working on porting a CFD code to GPU with OpenACC. As a first step, we wanted to compile and run the code on CPU, but issues came up.
On a Linux x86 system, the code fails with a segmentation fault when built with nvfortran -g -Ktrap=fp -r8 -O0. In some tests, it reports the line number where it hits an array index out-of-bounds error, but the reported array bounds make no sense to me:
0: Subscript out of range for array mhdflux_v (FaceFlux.f90: 2983) subscript=2, lower bound=1640695287988, upper bound=1640695287990, dimension=1
In some other tests, it does not show the line number, just messages like
[ny01:09070] *** Process received signal ***
[ny01:09070] Signal: Segmentation fault (11)
[ny01:09070] Signal code: (128)
[ny01:09070] Failing at address: (nil)
The code that is causing this issue is related to the usage of derived types in Fortran. We have a large derived type declared in one module and used in another module, with a mixture of scalars and vectors that looks like
type, public :: Param
   integer :: iLeft, jLeft, kLeft
   integer :: iRight, jRight, kRight
   integer :: iBlockFace
   integer :: iDimFace
   integer :: iFluidMin = 1, iFluidMax = nFluid
   integer :: iVarMin = 1, iVarMax = nVar
   integer :: iEnergyMin = nVar+1, iEnergyMax = nVar + nFluid
   integer :: iFace, jFace, kFace
   real :: CmaxDt
   real :: Area2, AreaX, AreaY, AreaZ, Area = 0.0
   real :: DeltaBnL, DeltaBnR
   real :: DiffBb ! (1/4)(BnL-BnR)^2
   real :: StateLeft_V(nVar)
   real :: StateRight_V(nVar)
   real :: FluxLeft_V(nVar+nFluid), FluxRight_V(nVar+nFluid)
   real :: Normal_D(3), NormalX, NormalY, NormalZ
   real :: Tangent1_D(3), Tangent2_D(3)
   real :: B0n, B0t1, B0t2
   real :: UnL, Ut1L, Ut2L, B1nL, B1t1L, B1t2L
   real :: UnR, Ut1R, Ut2R, B1nR, B1t1R, B1t2R
   real :: MhdFlux_V( RhoUx_:RhoUz_)
   real :: MhdFluxLeft_V( RhoUx_:RhoUz_)
   real :: MhdFluxRight_V(RhoUx_:RhoUz_)
   real :: Enormal
   real :: Unormal_I(nFluid+1) = 0.0
   real :: UnLeft_I(nFluid+1)
   real :: UnRight_I(nFluid+1)
   real :: EtaJx, EtaJy, EtaJz, Eta
   real :: InvDxyz, HallCoeff
   real :: HallJx, HallJy, HallJz
   logical :: UseHallGradPe = .false.
   real :: BiermannCoeff, GradXPeNe, GradYPeNe, GradZPeNe
   real :: DiffCoef, EradFlux=0.0, RadDiffCoef
   real :: HeatFlux, IonHeatFlux, HeatCondCoefNormal
   real :: bCrossArea_D(3) = 0.0
   real :: B0x=0.0, B0y=0.0, B0z=0.0
   real :: ViscoCoeff
   logical :: IsBoundary
   real :: InvClightFace, InvClight2Face
   logical :: DoTestCell = .false.
   logical :: IsNewBlockVisco = .true.
   logical :: IsNewBlockGradPe = .true.
   logical :: IsNewBlockCurrent = .true.
   logical :: IsNewBlockHeatCond = .true.
   logical :: IsNewBlockIonHeatCond = .true.
   logical :: IsNewBlockRadDiffusion = .true.
   logical :: IsNewBlockAlfven = .true.
end type Param
An object of this derived type is passed between several subroutines to set the parameters and intermediate values.
One of the arrays with declared range 2:4 in this derived type caused the issue. I have tried several different approaches to resolve this issue:
- turn off OpenMP
- use local array (copy) instead of pointer to the vectors
- direct call with p%MhdFlux_V, etc., without using the
- change vector range from
- move the vectors into a separate type declaration
However, none of these worked. An older version of this module that does not use derived types compiles and runs without issue, which suggests that the problem is tied to the use of the derived type.
With -O2 or above, the code does not generate a runtime error, but the results are wrong. We have confirmed that the same code has no issue with gfortran, nagfor and ifort. We have also run valgrind with gcc, and it showed no memory issues.
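
A reduced test along the following lines might help isolate the problem. It is only a sketch: the module name ModParam, the routines set_flux and check_flux, and the values RhoUx_ = 2 and RhoUz_ = 4 are assumptions for illustration, not taken from the actual code. It declares a small derived type with one array component whose lower bound is 2, passes an object of that type between two subroutines, and checks that the bounds and values survive.

```fortran
! Hypothetical reduced test case; names and index values are illustrative only.
module ModParam
  implicit none
  private
  public :: Param, set_flux, check_flux

  ! Stand-ins for the named indices in the real code (assumed values).
  integer, parameter, public :: RhoUx_ = 2, RhoUz_ = 4

  type :: Param
     integer :: iFace = 0
     real    :: Area = 0.0
     real    :: MhdFlux_V(RhoUx_:RhoUz_)   ! array component with lower bound 2
  end type Param

contains

  ! Fill the component using its declared index range.
  subroutine set_flux(p)
    type(Param), intent(inout) :: p
    integer :: i
    do i = RhoUx_, RhoUz_
       p%MhdFlux_V(i) = real(i)
    end do
  end subroutine set_flux

  ! Verify that the bounds and values are intact in another routine.
  subroutine check_flux(p, IsOk)
    type(Param), intent(in)  :: p
    logical,     intent(out) :: IsOk
    IsOk = lbound(p%MhdFlux_V, 1) == RhoUx_ .and. &
           ubound(p%MhdFlux_V, 1) == RhoUz_ .and. &
           all(p%MhdFlux_V == [2.0, 3.0, 4.0])
  end subroutine check_flux

end module ModParam

program test_param
  use ModParam
  implicit none
  type(Param) :: p
  logical :: IsOk

  call set_flux(p)
  call check_flux(p, IsOk)
  print *, 'bounds:', lbound(p%MhdFlux_V, 1), ubound(p%MhdFlux_V, 1)
  if (.not. IsOk) error stop 'MhdFlux_V bounds or values are wrong'
  print *, 'ok'
end program test_param
```

Building this with the same nvfortran flags as above and comparing against gfortran would show whether the bad bounds can be reproduced outside the full code. If it runs cleanly, adding back components and call patterns from the real module one at a time may narrow down the trigger.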