Hi all,
I have a strange error and am starting to wonder whether it might be related to a compiler issue. I run a rather complex code (a weather model) and get a segmentation fault upon calling a routine quite some time into the execution. The routine is called organize_output. I’ve reduced it as far as I can while still reproducing the error, and it now looks like this…
SUBROUTINE organize_output
  REAL (KIND=irealgrib) :: zprocarray_grib(ie_max,je_max,num_compute)
  REAL (KIND=ireals) :: zvarlev(ie,je,0:MAX(ke+1,nlevels)), &
    zprocarray_real(ie_max,je_max,num_compute), slev(0:MAX(ke+1,nlevels))
  REAL (KIND=ireals) :: zenith_t (ie,je), zenith_w (ie,je), zenith_h (ie,je), &
    zcape_mu (ie,je), zcin_mu (ie,je), zcape_ml (ie,je), zcin_ml (ie,je), &
    zcape_3km(ie,je), zlcl_ml (ie,je), zlfc_ml (ie,je), zbrn (ie,je,ke)
  print *,'*** beginning of subroutine organize_output'
  print *,zbrn(1,1,1)
  print *,'gugu'
  print *,zprocarray_grib(1,1,1)
  print *,zvarlev(1,1,0)
  print *,zprocarray_real(1,1,1)
  print *,slev(0)
  print *,zenith_t(1,1)
  print *,zenith_w(1,1)
  print *,zenith_h(1,1)
  print *,zcape_mu(1,1)
  print *,zcin_mu(1,1)
  print *,zcape_ml(1,1)
  print *,zcin_ml(1,1)
  print *,zcape_3km(1,1)
  print *,zlcl_ml(1,1)
  print *,zlfc_ml(1,1)
  print *,'*** end of subroutine organize_output'
END SUBROUTINE organize_output
Upon execution the output is as follows…
*** before_call_to_organize_output
num_compute= 1
nlevels= 40
ie,je= 41 51
ie_max,je_max= 41 51
nzmxid= 130
*** calling
*** beginning of subroutine organize_output
Segmentation fault (core dumped)
Sometimes, depending on exactly which lines remain in the subroutine, the error is instead…
*** before_call_to_organize_output
num_compute= 1
nlevels= 40
ie,je= 41 51
ie_max,je_max= 41 51
nzmxid= 130
*** calling
0: ALLOCATE: 18446744071899487520 bytes requested; not enough memory
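In case it helps with the diagnosis: the byte count in the ALLOCATE message above looks to me like a negative number printed as an unsigned 64-bit value. Here is a minimal Python sketch of that arithmetic. The helper name as_signed_64 and the assumption that ireals is an 8-byte real are mine and are not taken from the model source.

```python
# Byte count copied verbatim from the ALLOCATE message above.
requested = 18446744071899487520


def as_signed_64(value):
    """Reinterpret an unsigned 64-bit value as a two's-complement signed one."""
    return value - 2**64 if value >= 2**63 else value


signed = as_signed_64(requested)
print("requested bytes as signed 64-bit:", signed)

# For comparison: the size I would expect for zprocarray_real, using the
# dimensions printed before the call (ie_max=41, je_max=51, num_compute=1).
# ASSUMPTION: ireals is an 8-byte real; this is not confirmed by the listing.
ie_max, je_max, num_compute = 41, 51, 1
bytes_per_real = 8
expected = ie_max * je_max * num_compute * bytes_per_real
print("expected bytes for zprocarray_real:", expected)
```

If that reading is right, the request is for -1810064096 bytes, where the arrays whose dimensions I printed should only need a few kilobytes each. That might point to an array bound holding garbage at the moment the automatic arrays are created rather than to a genuine shortage of memory, but I am not sure of this.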
The code segfaults on the first access to the zbrn array. I’ve tried “unlimit; setenv MPSTKZ 40000000” with no effect. The behaviour is VERY sensitive to what remains in the routine: if I remove a single line (either a declaration or a print statement), the code may run through without any error.
My compilation options are…
pgf90 -c -I. -I/nfs/xt3-homes/users/olifu/src/lm_4.7_dwd/src -I/opt/xt-mpt/default/mpich2-64/P2/include -I/apps/netcdf/linux/include -Mfree -Mpreprocess -Kieee -Mbyteswapio -O0 -C -g -gopt -Mbounds -Mchkfpstk -Ktrap=fp -o src_output.o /nfs/xt3-homes/users/olifu/src/lm_4.7_dwd/src/src_output.f90
My machine is a Cray XT-4 and I am running on the service nodes for debugging purposes…
uname -a
Linux buin2 2.6.5-7.283-ss #4 SMP Fri Sep 28 13:24:48 PDT 2007 x86_64 x86_64 x86_64 GNU/Linux
The version of pgf90 I use is…
pgf90 -V
pgf90 7.2-4 64-bit target on x86-64 Linux -tp k8-64e
Can anyone give me an idea of what might cause this type of behaviour?
I would be very grateful for any suggestions,
Oliver