Warnings when compiling parallel MM5 with pgcc v7

Dear all,

I am trying to compile the parallel version of MM5 using PGI’s Fortran and C compilers, version 7.0-7. I get some peculiar warning messages when compiling the C sources, such as the following:

/opt/mpich/pgi/bin/mpicc -c -I/opt/mpich/pgi/include -DMPI -DRSL_SYNCIO -Dlinux -DSWAPBYTES -O -DMPI2_SUPPORT -Mpreprocess -DIMAX_MAKE= -DJMAX_MAKE= -DMAXDOM_MAKE=6 -DMAXPROC_MAKE=256 -DHOST_NODE=0 -DMON_LOW=1 -DALLOW_RSL_168PT=1 set_padarea.c
PGC-W-0156-Type not specified, ‘int’ assumed (set_padarea.c: 62)
PGC/x86-64 Linux 7.0-7: compilation completed with warnings

The problem seems to be that the compiler does not recognize the include file rsl.h in the source file set_padarea.c, where all the variables are defined, and so assumes integer types for these variables. Is this a bug, or am I missing a compilation flag?

regards
Spyros

Hi Spyros,

Actually the compiler complains about this line:
RSL_SET_PADAREA(…)

where there is no type defined for RSL_SET_PADAREA. The line number is off by 1, sorry for the confusion.
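
As an illustration only (a minimal sketch, not taken from the RSL sources): in old-style C, a function defined without a return type is silently treated as returning int, which is what PGC-W-0156 reports. The function name set_padarea_demo and its body are hypothetical; the point is only that spelling out the type removes the warning.

```c
/* Pre-C99 form that triggers "Type not specified, 'int' assumed":
 *
 *     set_padarea_demo(d, pad)
 *         int *d, *pad;
 *     { ... }
 *
 * The same function with the return type and parameter types spelled out.
 * The name and body are hypothetical; they are not taken from RSL. */
void set_padarea_demo(int *d, int *pad)
{
    *d = *pad;   /* copy the pad width into the caller's slot */
}
```

Calling it with two int variables, for example set_padarea_demo(&d, &pad), copies the value of pad into d.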

Hongyon

Thanks for your reply. I guess that the type of the function should be explicitly defined.

I am now trying to use the PGI Fortran compiler (pgf90) for the Fortran source files of MM5 and the gcc compiler for the C source files. I use the -g77libs flag, which allows object files created by gcc to be linked with object files created by pgf90.
It seems that the object files are created without any warnings, but later on (during the linking stage, I think) I get some pointer warnings like the following:

drive.c: In function `yylook’:
drive.c:1593: warning: cast from pointer to integer of different size
drive.c:1603: warning: cast from pointer to integer of different size
cc -c error.c
cc -c lexin.c
lexin.c:113: warning: initialization makes integer from pointer without a cast

Do you know if I need to use a different set of flags for compatibility between these two compilers?
My compilation flags for pgf90 are:
FCFLAGS = -fast -Mcray=pointer -tp p7-64 -Mnoframe -DDEC_ALPHA -byteswapio -Mnosgimp -g77libs
and for gcc:
CPPFLAGS = -traditional -DMPI -Dlinux -DDEC_APLHA -DSYSTEM_CALL_OK
CFLAGS = -DMPI -I/opt/mpich/gnu/include -DDEC_ALPHA


Thanks
Spyros

Hi Spyros,

The flags look correct. The warnings appear on 64-bit systems because the application casts a pointer (8 bytes) to an integer (4 bytes).
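
To make the size mismatch concrete, here is a small sketch with hypothetical helper names (nothing here comes from the MM5 sources). It assumes a platform that provides intptr_t in <stdint.h>, which is the integer type intended to hold a pointer value without loss.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not from the MM5 sources. */

/* Round-trips a pointer through intptr_t and reports whether it survived.
 * intptr_t is defined so that this conversion preserves the pointer. */
int roundtrip_ok(void *p)
{
    intptr_t as_int = (intptr_t)p;
    return (void *)as_int == p;
}

/* Number of bytes that would not fit if a pointer were stored in an int.
 * Typically 0 on 32-bit systems and 4 on x86-64 Linux. */
int bytes_lost_in_int(void)
{
    if (sizeof(void *) > sizeof(int))
        return (int)(sizeof(void *) - sizeof(int));
    return 0;
}
```

Whether the truncated value is ever used as an address again decides if the cast is harmless, so the warning alone does not prove a bug.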

I have compiled and run MM5 with those warnings and without any errors. If you want to be sure that it is safe, you can ask the MM5 authors.

Hongyon

Thanks Hongyon,

It is helpful to know that you have compiled and run MM5 despite these warnings. However, I get a segmentation fault when I run the program, and I suspected that these compilation warnings were related to the problem.
Could it be a missing compilation flag in my case?
Here is my configure.user file for the pgf90 and pgcc compilers.
Please let me know if you see anything strange in it.

RUNTIME_SYSTEM = “linux”
MPP_TARGET=$(RUNTIME_SYSTEM)
LINUX_MPIHOME=/opt/mpich/pgi
MFC = $(LINUX_MPIHOME)/bin/mpif90
MCC = $(LINUX_MPIHOME)/bin/mpicc
MLD = $(LINUX_MPIHOME)/bin/mpif90
FCFLAGS = -fast -Mcray=pointer -tp p7-64 -Mnoframe -DDEC_ALPHA -byteswapio -Mnosgimp
LDOPTIONS = -fast -Mcray=pointer -tp p7-64 -Mnoframe -DDEC_ALPHA -byteswapio -Mnosgimp
LOCAL_LIBRARIES = -L$(LINUX_MPIHOME)/lib -lfmpich -lmpich
MAKE = make -i -r
AWK = awk
SED = sed
CAT = cat
CUT = cut
EXPAND = expand
M4 = m4
CPP = /lib/cpp -C -P -traditional
CPPFLAGS = -traditional -DMPI -Dlinux -DDEC_APLHA -DSYSTEM_CALL_OK
CFLAGS = -DMPI -I$(LINUX_MPIHOME)/include -DDEC_ALPHA

Thanks
Spyros

Hi Spyros,

What is the OS? Is this a RedHat/Fedora system? There is a known stack size problem on those systems. I see that you tried with 7.0-7. Have you tried our latest release?

I will try and let you know the result.


Hongyon

Our cluster runs the Rocks 4.3 clustering suite, which comes with CentOS 4.5. I guess you can say that CentOS is a variant of RedHat.
I have installed ClusterCorp’s PGI Roll, which contains version 7.0-7 of the compilers. I am not sure whether it is straightforward to install version 7.1 on my system without an appropriate Rocks Roll. I will give it a try, though, and let you know.

Hi,

Try setting the stack size to unlimited first. It might just fix your problem. For an MPI program, you might want to put the setting in your shell startup file (.cshrc/.bashrc).
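
In bash the usual command is ulimit -s unlimited. To check what limit a process actually sees (useful on compute nodes, where the startup file may not have been read), a minimal sketch using the POSIX getrlimit call could look like this. The function name print_stack_limit is made up for the example.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Prints one limit value, handling the "unlimited" marker. */
static void print_one(const char *label, rlim_t value)
{
    if (value == RLIM_INFINITY)
        printf("%s stack limit: unlimited\n", label);
    else
        printf("%s stack limit: %llu bytes\n", label,
               (unsigned long long)value);
}

/* Prints the soft and hard stack limits of the calling process.
 * Returns 0 on success, -1 if getrlimit fails. */
int print_stack_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0) {
        perror("getrlimit");
        return -1;
    }
    print_one("soft", rl.rlim_cur);
    print_one("hard", rl.rlim_max);
    return 0;
}
```

Calling print_stack_limit() from a small test program started through mpirun shows the limits that each rank inherits.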

Hongyon

Hi Hongyon,

I have changed the stack size and also installed the latest version of the compilers, 7.1, but without any luck. Can you please try to compile and run the parallel version, mm5.mpp, to see if it works on your systems with the flags I gave you? It would be very helpful if you could reproduce my error.

Thanks
Spyros

Hi Spyros,

I tried it on the ROCKS 4.3 (CentOS 4.5) cluster we have here. It runs just fine through to the end.

Here are the steps I followed:

  1. untar MM5.TAR.gz
  2. untar MPP.TAR.gz in MM5 directory
  3. modify configure.user (see below)
  4. type: make clean; make mpp_uninstall (to make sure everything is clean)
  5. type: make mpp
  6. type: make mm5.deck
  7. type: ./mm5.deck
  8. cd Run
  9. Copy the input files into the Run directory:
    BDYOUT_DOMAIN1
    LOWBDY_DOMAIN1
    MMINPUT_DOMAIN2
    TERRAIN_DOMAIN2
    MMINPUT_DOMAIN1
  10. type: mpirun -np 2 mm5.mpp to start a run

Configuration:
The cluster I have is AMD, so I removed -tp p7-64:

RUNTIME_SYSTEM = “linux”
MPP_TARGET=$(RUNTIME_SYSTEM)

edit the following definition for your system

LINUX_MPIHOME = /opt/mpich/pgi
MFC = $(LINUX_MPIHOME)/bin/mpif77
MCC = $(LINUX_MPIHOME)/bin/mpicc
MLD = $(LINUX_MPIHOME)/bin/mpif77
FCFLAGS = -fast -Mcray=pointer -Mnoframe -byteswapio -DDEC_ALPHA
LDOPTIONS = -fast -Mcray=pointer -Mnoframe -byteswapio -DDEC_ALPHA
LOCAL_LIBRARIES = -L$(LINUX_MPIHOME)/lib -lfmpich -lmpich
MAKE = make -i -r
AWK = awk
SED = sed
CAT = cat
CUT = cut
EXPAND = expand
M4 = m4
CPP = /lib/cpp -C -P -traditional
CPPFLAGS = -DMPI -Dlinux -DSYSTEM_CALL_OK -DDEC_ALPHA
CFLAGS = -DMPI -I$(LINUX_MPIHOME)/include -DDEC_ALPHA
ARCH_OBJS = milliclock.o
IWORDSIZE = 4
RWORDSIZE = 4
LWORDSIZE = 4

Hongyon

Dear Hongyon,

I followed the steps you described and also set the stack size to unlimited.
The compilation seems to fail when I use pgf77 but it works with pgf90 and the program finally runs!!

Thank you very much
Spyros

Hi Spyros,

Can you please tell us which version of pgf77 you are using? I am trying to replicate the error. What kind of error messages did you get?

Hongyon

Hi Hongyon,

I am using version 7.0-7 64bit and when I use mpif77 in the configure.user file (instead of mpif90), the compilation fails:

LINUX_MPIHOME=/opt/mpich/pgi
MFC = $(LINUX_MPIHOME)/bin/mpif77
MCC = $(LINUX_MPIHOME)/bin/mpicc
MLD = $(LINUX_MPIHOME)/bin/mpif77
FCFLAGS = -fast -Mcray=pointer -Mnoframe -DDEC_ALPHA -byteswapio
LDOPTIONS = -fast -Mcray=pointer -Mnoframe -DDEC_ALPHA -byteswapio
LOCAL_LIBRARIES = -L$(LINUX_MPIHOME)/lib -lfmpich -lmpich

CPP = /lib/cpp -C -P -traditional
CPPFLAGS = -DMPI -Dlinux -DSYSTEM_CALL_OK -DDEC_ALPHA
CFLAGS = -DMPI -I$(LINUX_MPIHOME)/include -DDEC_ALPHA

The errors look like this:
/opt/mpich/pgi/bin/mpif77 -c -fast -Mcray=pointer -Mnoframe -DDEC_ALPHA -byteswapio diffintp.f 2> diffintp.lis
make[1]: [diffintp.o] Error 2 (ignored)
This is the first one that appears.

When I use mpif90 instead of mpif77, there is no error and I have successfully run the test case.
I have however several warnings in both cases, like the following:
/opt/mpich/pgi/bin/mpicc -c -I/opt/mpich/pgi/include -DMPI -DRSL_SYNCIO -Dlinux -DSWAPBYTES -O -DMPI2_SUPPORT -DDEC_ALPHA -DIMAX_MAKE= -DJMAX_MAKE= -DMAXDOM_MAKE=6 -DMAXPROC_MAKE=256 -DHOST_NODE=0 -DMON_LOW=1 -DALLOW_RSL_168PT=1 set_padarea.c
PGC-W-0156-Type not specified, ‘int’ assumed (set_padarea.c: 62)
PGC/x86-64 Linux 7.0-7: compilation completed with warnings

OR

/opt/mpich/pgi/bin/mpicc -c -I/opt/mpich/pgi/include -DMPI -DRSL_SYNCIO -Dlinux -DSWAPBYTES -O -DMPI2_SUPPORT -DDEC_ALPHA -DIMAX_MAKE= -DJMAX_MAKE= -DMAXDOM_MAKE=6 -DMAXPROC_MAKE=256 -DHOST_NODE=0 -DMON_LOW=1 -DALLOW_RSL_168PT=1 domain_def.c
PGC-W-0156-Type not specified, ‘int’ assumed (domain_def.c: 193)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #3 (domain_def.c: 334)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #3 (domain_def.c: 335)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #3 (domain_def.c: 336)
PGC-W-0156-Type not specified, ‘int’ assumed (domain_def.c: 476)
PGC-W-0156-Type not specified, ‘int’ assumed (domain_def.c: 579)
PGC-W-0156-Type not specified, ‘int’ assumed (domain_def.c: 612)
PGC-W-0156-Type not specified, ‘int’ assumed (domain_def.c: 799)
PGC-W-0156-Type not specified, ‘int’ assumed (domain_def.c: 856)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #3 (domain_def.c: 925)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #3 (domain_def.c: 1030)
PGC-W-0156-Type not specified, ‘int’ assumed (domain_def.c: 1101)
PGC-W-0156-Type not specified, ‘int’ assumed (domain_def.c: 1213)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #3 (domain_def.c: 1227)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #3 (domain_def.c: 1228)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #3 (domain_def.c: 1229)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #3 (domain_def.c: 1230)
PGC-W-0156-Type not specified, ‘int’ assumed (domain_def.c: 1527)
PGC/x86-64 Linux 7.0-7: compilation completed with warnings


As we discussed earlier, these warnings appear on 64-bit machines and they may cause segmentation faults. In fact, I get a segmentation fault when I run my weather scenario (which has always worked on a serial 32-bit machine with pgf90), even though the test case runs on my machine.
I am now considering switching to the 32-bit version of pgf90 to get rid of these warnings and possibly the segmentation faults.
Do you get such warnings when compiling the test case on your cluster?


Spyros

Hi Spyros,

Yes, I get those warnings. I doubt they are the reason your program gets segmentation faults. I looked at the code: the warnings are legitimate, but the code looks OK to me. The warnings are due to:

  • functions defined with no type, which is why you get "‘int’ assumed".
  • function prototypes declared with no arguments (rsl_malloc) while the calls pass arguments. I looked at the function definitions, which take 3 arguments, so they match.
  • a long value being passed to a nonprototyped function.
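
A short sketch of the prototype case, using a hypothetical demo_malloc (its signature is an assumption for the example and is not the real rsl_malloc): once a full prototype is visible, the compiler knows the parameter types and converts each argument at the call site, so a long passed as the third argument no longer draws PGC-W-0155.

```c
#include <stdlib.h>

/* With an empty declaration such as
 *
 *     void *demo_malloc();
 *
 * the compiler cannot check or convert the arguments, so a long is passed
 * through unchanged. The full prototype below fixes the types.
 * demo_malloc is hypothetical; it is not the RSL allocator. */
void *demo_malloc(const char *file, int line, int nbytes);

void *demo_malloc(const char *file, int line, int nbytes)
{
    (void)file;   /* kept only to mirror a three-argument call */
    (void)line;
    return nbytes > 0 ? malloc((size_t)nbytes) : NULL;
}
```

A call such as demo_malloc(__FILE__, __LINE__, 16L) then converts 16L to int before the call, which is well defined as long as the value fits in an int.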

Perhaps the problem is the input? Is the problem size too large? I am not sure where the problem is; perhaps the MM5 folks know something. You might get lucky with 32-bit, but 64-bit could have exposed the problem.

Please always try our latest release. An older release could have bugs. I am not aware of any bug affecting MM5 in 7.0, however.

Hongyon

Hi Hongyon,

We have installed ClusterCorp’s PGI Roll with the latest compilers, 7.1.3 x86-64. Finally, all our MM5 simulations run without any problem. I don’t know what exactly the problem was with version 7.0-7.

Thank you for your help
Spyros

Hi Spyros,

Thanks for letting us know. Perhaps there was some bug that didn’t show in my test data and it got fixed in 7.1.

Thank you,
Hongyon