I am trying to compile the parallel version of MM5 using PGI’s Fortran and C compilers, version 7.0-7. There are some peculiar warning messages when compiling the C sources, such as the following:
The problem seems to be that the compiler does not recognize the include file rsl.h in the source file set_padarea.c, where all the variables are defined, and so assumes integer types for these variables. Is this a bug, or am I missing a compilation flag?
Thanks for your reply. I guess that the type of the function should be explicitly defined.
I am now trying to use the pgf compiler for the Fortran source files of MM5 and the gcc compiler for the C source files. I use the -g77libs flag, which allows object files created by gcc to be linked with object files created by pgf90.
It seems that the object files are created without any warnings, but later on (during the linking stage I think) I get some pointer warnings like the following:
drive.c: In function `yylook':
drive.c:1593: warning: cast from pointer to integer of different size
drive.c:1603: warning: cast from pointer to integer of different size
cc -c error.c
cc -c lexin.c
lexin.c:113: warning: initialization makes integer from pointer without a cast
Do you know if I need to use a different set of flags for compatibility between these two languages???
My compilation flags for pgf90 are:
FCFLAGS = -fast -Mcray=pointer -tp p7-64 -Mnoframe -DDEC_ALPHA -byteswapio -Mnosgimp -g77libs
and for gcc:
CPPFLAGS = -traditional -DMPI -Dlinux -DDEC_APLHA -DSYSTEM_CALL_OK
CFLAGS = -DMPI -I/opt/mpich/gnu/include -DDEC_ALPHA
It is helpful to know that you have compiled and run MM5 despite these warnings. However, I get a segmentation fault when I run the program, and I suspected that these compilation warnings were related to the problem.
Could it be a missing compilation flag in my case?
Here is my configure.user file for the pgf90 and pgcc compilers.
Please let me know if you see anything strange in it.
What is the OS? Is this a RedHat/Fedora system? There is a known stack size problem on those systems. I see that you tried with 7.0-7. Have you tried with our latest release?
Our cluster runs the Rocks 4.3 clustering suite, which comes with Centos 4.5. I guess you could say that Centos is a variant of RedHat.
I have installed ClusterCorp’s PGI Roll, which contains version 7.0-7 of the compilers. I am not sure whether it is straightforward to install version 7.1 on my system without an appropriate Rocks Roll. I will give it a try, though, and let you know.
I have changed the stack size and also installed the latest version of the compilers, 7.1, but without any luck. Can you please try to compile and run the parallel version of MM5 (mm5.mpp) to see if it works on your systems with the flags I gave you? It would be very helpful if you could reproduce my error.
I followed the steps you described and also set the stack size to unlimited.
The compilation seems to fail when I use pgf77 but it works with pgf90 and the program finally runs!!
The errors look like this:
/opt/mpich/pgi/bin/mpif77 -c -fast -Mcray=pointer -Mnoframe -DDEC_ALPHA -byteswapio diffintp.f 2> diffintp.lis
make[1]: [diffintp.o] Error 2 (ignored)
This is the first one that appears.
When I use mpif90 instead of mpif77, there is no error and I have successfully run the test case.
I have however several warnings in both cases, like the following:
/opt/mpich/pgi/bin/mpicc -c -I/opt/mpich/pgi/include -DMPI -DRSL_SYNCIO -Dlinux -DSWAPBYTES -O -DMPI2_SUPPORT -DDEC_ALPHA -DIMAX_MAKE= -DJMAX_MAKE= -DMAXDOM_MAKE=6 -DMAXPROC_MAKE=256 -DHOST_NODE=0 -DMON_LOW=1 -DALLOW_RSL_168PT=1 set_padarea.c
PGC-W-0156-Type not specified, ‘int’ assumed (set_padarea.c: 62)
PGC/x86-64 Linux 7.0-7: compilation completed with warnings
OR
/opt/mpich/pgi/bin/mpicc -c -I/opt/mpich/pgi/include -DMPI -DRSL_SYNCIO -Dlinux -DSWAPBYTES -O -DMPI2_SUPPORT -DDEC_ALPHA -DIMAX_MAKE= -DJMAX_MAKE= -DMAXDOM_MAKE=6 -DMAXPROC_MAKE=256 -DHOST_NODE=0 -DMON_LOW=1 -DALLOW_RSL_168PT=1 domain_def.c
PGC-W-0156-Type not specified, ‘int’ assumed (domain_def.c: 193)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #3 (domain_def.c: 334)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #3 (domain_def.c: 335)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #3 (domain_def.c: 336)
PGC-W-0156-Type not specified, ‘int’ assumed (domain_def.c: 476)
PGC-W-0156-Type not specified, ‘int’ assumed (domain_def.c: 579)
PGC-W-0156-Type not specified, ‘int’ assumed (domain_def.c: 612)
PGC-W-0156-Type not specified, ‘int’ assumed (domain_def.c: 799)
PGC-W-0156-Type not specified, ‘int’ assumed (domain_def.c: 856)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #3 (domain_def.c: 925)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #3 (domain_def.c: 1030)
PGC-W-0156-Type not specified, ‘int’ assumed (domain_def.c: 1101)
PGC-W-0156-Type not specified, ‘int’ assumed (domain_def.c: 1213)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #3 (domain_def.c: 1227)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #3 (domain_def.c: 1228)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #3 (domain_def.c: 1229)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #3 (domain_def.c: 1230)
PGC-W-0156-Type not specified, ‘int’ assumed (domain_def.c: 1527)
PGC/x86-64 Linux 7.0-7: compilation completed with warnings
As we discussed earlier, these warnings appear on 64-bit machines and they may cause segmentation faults. In fact, I get a segmentation fault when I run my weather scenario (which has always worked on a serial 32-bit machine with pgf90), even though the test case runs on my machine.
I am now considering switching to the 32-bit version of pgf90 to get rid of these warnings and possibly the segmentation faults.
Do you get such warnings when compiling the test case on your cluster??
Yes, I get those warnings, but I doubt they are the reason your program gets segmentation faults. I looked at the code: the warnings are legitimate, but the code looks OK to me. The warnings are due to the following:
Functions are defined with no return type, which is why you get "‘int’ assumed".
Function prototypes are declared with no arguments (rsl_malloc), but the calls pass arguments. I looked at the function definitions, which take 3 arguments, so the calls match.
Long values are passed to nonprototyped functions, which gives the third kind of warning.
Perhaps the problem is the input, for example if the problem size is too large? I am not sure where the problem is; perhaps the MM5 folks know something. You might get lucky with 32-bit, but 64-bit could have exposed an existing problem.
Please always try our latest release, since an older release could have bugs. I do not see any bug in MM5 in 7.0, however.
We have installed ClusterCorp’s PGI Roll with the latest compilers, 7.1.3 x86-64. Finally, all our MM5 simulations run without any problems. I don’t know exactly what the problem was with version 7.0-7.