MM5 segmentation fault with -DDEC_ALPHA

Hi!

We work on a SuSE 9.2 AMD 64 Kernel 2.6.8-24.19-smp #1 SMP Tue Nov 29 14:32:45 UTC 2005 system with 2 AMD 64 processors and want to use both.
We followed the instructions on the webpage http://www.pgroup.com/resources/mm5/mm537_pgi60_64.htm
and added -DDEC_ALPHA, but we get segmentation faults.

If I don’t add -DDEC_ALPHA, MM5 runs and gives me 3 output files (MMOUT_DOMAIN1_00, MMOUT_DOMAIN1_01, MMOUT_DOMAIN1_02), but according to top it seems to run on only one processor.

What do we have to add to the configure file to run MM5 on both processors?

Thanks Irene

Hi Irene,

For the seg fault, my guess is that you were using the 32-bit compilers or have the flag “-tp p6” included in your compilation. The “-tp” switch changes the target processor, and p6 targets a Pentium III. Double check which compiler you’re using (“which pgf90”) and whether the tp flag is there.

For the single process problem, please make sure “-mp -Mnosgimp” is used in your compilation and that the environment variable “NCPUS” is set to 2.
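For reference, both checks could be scripted roughly as follows. This is only a sketch: it assumes a Bourne-style shell and that the MM5 settings live in a file named configure.user (adjust if yours differs), and the helper name check_mm5_flags is made up for illustration.

```shell
# Hypothetical helper: list config lines that carry the 32-bit-only flags
# (-tp p6, -pc 32) and report whether the OpenMP flag -mp appears at all.
check_mm5_flags() {
    cfg="${1:-configure.user}"   # assumed file name
    grep -n -E -e '-tp +p6|-pc +32' "$cfg" && echo "32-bit flags found in $cfg"
    grep -q -e '-mp' "$cfg" || echo "-mp missing in $cfg"
}

# Which pgf90 comes first on the PATH?
command -v pgf90 || echo "pgf90 not found on PATH"

# Ask for two threads (csh/tcsh equivalent: setenv NCPUS 2)
NCPUS=2
export NCPUS
```

Running check_mm5_flags configure.user before rebuilding shows at a glance whether a leftover “-tp p6” is still in the flags.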

Hope this helps,
Mat

Hi Mat!

Thanks for the help. We removed “-tp p6” and changed “-pc 32” to “-pc 64”, and it works now.

Cheers,
Irene

Hi Irene,

I’m glad I could help. On a side note, the “-pc” flag adjusts the precision control of the x87 FPU, which is not used in 64-bit mode. It’s OK to leave it on the compile line, but the driver will silently ignore it.

- Mat

I tried your suggestions but it doesn’t work. We have:
MM5V3.7 on 2 CPUs on a 2 x 2.4 GHz dual Opteron, SuSE 9.2
amd64 Kernel 2.6.8-24.19-smp #1 SMP Tue Nov 29 14:32:45 UTC 2005 and 2 GB
RAM (4 x 512 MB 400 MHz DDR ECC DIMM) with the pgf90 6.0-2 64-bit target on
x86-64 Linux compiler.
It works fine on 1 CPU with the following options:
FC = pgf90
FCFLAGS = -I$(LIBINCLUDE) -O2 -Mcray=pointer -tp p6 -pc 32 -Mnoframe -byteswapio -mp -Mnosgimp
CPP = /lib/cpp
CFLAGS = -O
CPPFLAGS = -I$(LIBINCLUDE)
LDOPTIONS = -O2 -Mcray=pointer -tp p6 -pc 32 -Mnoframe -byteswapio -mp
LOCAL_LIBRARIES =
MAKE = make -i -r

But if I try to run it on 2 CPUs, with -DDEC_ALPHA, with setenv NCPUS 2, or
with both, plus several other changes (-tp amd64 -pc 64), I get
segmentation faults.
I have tried a lot and I’m running out of ideas now.

Did you remember to set your stack size to ‘unlimited’?

Here are our settings:
cputime unlimited
filesize unlimited
datasize unlimited
stacksize unlimited
coredumpsize 0 kbytes
memoryuse unlimited
vmemoryuse unlimited
descriptors 1024
memorylocked 32 kbytes
maxproc 16383

cheers Irene

Hi Irene,

Try starting from the beginning in a clean directory (making sure no old objects are still around) using the following config file for 64-bit: http://www.pgroup.com/resources/mm5/configure_60.linux64.

Next set your environment variable NCPUS to 2, set your stack size to unlimited, and then run the simulation. If it still seg faults, try setting the stack size to a very large value like 500MB (‘unlimited’ actually has a limit) and run again.

If that doesn’t work, the only other reason I can think of is that your problem size is simply too large. If that’s the case, then try running in 32-bit mode, since less stack space is used. For your system, replace “-tp p6” with “-tp k8-32”.
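In shell terms, the run sequence above could look roughly like this. It is a sketch under assumptions: a Bourne-style shell, the executable at ./mm5.exe in the run directory, and a made-up wrapper name run_mm5_smp; 512000 (in KB) is just one way of writing "about 500 MB".

```shell
# Hypothetical wrapper: set the thread count and stack limit, then start MM5.
# Usage: run_mm5_smp [stack_limit_kb|unlimited] [path_to_executable]
run_mm5_smp() {
    stack="${1:-unlimited}"
    exe="${2:-./mm5.exe}"        # assumed executable path

    # Two threads (csh/tcsh equivalent: setenv NCPUS 2)
    NCPUS=2
    export NCPUS

    # Stack limit for this shell and its children
    # (csh/tcsh equivalent: limit stacksize unlimited, or limit stacksize 512000)
    ulimit -s "$stack" 2>/dev/null \
        || echo "warning: could not set stack limit to $stack" >&2

    echo "NCPUS=$NCPUS stacksize=$(ulimit -s)"
    "$exe"
}
```

A first attempt would be run_mm5_smp unlimited; if that still seg faults, run_mm5_smp 512000 tries the explicit large value instead.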

- Mat

We used a debugger now and got the following message:

(gdb) run
Starting program: /delta/user/irene/MM5/Run/mm5.exe
[Thread debugging using libthread_db enabled]
[New Thread 1433387136 (LWP 10757)]
*************** MULTI LEVEL RUN!!! ***************
*************** 2 DOMAIN TOTAL ***************

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1433387136 (LWP 10757)]
0x08065270 in param (iexec=(1, 1)) at param.F:622
622 DX=BHR(9,1) PARAM.571
Current language: auto; currently fortran

Hi Irene,

I can see no reason why a seg fault would occur at this particular line. Can you please send your complete config file to trs@pgroup and ask Customer Support to forward it to me? Also, are you using the example “Storm of the Century” data or your own data?

Thanks,
Mat

Hi Irene,

I received your config file from Customer Support. Although it was configured for 32-bit, I’m assuming that the settings other than the compiler flags are the same for both 32 and 64-bit.

While I’m not certain, my guess is that you’re running out of memory. You have the number of domains (MAXNES) set to 4, which will double the required amount of memory. Can you try running on another system which has more than 2 GB of memory?

Another thing to try is to compile with “-Mlarge_arrays” in 64-bit mode.

If neither of these solves the issue, is it possible to obtain a copy of your data as well as your mm5.deck file?

- Mat

Hi Mat,

We solved it now. We found out that OpenMP blows through the stack size, so we now use setenv MP_STACK_SIZE 64000000.
I will try “-Mlarge_arrays” on the other system when I get my account on it (can take a while).

Irene

Odd. The PGI OpenMP runtime doesn’t recognize MP_STACK_SIZE. We do have MPSTKZ, but this is mainly for Windows and older versions of Linux. In the 7.0 release, we added OpenMP 3.0’s OMP_STACKSIZE variable, but since you’re on 6.0, this shouldn’t apply. Are you sure you were using the PGI compilers?
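For anyone comparing the variables named in this thread, here is a sketch of how each would be set. It assumes a Bourne-style shell; the 64M value is only an approximation of the 64000000 used above, the standard OpenMP 3.0 spelling is OMP_STACKSIZE, and which variable actually takes effect depends on the compiler and runtime version, so check the documentation for your release.

```shell
# Stack size for OpenMP worker threads. Which variable a runtime honors
# depends on the compiler and version; setting both is harmless.

# PGI-specific variable (csh/tcsh equivalent: setenv MPSTKZ 64M)
MPSTKZ=64M
export MPSTKZ

# Standard OpenMP 3.0 variable, for runtimes that implement it
# (csh/tcsh equivalent: setenv OMP_STACKSIZE 64M)
OMP_STACKSIZE=64M
export OMP_STACKSIZE

echo "MPSTKZ=$MPSTKZ OMP_STACKSIZE=$OMP_STACKSIZE"
```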

- Mat