Running on PGI 7.2 MPICH

I installed MPICH 1.2.7 with PGI 7.2 as follows:
./configure -cc=pgcc -cflags="-O3 -fastsse -Mvect=sse -tp k8-64" -c++=pgCC -fc=pgf77 -fflags="-O3 -tp k8-64" -f90=pgf90 -rsh=ssh --prefix=/hptc_cluster/mpich1.2.7_GE_tcp | tee c2.txt

The MPICH examples for mpif90, mpif77, and mpicc (in mpich1.2.7_GE_tcp/examples) work well. But when I run the following, it fails. I am not sure whether my PGI/MPICH 1.2.7 installation is correct, or whether the errors come from the turb code itself.

$ mpirun -machinefile machinefile -np 4 ./turb
MYID = 0
Name =
hpconsole
nproc= 4
nx,ny,nz= 32 64 64
iseed_m0= 321
nstep= 10001
itout= 2000
ieout= 100
dt= 3.9999999E-04
rnu= 4.0000002E-03
time= 0.000000
new= T
idp= 0
forcr(1)= 0.3056840
forcr(2)= 0.1304520
memory allocated
32 64 64 4 321 10001
2000 100 3.9999999E-04 4.0000002E-03 0.000000 T
0 0.3056840 0.1304520
100 2000 31
u0= 0.7150000 akp= 4.578600 pi= 3.141593 ca=
1.3117581E-02
*** glibc detected *** /skovira/home/ccsad_1/zxiao_jhu/./turb: malloc(): memory corruption: 0x000000000bfe4360 ***
======= Backtrace: =========
/lib64/libc.so.6[0x36e2271cd1]
/lib64/libc.so.6(__libc_malloc+0x7d)[0x36e2272e8d]
/skovira/home/ccsad_1/zxiao_jhu/./turb[0x4bd4e2]
======= Memory map: ========
00400000-00539000 r-xp 00000000 68:03 10323608 /skovira/home/ccsad_1/zxiao_jhu/turb
00739000-00743000 rwxp 00139000 68:03 10323608 /skovira/home/ccsad_1/zxiao_jhu/turb
00743000-0236d000 rwxp 00743000 00:00 0
0bfe1000-0c027000 rwxp 0bfe1000 00:00 0
36e1200000-36e121a000 r-xp 00000000 68:03 32801075 /lib64/ld-2.5.so
36e141a000-36e141b000 r-xp 0001a000 68:03 32801075 /lib64/ld-2.5.so
36e141b000-36e141c000 rwxp 0001b000 68:03 32801075 /lib64/ld-2.5.so
36e2200000-36e234a000 r-xp 00000000 68:03 32801076 /lib64/libc-2.5.so
36e234a000-36e2549000 ---p 0014a000 68:03 32801076 /lib64/libc-2.5.so
36e2549000-36e254d000 r-xp 00149000 68:03 32801076 /lib64/libc-2.5.so
36e254d000-36e254e000 rwxp 0014d000 68:03 32801076 /lib64/libc-2.5.so
36e254e000-36e2553000 rwxp 36e254e000 00:00 0
36e2600000-36e2682000 r-xp 00000000 68:03 32801081 /lib64/libm-2.5.so
36e2682000-36e2881000 ---p 00082000 68:03 32801081 /lib64/libm-2.5.so
36e2881000-36e2882000 r-xp 00081000 68:03 32801081 /lib64/libm-2.5.so
36e2882000-36e2883000 rwxp 00082000 68:03 32801081 /lib64/libm-2.5.so
36e2e00000-36e2e15000 r-xp 00000000 68:03 32801078 /lib64/libpthread-2.5.so
36e2e15000-36e3014000 ---p 00015000 68:03 32801078 /lib64/libpthread-2.5.so
36e3014000-36e3015000 r-xp 00014000 68:03 32801078 /lib64/libpthread-2.5.so
36e3015000-36e3016000 rwxp 00015000 68:03 32801078 /lib64/libpthread-2.5.so
36e3016000-36e301a000 rwxp 36e3016000 00:00 0
36e5e00000-36e5e07000 r-xp 00000000 68:03 32801085 /lib64/librt-2.5.so
36e5e07000-36e6007000 ---p 00007000 68:03 32801085 /lib64/librt-2.5.so
36e6007000-36e6008000 r-xp 00007000 68:03 32801085 /lib64/librt-2.5.so
36e6008000-36e6009000 rwxp 00008000 68:03 32801085 /lib64/librt-2.5.so
36e6600000-36e660d000 r-xp 00000000 68:03 32801082 /lib64/libgcc_s-4.1.2-20080102.so.1
36e660d000-36e680d000 ---p 0000d000 68:03 32801082 /lib64/libgcc_s-4.1.2-20080102.so.1
36e680d000-36e680e000 rwxp 0000d000 68:03 32801082 /lib64/libgcc_s-4.1.2-20080102.so.1
2b95fd013000-2b95fd014000 rwxp 2b95fd013000 00:00 0
2b95fd041000-2b95fd074000 rwxp 2b95fd041000 00:00 0
2b95fd0a1000-2b95fd0ab000 r-xp 00000000 68:03 32800796 /lib64/libnss_files-2.5.so
2b95fd0ab000-2b95fd2aa000 ---p 0000a000 68:03 32800796 /lib64/libnss_files-2.5.so
2b95fd2aa000-2b95fd2ab000 r-xp 00009000 68:03 32800796 /lib64/libnss_files-2.5.so
2b95fd2ab000-2b95fd2ac000 rwxp 0000a000 68:03 32800796 /lib64/libnss_files-2.5.so
2b95fd2ac000-2b95fd6a7000 rwxp 2b95fd2ac000 00:00 0
2b9600000000-2b9600021000 rwxp 2b9600000000 00:00 0
2b9600021000-2b9604000000 ---p 2b9600021000 00:00 0
7fffada82000-7fffada97000 rwxp 7fffada82000 00:00 0 [stack]
ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vdso]
p0_10425: p4_error: interrupt SIGx: 6
Killed by signal 2.
Killed by signal 2.
Killed by signal 2.
p0_10425: (27.257812) net_send: could not write to fd=4, errno = 32

Hi sad,

While I don’t know for sure, the problem is more likely in your application than in your MPICH installation. Try compiling your program with "-g" and linking it against the debugging version of MPICH that accompanies your compilers. You can then run your MPI code within the PGI debugger, PGDBG.

For details on using PGDBG, please see the PGI Tools Guide that is included with your compilers or found online at http://www.pgroup.com/doc/pgitools.pdf.
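
As a rough sketch of the steps above: the script below only prints a debug build command and a debugger launch command unless DRY_RUN=0 is set. The source file name (turb.f90) and executable name (turb_dbg) are hypothetical, since the thread does not show how turb is built. The -O0 and -Mbounds flags are optional additions not mentioned above, and the -dbg=pgdbg option to the MPICH-1 mpirun should be confirmed against the Tools Guide. It also assumes the mpif90 and mpirun on your PATH belong to the debugging MPICH build.

```shell
#!/bin/sh
# Sketch only: file names and extra flags are assumptions, not from the thread.
SRC="${SRC:-turb.f90}"     # hypothetical source file name
EXE="${EXE:-turb_dbg}"     # hypothetical name for the debug executable
NP="${NP:-4}"              # same process count as the failing run

# -g is the flag suggested above; -O0 and -Mbounds (PGI array bounds
# checking) are optional extras that can help expose out-of-range writes.
BUILD_CMD="mpif90 -g -O0 -Mbounds -o $EXE $SRC"

# Launch under PGDBG via the MPICH-1 mpirun (assumed option, check the guide).
RUN_CMD="mpirun -dbg=pgdbg -machinefile machinefile -np $NP ./$EXE"

# Print the commands by default; set DRY_RUN=0 to actually run them.
if [ "${DRY_RUN:-1}" = "1" ]; then
  echo "$BUILD_CMD"
  echo "$RUN_CMD"
else
  $BUILD_CMD && $RUN_CMD
fi
```

If the bounds-checked build stops with an out-of-range subscript before glibc reports the malloc() corruption, that location is a good place to start in the debugger.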

  • Mat

That is helpful, thank you very much. I will check this.