I’m a student at Lyndon State College and I’m having trouble running a Fortran parallel program on our clusters. Code cannot be compiled on the clusters, so we compile it in the lab and then run the executables on the clusters. However, the clusters have 32-bit AMD CPUs while the lab machines have 64-bit Intel CPUs. Here are the specifics from running cat /proc/cpuinfo on a lab computer and on one of the cluster nodes. Please let me know if there are any scripts or compiler flags I should know about when compiling like this. Thank you.
macielj@kangaroo:~> ssh annex06
macielj@annex06:~> cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 6
model name : Intel(R) Pentium(R) D CPU 3.20GHz
stepping : 4
cpu MHz : 3192.211
cache size : 2048 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 6
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni
monitor ds_cpl vmx est cid cx16 xtpr lahf_lm
bogomips : 6390.01
processor : 1
vendor_id : GenuineIntel
cpu family : 15
model : 6
model name : Intel(R) Pentium(R) D CPU 3.20GHz
stepping : 4
cpu MHz : 3192.211
cache size : 2048 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 6
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni
monitor ds_cpl vmx est cid cx16 xtpr lahf_lm
bogomips : 6384.30
macielj@annex06:~> logout
Connection to annex06 closed.
macielj@kangaroo:~> ssh cluster01
macielj@cluster01:~> cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 8
model name : AMD Athlon(tm) XP 2400+
stepping : 1
cpu MHz : 2002.645
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow up ts
bogomips : 4009.11
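Comparing the two flags lines above suggests why a binary built for the lab machine can fail on the cluster: the Pentium D reports sse2, pni and lm, while the Athlon XP reports none of them. Below is a minimal sketch of that comparison. The function name missing_flags is hypothetical, and the two flag strings are shortened from the output above.

```shell
# missing_flags BUILD_FLAGS RUN_FLAGS
# Print each CPU flag that the build host has but the run host lacks.
# Relies on sh/bash word splitting of the unquoted $1.
missing_flags() {
    out=""
    for f in $1; do
        case " $2 " in
            *" $f "*) ;;                      # run host has this flag too
            *) out="$out${out:+ }$f" ;;       # build host only
        esac
    done
    printf '%s\n' "$out"
}

# Shortened flag lists taken from the two cpuinfo dumps.
lab="fpu mmx fxsr sse sse2 ht lm pni"
cluster="fpu mmx fxsr sse syscall mmxext 3dnowext 3dnow"
missing_flags "$lab" "$cluster"    # prints: sse2 ht lm pni
```

Any instruction set in that output (SSE2 in particular) is one the compiler may use by default on the lab machine and that the cluster CPU cannot execute.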
Yes, I tried -tp k8-32 but it was saying “Illegal Instruction”, so I’m guessing it is having difficulty reading the instructions from the object file.
I also tried
-pc 6
-tpp6
I tried a number of different flags but kept getting “Illegal Instruction.” The code compiles with those flags; it just doesn’t run on the clusters.
macielj@kangaroo:~/Parallel/PARALLELCFD> mpif90 -tp px -o cfd_cluster.exe cfd_mpi.f
/software/mpich/pgi/lib/libmpich.a(p4_utils.o)(.text+0x19dd): In function `p4_usclock': undefined reference to `__mth_i_dfloatux'
/software/mpich/pgi/lib/libmpich.a(p4_utils.o)(.text+0x19ef): In function `p4_usclock': undefined reference to `__mth_i_dfloatux'
/software/mpich/pgi/lib/libmpich.a(p4_utils.o)(.text+0x1a02): In function `p4_usclock': undefined reference to `__mth_i_dfloatux'
It looks like the MPI library was not compiled with -tp px.
Because the cluster you are going to run the MPI program on has an older architecture, you will need to compile the MPI libraries with the -tp px option.
Alright, so when I use mpirun, I think I’ll have to compile with mpif90, for example. From the set flags above, will it know to refer to the pgf90 set for -tp px?
You need to compile the MPI libraries with -tp px for two reasons:
The MPI libraries installed on your build system (Intel) were not compiled with -tp px, so a program linked against them won’t run on your run system (AMD Athlon) even if it compiles with no problem.
Since your MPI was compiled for a newer architecture, it links with the newer optimized libraries. Compiling a program with -tp px therefore produces a link error, because the compiler driver does not link with those newer optimized libraries. Even if you force it to link, it won’t run; you will get an illegal instruction error.
Now, assuming you get MPI compiled and installed with -tp px:
If you pass -tp px as a flag to the compilers through mpif90, it should be fine: mpif90 is just a script that passes all the flags on to the compiler, and the compiler should know what to do with them.
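To illustrate the idea that a wrapper script only forwards flags, here is a toy stand-in. It is not the real mpif90: the function name toy_mpif90, the use of pgf90 as the underlying compiler, and the include and library paths are assumptions (the library path is modeled on the one in the linker error earlier in the thread). It only prints the command line it would run.

```shell
# toy_mpif90: print the compiler command a wrapper script might build.
# Hypothetical sketch; the real mpif90 script is more elaborate.
toy_mpif90() {
    mpi_root=/software/mpich/pgi    # assumed MPICH install prefix
    # "$@" forwards every user argument (including -tp px) unchanged.
    echo pgf90 "$@" -I"$mpi_root/include" -L"$mpi_root/lib" -lmpich
}

toy_mpif90 -tp px -o cfd_cluster.exe cfd_mpi.f
```

The point of the sketch is that -tp px reaches the compiler untouched, but the wrapper still links against whatever libmpich it was installed with, which is why the library itself has to be rebuilt for the older CPU.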
Did you download the MPICH1 source? You need to download the source first, untar it, and run ./configure from the directory you untarred it into, then type make, and then make install.
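The steps above might look like the following sketch. The tarball name, the install prefix, the --prefix option, and the choice to pass the PGI compilers and -tp px to configure through environment variables are all assumptions, not taken from this thread; check the MPICH1 installation guide for what your version honors. The script runs in dry-run mode by default and only prints the commands.

```shell
# Dry-run by default: print each command instead of running it.
# Set DRY_RUN=0 to execute for real.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# build_mpich_px TARBALL PREFIX (hypothetical helper name)
build_mpich_px() {
    tarball=$1                   # e.g. mpich-1.2.7p1.tar.gz (assumed name)
    prefix=$2                    # hypothetical install location
    srcdir=${tarball%.tar.gz}    # directory the tarball unpacks into
    run tar xzf "$tarball" &&
    run cd "$srcdir" &&
    # Assumption: configure reads compilers and flags from the environment.
    run env CC=pgcc FC=pgf77 F90=pgf90 \
        CFLAGS="-tp px" FFLAGS="-tp px" F90FLAGS="-tp px" \
        ./configure --prefix="$prefix" &&
    run make &&
    run make install
}

build_mpich_px mpich-1.2.7p1.tar.gz "$HOME/mpich-px"
```

Installing into a separate prefix keeps the existing MPICH under /software/mpich untouched, so programs for the lab machines can still link against it.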
Please look at the Tips and Technique webpage I mentioned above for where to get the MPICH1 source.
If /software/mpich/mpich-1.2.7p1 contains the source and you have permission to write in that directory, you can certainly run configure there. Be sure to run make clean before you run make, as you never know whether somebody has already built in that directory.
I would recommend getting help from the sysadmin or whoever put the source there for this build.
Regarding your previous post about already having MPICH installed: as we discussed, the installed MPICH libraries will not run on the cluster you intend to use. That’s why we need to build new MPI libraries with the -tp px flag, so that they will run on the older-architecture cluster.