PGI Workstation and MPICH2

I installed PGI Workstation 10.3 on CentOS 5.5 x64. I intend to combine CUDA Fortran, CULA, and MPI. I have installed MPICH2 and its tests run correctly. I want to use MPICH2 because the MPICH1 that PGI installed doesn’t work for me, so I created a siterc file in the bin folder of both the x86 and x64 installations.

my siterc file
set MPIUDIR=/opt/mpich2-install;

I wrote a simple mpihello.cuf to test, but there seems to be something wrong with the library dependencies.

my mpihello.cuf
program hello
	use cudafor
	use mpi
	integer :: istat,numdevices
	integer :: ierr,cpuid,numprocs,namelen
	character* (mpi_max_processor_name) processor_name
	integer :: id,gpuid
	type(cudadeviceprop) :: prop
	istat=cudaGetDeviceCount(numdevices)
	call mpi_init(ierr)
	call mpi_comm_rank(mpi_comm_world,cpuid,ierr)
	call mpi_comm_size(mpi_comm_world,numprocs,ierr) 
	call mpi_get_processor_name(processor_name,namelen,ierr)

	gpuid=mod(cpuid,numdevices)
	istat=cudaSetDevice(gpuid)
	istat=cudaGetDeviceProperties(prop,gpuid)
	if (cpuid==0) write(*,"(a9,i2,a12)") "There are",numdevices,"GPU device!"
	write (*,"(a21,i2,a4,i1,a4,a30)"), "Hello world! process ",cpuid," of ",numprocs," on ",processor_name
	write (*,"(a6,i2)") "GPU id",gpuid
	write (*,"(a12,a20)") "Device name ",prop%name
	call mpi_finalize(ierr)
end

my makefile

FC=pgfortran

#Change to -Mmpi2 for MPICH2
MPI=-Mmpi=mpich2
#add cuf
CUDA=-ta=nvidia -Mcuda
#lib
LIB=-L/usr/local/cula/lib64 -lcula_pgfortran
mpihello:
	$(FC)  $(MPI) $(LIB) -o mpihello   mpihello.cuf

error information

pgfortran  -Mmpi=mpich2 -L/usr/local/cula/lib64 -lcula_pgfortran -o mpihello   mpihello.cuf
/opt/mpich2-install/lib/libmpich.a(init.o): In function `MPI_Init':
init.c:(.text+0x38): undefined reference to `MPL_env2str'
init.c:(.text+0x55): undefined reference to `MPL_env2bool'
/opt/mpich2-install/lib/libmpich.a(initthread.o): In function `MPI_Init_thread':
initthread.c:(.text+0x4b1): undefined reference to `MPL_env2bool'
/opt/mpich2-install/lib/libmpich.a(param_vals.o): In function `MPIR_Param_init_params':
param_vals.c:(.text+0xf): undefined reference to `MPL_env2int'
param_vals.c:(.text+0x27): undefined reference to `MPL_env2int'
param_vals.c:(.text+0x3f): undefined reference to `MPL_env2int'
param_vals.c:(.text+0x57): undefined reference to `MPL_env2int'
param_vals.c:(.text+0x6f): undefined reference to `MPL_env2int'
/opt/mpich2-install/lib/libmpich.a(param_vals.o):param_vals.c:(.text+0x87): more undefined references to `MPL_env2int' follow
/opt/mpich2-install/lib/libmpich.a(param_vals.o): In function `MPIR_Param_init_params':
param_vals.c:(.text+0x327): undefined reference to `MPL_env2bool'
param_vals.c:(.text+0x33f): undefined reference to `MPL_env2bool'
param_vals.c:(.text+0x357): undefined reference to `MPL_env2bool'
param_vals.c:(.text+0x36f): undefined reference to `MPL_env2bool'
param_vals.c:(.text+0x387): undefined reference to `MPL_env2bool'
/opt/mpich2-install/lib/libmpich.a(param_vals.o):param_vals.c:(.text+0x39f): more undefined references to `MPL_env2bool' follow
/opt/mpich2-install/lib/libmpich.a(param_vals.o): In function `MPIR_Param_init_params':
param_vals.c:(.text+0x3e7): undefined reference to `MPL_env2int'
param_vals.c:(.text+0x3ff): undefined reference to `MPL_env2int'
param_vals.c:(.text+0x417): undefined reference to `MPL_env2bool'
param_vals.c:(.text+0x42f): undefined reference to `MPL_env2bool'
param_vals.c:(.text+0x447): undefined reference to `MPL_env2bool'
param_vals.c:(.text+0x45f): undefined reference to `MPL_env2bool'
param_vals.c:(.text+0x477): undefined reference to `MPL_env2int'
param_vals.c:(.text+0x48f): undefined reference to `MPL_env2int'
param_vals.c:(.text+0x4a7): undefined reference to `MPL_env2int'
param_vals.c:(.text+0x4bf): undefined reference to `MPL_env2int'
param_vals.c:(.text+0x4d7): undefined reference to `MPL_env2str'
param_vals.c:(.text+0x4ef): undefined reference to `MPL_env2str'
param_vals.c:(.text+0x507): undefined reference to `MPL_env2bool'
param_vals.c:(.text+0x51f): undefined reference to `MPL_env2bool'
param_vals.c:(.text+0x537): undefined reference to `MPL_env2bool'
param_vals.c:(.text+0x54f): undefined reference to `MPL_env2bool'
param_vals.c:(.text+0x567): undefined reference to `MPL_env2bool'
/opt/mpich2-install/lib/libmpich.a(param_vals.o):param_vals.c:(.text+0x57f): more undefined references to `MPL_env2bool' follow
/opt/mpich2-install/lib/libmpich.a(mpid_vc.o): In function `MPIDI_Populate_vc_node_ids':
mpid_vc.c:(.text+0x2d2): undefined reference to `MPL_env2int'
mpid_vc.c:(.text+0x2e4): undefined reference to `MPL_env2int'
mpid_vc.c:(.text+0x30b): undefined reference to `MPL_env2bool'
/opt/mpich2-install/lib/libmpich.a(tcp_init.o): In function `MPID_nem_tcp_bind':
tcp_init.c:(.text+0x18f): undefined reference to `MPL_env2range'

Thanks!
siwuxie

Hello,

Let’s separate the MPI issues from the CUDA Fortran issues.

  1. % more mpihello.f
    program hello
    include 'mpif.h'
    integer ierr, myproc,hostnm, nprocs
    character*64 hostname
    call mpi_init(ierr)
    CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
    call mpi_comm_rank(MPI_COMM_WORLD, myproc, ierr)
    ierr=setvbuf3f(6,2,0)
    ierr=hostnm(hostname)
    write(6,100) myproc,nprocs,hostname
    100 format(1x,"hello - I am process",i3," of", i3, " on host ",A32)
    call mpi_finalize(ierr)
    end

First build this with your mpich2 libs by hand - make sure the libraries
come after the source files on the compile line. Use -v to see what is actually happening. It should look something like this
(see the output of pgfortran -dryrun x.o -Mmpi=mpich2):


pgfortran -o mpihello mpihello.f -v -L/opt/mpich2-install/lib -lfmpich -lmpichf90 -lmpich

Once you build mpihello and it works by booting the mpich2 daemon and
using mpiexec or mpirun from your mpich2 bin directory, then try your cuda version.
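
For example, with the mpd process manager that ships with MPICH2 1.x (a sketch; assuming a single-node ring and that /opt/mpich2-install/bin is on your PATH):

% mpdboot -n 1
% mpiexec -n 4 ./mpihello
% mpdallexit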

If you put ‘-Wl,-t’ in your link line, you can see which libraries are linked,
and determine if there are problems in library order or missing libs.
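
For example, the same hand link line as above with the trace flag added:

pgfortran -o mpihello mpihello.f -v -Wl,-t -L/opt/mpich2-install/lib -lfmpich -lmpichf90 -lmpich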


dave

It looks like you have built your own MPICH2. -Mmpi=mpich2 works with the MPICH2 that comes with the PGI CDK.

If you have configured and built your own mpich, then you will need to use mpif90.
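
For example (a sketch, assuming your MPICH2 was configured with the PGI compilers so that mpif90 invokes pgfortran, and was installed under /opt/mpich2-install as in the original post):

/opt/mpich2-install/bin/mpif90 -Mcuda -o mpihello mpihello.cuf -L/usr/local/cula/lib64 -lcula_pgfortran

The wrapper supplies the MPICH2 include and library paths itself, so -Mmpi=mpich2 is not needed.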

Hongyon

Thanks, jtull and hongyon.
The way you suggested still doesn’t work. But I switched to another workstation and installed PGI on it. This time I didn’t install MPICH2, and the MPICH1 that PGI provides works well.
So a workstation with both MPICH1 and MPICH2 installed is risky; they may conflict with each other. I hope PGI Workstation will add MPICH2 in the future.
There’s another question. I’m using a single workstation, not a cluster, and I chose SSH during the install. How do I configure it to use all the nodes without a password? It’s too tedious to type a password for every node.

Do a web search for “ssh without password”. You’ll find many guides.
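
The usual recipe is something like this (a sketch, assuming a single node where processes are launched via ssh to localhost):

% ssh-keygen -t rsa              (accept the default file and an empty passphrase)
% cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
% chmod 600 ~/.ssh/authorized_keys
% ssh localhost hostname         (should now run without prompting for a password)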

  • Mat