MPI problems with pgi 7.0-7

Hi,

We are running the old PGI 6.2 release on our cluster.
I have now installed the current version (7.0-7), which is working fine.
In the next step I added the MPI package (also downloaded from PGI)
and installed it as described in the readme.
Everything seems to be fine: I can compile, including
parallel code. But running the parallel code with mpirun
does not work.

If I force local execution, everything seems to work normally.
ssh is also working without any problems, as is the old
MPI installation.

taiga:~ # mpirun -np 4 mpihello
hydra.bgc-jena.mpg.de: Connection refused
p0_21213: p4_error: Child process exited while making connection to remote process on hydra: 0
p0_21213: (37.031250) net_send: could not write to fd=4, errno = 32

How can I figure out the reason for this behavior?



Kind regards,
Peer

Some questions.

Did you install MPI with ssh or rsh?
Did you install as root?
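
The reason for asking: the p4_error lines point to an MPICH-1 style p4 device, which starts remote processes through a remote shell command. If that command is rsh and no rsh daemon is listening on the target host, "Connection refused" is a typical result, even when ssh itself works. Below is a minimal sketch for checking this. It assumes the installed MPI honors the P4_RSHCOMMAND environment variable and falls back to rsh when it is unset; both are assumptions about this particular build, not something confirmed here. The helper names p4_remote_shell and refused_hosts are hypothetical.

```shell
# Hypothetical helper: print the remote-shell command a p4-based mpirun
# would be asked to use: $P4_RSHCOMMAND if set, otherwise "rsh".
# (The "rsh" fallback is an assumption about how the library was built.)
p4_remote_shell() {
    echo "${P4_RSHCOMMAND:-rsh}"
}

# Hypothetical helper: read mpirun output on stdin and print the hosts
# that refused a connection, i.e. lines of the exact form
# "<host>: Connection refused".
refused_hosts() {
    sed -n 's/^\([^ :]*\): Connection refused$/\1/p'
}
```

To try it, run p4_remote_shell to see what would be used, then set and export P4_RSHCOMMAND=ssh and rerun with "mpirun -np 4 mpihello 2>&1 | refused_hosts". If the list comes back empty after the change, the remote shell setting was likely the cause.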

% more mpihello.f
      program hello
      include 'mpif.h'
      integer ierr, myproc, hostnm
      character*64 hostname
      call mpi_init(ierr)
      call mpi_comm_rank(MPI_COMM_WORLD, myproc, ierr)
!     print *, "Hello world! I'm node", myproc
      ierr=setvbuf3f(6,2,0)
      ierr=hostnm(hostname)
      write(6,100) myproc,hostname
100   format(1x,"hello - I am process",i3," host ",A10)
      call mpi_finalize(ierr)
      end

What happens when you compile and run as follows?

pgf90 -o mpihello mpihello.f -Mmpi -Bstatic_pgi -v

mpirun -np 4 mpihello

Hi,

I installed everything as root. We are using ssh only.
I downloaded MPICH v1 and v2 from the homepage, and both are working
WITHOUT any problems.

Compiling as suggested gives 0 errors or warnings.

pkoch@tchita:~> mpirun -np 4 mpihello
pc002.bgc-jena.mpg.de: Connection refused
p0_3937: p4_error: Child process exited while making connection to remote process on pc002: 0
p0_3937: (37.121094) net_send: could not write to fd=5, errno = 32
pkoch@tchita:~> ll ./mpihello
-rwxr-xr-x 1 pkoch AG_DV 699120 2007-08-27 14:17 ./mpihello


ssh is working:
pkoch@tchita:~> ssh pc002 w
14:20:39 up 14 days, 3:40, 0 users, load average: 1.00, 1.03, 1.05
USER TTY LOGIN@ IDLE JCPU PCPU WHAT