Using pgdbg in parallel

Hi,

I’m starting pgdbg in this way to debug my Fortran program:

mpirun -np 4 -dbg=pgdbg mas_0.2.1.3_dbg


but pgdbg reports only one process (see screenshot), and when I check the number of processes with mpi_comm_size I always get one.

What am I doing wrong? Thanks!

Cheers,

RL

Hi,

Which version of PGI, pgdbg?
What OS?

Please try the following program and let us know if you still get 1 process.

      program hello
      include 'mpif.h'
      integer ierr, myproc
      call mpi_init(ierr)
      call mpi_comm_rank(MPI_COMM_WORLD, myproc, ierr)
      print *, "Hello world! I'm node", myproc
      call mpi_finalize(ierr)

      end
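
A quick way to check the process count outside the debugger is to count the greeting lines this test program prints. The following is only a sketch: count_hello is a made-up helper name, and the two lines fed to it are illustrative stand-ins, not captured program output.

```shell
# Hypothetical helper: count "Hello world" lines read from standard input.
count_hello() {
    grep -c 'Hello world'
}

# Illustrative input standing in for the output of two processes.
printf '%s\n' "Hello world! I'm node 0" "Hello world! I'm node 1" | count_hello
```

With the real program, something like `mpirun -np 4 ./hello | count_hello` should report 4 if all four processes start and each prints its line.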


Hongyon

Hi,

pgf90 10.6-0 64-bit target on x86-64 Linux -tp gh-64
pgdbg 10.6-0

I’m running Debian Squeeze.

The “Hello world” program shows 4 processes; my program, which is of course much more complex, shows only one.
Maybe that is because I’m using modules.

MPI_Init() is where it spawns the processes. If you can narrow down and give us a small program, perhaps we can see what the problem is.

Hongyon

I discovered something very interesting (or at least I think so). To run my program, I need to provide an argument. If I don’t provide the argument, the 4 processes are indeed initiated by MPI_Init(), but of course my internal error checking does not allow me to continue the run.
But this is true also with your “Hello world” program, although no argument is necessary to run it.

If I load

mpirun -np 4 -dbg=pgdbg hello

and type

run myargument

MPI_Init() will start only one process instead of 4.

Hi,

I think I understand what the problem is. Did you do something like this?

prompt% pgdbg mpirun -np 4 -dbg=pgdbg ./a.out


That is probably why it fails to run with 4 processes. It will run as a single process, and if you have any communication between processes, it will fail big time.


To debug an MPI program in the debugger, the command you need to run is as follows:

prompt% mpirun -np 4 -dbg=pgdbg ./a.out

This is how MPICH1 debugging works. Did I misunderstand something here?


Hongyon

No,

I did start with

mpirun -np 4 -dbg=pgdbg hello

If I do:

PGDBG 10.6-0 x86-64 (Workstation, 8 Process)
Copyright 1989-2000, The Portland Group, Inc. All Rights Reserved.
Copyright 2000-2010, STMicroelectronics, Inc. All Rights Reserved.
Loaded: /home/lionel/hello


pgdbg> ignore 10
Ignore signal SIGUSR1(10)

pgdbg> break main
(1)breakpoint set at: hello line: “hello.f”@4 address: 0x406E34
1
pgdbg> run -p4pg /home/lionel/PI31029 -p4wd /home/lionel -mpichtv
libpthread.so.0 loaded by ld-linux-x86-64.so.2.
librt.so.1 loaded by ld-linux-x86-64.so.2.
libm.so.6 loaded by ld-linux-x86-64.so.2.
libc.so.6 loaded by ld-linux-x86-64.so.2.
Breakpoint at 0x406E34, function hello, file hello.f, line 4
#4: call mpi_init(ierr)

pgdbg>

pgdbg> next
libnss_files.so.2 loaded by ld-linux-x86-64.so.2.
libnss_compat.so.2 loaded by ld-linux-x86-64.so.2.
libnsl.so.1 loaded by ld-linux-x86-64.so.2.
libnss_nis.so.2 loaded by ld-linux-x86-64.so.2.
Warning!!! The current implementation of PGDBG supports
the debugging MPI application with randomized loading by
duplicating symbol information for each process. If
PGDBG is consuming too much memory, you should consider
running:
sysctl -w kernel.randomize_va_space=0
libpthread.so.0 loaded by ld-linux-x86-64.so.2.
librt.so.1 loaded by ld-linux-x86-64.so.2.
libm.so.6 loaded by ld-linux-x86-64.so.2.
libc.so.6 loaded by ld-linux-x86-64.so.2.
ld-linux-x86-64.so.2 loaded by ld-linux-x86-64.so.2.
libnss_files.so.2 loaded by ld-linux-x86-64.so.2.
libpthread.so.0 loaded by ld-linux-x86-64.so.2.
librt.so.1 loaded by ld-linux-x86-64.so.2.
libm.so.6 loaded by ld-linux-x86-64.so.2.
libc.so.6 loaded by ld-linux-x86-64.so.2.
ld-linux-x86-64.so.2 loaded by ld-linux-x86-64.so.2.
libnss_files.so.2 loaded by ld-linux-x86-64.so.2.
libpthread.so.0 loaded by ld-linux-x86-64.so.2.
librt.so.1 loaded by ld-linux-x86-64.so.2.
libm.so.6 loaded by ld-linux-x86-64.so.2.
libc.so.6 loaded by ld-linux-x86-64.so.2.
ld-linux-x86-64.so.2 loaded by ld-linux-x86-64.so.2.
libnss_files.so.2 loaded by ld-linux-x86-64.so.2.
([1] New Process)
([2] New Process)
([3] New Process)
[0] Stopped at 0x406E3D, function hello, file hello.f, line 5
#5: call mpi_comm_rank(MPI_COMM_WORLD, myproc, ierr)

pgdbg [all] 0>

then, it’s OK.
However, if I do

PGDBG 10.6-0 x86-64 (Workstation, 8 Process)
Copyright 1989-2000, The Portland Group, Inc. All Rights Reserved.
Copyright 2000-2010, STMicroelectronics, Inc. All Rights Reserved.
Loaded: /home/lionel/hello


pgdbg> ignore 10
Ignore signal SIGUSR1(10)

pgdbg> break main
(1)breakpoint set at: hello line: “hello.f”@4 address: 0x406E34
1
pgdbg> run -p4pg /home/lionel/PI31405 -p4wd /home/lionel -mpichtv
libpthread.so.0 loaded by ld-linux-x86-64.so.2.
librt.so.1 loaded by ld-linux-x86-64.so.2.
libm.so.6 loaded by ld-linux-x86-64.so.2.
libc.so.6 loaded by ld-linux-x86-64.so.2.
Breakpoint at 0x406E34, function hello, file hello.f, line 4
#4: call mpi_init(ierr)

pgdbg>

pgdbg> run argument

Reloading:
/home/lionel/hello argument

libpthread.so.0 loaded by ld-linux-x86-64.so.2.
librt.so.1 loaded by ld-linux-x86-64.so.2.
libm.so.6 loaded by ld-linux-x86-64.so.2.
libc.so.6 loaded by ld-linux-x86-64.so.2.
Breakpoint at 0x406E34, function hello, file hello.f, line 4
#4: call mpi_init(ierr)

pgdbg> next
libnss_files.so.2 loaded by ld-linux-x86-64.so.2.
libnss_compat.so.2 loaded by ld-linux-x86-64.so.2.
libnsl.so.1 loaded by ld-linux-x86-64.so.2.
libnss_nis.so.2 loaded by ld-linux-x86-64.so.2.
Stopped at 0x406E3D, function hello, file hello.f, line 5
#5: call mpi_comm_rank(MPI_COMM_WORLD, myproc, ierr)

pgdbg>

as you can see, only one process is initialized.

For an MPI program, you need to restart the program using mpirun every time you want to debug it.

Basically, mpirun spawns the process(es), and the debugger attaches to each process that mpirun spawns and displays them as a single entity in the GUI.

When you type a “run” command, it simply restarts the program as a new single process. At this point, pgdbg has no capability to spawn new MPI processes.
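
So if the program needs an argument, it has to be given when the session is launched rather than through “run” inside pgdbg. Below is a minimal sketch of that relaunch. It assumes that the MPICH1 mpirun used in this thread forwards anything after the program name to the program as its arguments; please confirm that against your installation. The helper name debug_cmd is hypothetical, and it only prints the command line so you can check it before running it.

```shell
# Hypothetical helper: print the mpirun command line for a fresh debug session.
# Assumption: mpirun passes everything after the program name to the program.
debug_cmd() {
    nprocs=$1
    prog=$2
    shift 2
    cmd="mpirun -np $nprocs -dbg=pgdbg $prog"
    # Append program arguments, if any. Arguments that themselves
    # contain spaces would need extra quoting.
    if [ $# -gt 0 ]; then
        cmd="$cmd $*"
    fi
    echo "$cmd"
}

debug_cmd 4 ./hello myargument
```

To change the argument, quit pgdbg and launch again with the new command line instead of typing “run newargument” at the pgdbg prompt.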

Hongyon