Using System OpenMPI with OpenACC

Hi all,

I’m trying to use an Open MPI build of PETSc together with OpenACC.
Unfortunately, this simply does not work for me.

I was not able to compile and run my PETSc application with the MPI that ships with PGI. Therefore, I tried to run my OpenACC application with the MPI built by PETSc. That works for one MPI process.
But if I start two MPI processes, the OpenACC data-handling pragmas are not executed, and as a consequence I get the runtime error:
(null) lives at 0x411ae430 size 2227088 partially present

I monitored the processes at runtime with export PGI_ACC_NOTIFY=3.
Comparing against the one-process run, I know the program has to copy data to the device before the point where the error occurs, but it does not. :-(

I know this question is a bit “vague”, but what do I have to do to use several MPI processes with OpenACC (without using mpirun from PGI)?

Thanks in advance,
Stefan

PS: I want to add some system info; it might help:

  1. -> mpicxx --version :
    g++ (Ubuntu 5.4.1-11ubuntu2~16.04.york0) 5.4.1 20170519
    Copyright © 2015 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions. There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

  2. -> which pgc++ (it links to pgi 17.5):
    /opt/pgi/linux86-64/2017/bin/pgc++

  3. -> which mpirun :
    /home/rosenbs/src/petsc/./arch-linux2-c-debug//bin/mpirun

  4. -> mpirun -version :
    mpirun (Open MPI) 1.8.5

  5. And I can show you (parts of) my ~/.bashrc (I’m using Linux Mint 18.1)
    #MPI
    export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/opt/pgi/linux86-64/2017/mpi/openmpi/lib:/opt/pgi/linux86-64/17.5/lib/"$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/home/$USER/.openmpi/lib/"$LD_LIBRARY_PATH
    export PATH="$PATH:/home/$USER/.openmpi/bin"$PATH
    export PATH=$/home/rosenbs/src/petsc/bin:$PATH

    #PGI
    export PATH=$PATH/opt/pgi/linux86-64/2017/bin:
    export MANPATH=$MANPATH:/opt/pgi/linux86-64/2017/man
    export PGI=/opt/pgi
    export LM_LICENSE_FILE=$LM_LICENSE_FILE:/opt/pgi/license.dat

    #PETSc
    export PETSC_DIR=/home/rosenbs/src/petsc
    export PETSC_ARCH=./arch-linux2-c-debug/
    export MPI_ROOT=$PETSC_DIR/$PETSC_ARCH/bin
    export PATH=$MPI_ROOT:$PATH

Hi Stefan,

I have gotten PETSc to build with PGI 17.5 and the PGI build of Open MPI 1.10.2 here. However, as far as I know, PETSc itself does not appear to support OpenACC currently, only CUDA or OpenCL. Therefore, if I read what you wrote correctly, your application contains both OpenACC and PETSc calls.

First of all, could you state what commands you used to attempt to build PetSc with the PGI compilers and Open MPI?

Thanks,

+chris

Hi Chris,

thanks for the fast reply!

PETSc itself does not appear to support OpenACC currently, only CUDA or OpenCL. Therefore, if I read what you wrote correctly, your application contains both OpenACC and PETSc calls

Perhaps I described my application a bit too roughly; sorry for that! You are completely right: I have an application that USES PETSc, and I have written a separate, stand-alone C++ program with OpenACC calls. Now I want to combine the application I use (which needs PETSc) with my newly written C++/OpenACC program.

First of all, could you state what commands you used to attempt to build PetSc with the PGI compilers and Open MPI?

I tried several configure commands (none of which work on my system):

./configure -with-cc=pgcc -with-cxx=pgc++ -with-fc=pgfortran -download-fblaslapack -with-mpi-dir=/opt/pgi/linux86-64/2017/mpi/openmpi/lib -download-scalapack -download-mumps -download-parmetis -download-metis

Error: Fortran compiler you provided with --with-fc=pgfortran does not work.
Cannot compile FC with pgfortran.

Note that the compiler pgfortran exists on my system (I checked with which pgfortran).

./configure -with-cc=pgcc -with-cxx=pgc++ -with-fc=gfortran -download-fblaslapack -with-mpi-dir=/opt/pgi/linux86-64/2017/mpi/openmpi/lib -download-scalapack -download-mumps -download-parmetis -download-metis

Error: --with-mpi-dir=/opt/pgi/linux86-64/2017/mpi/openmpi/lib did not work

Note that I tried every path on my system for --with-mpi-dir that I thought might work.

./configure -with-cc=pgcc -with-cxx=pgc++ -with-fc=gfortran -download-fblaslapack -download-openmpi -download-scalapack -download-mumps -download-parmetis -download-metis

Error: **********************************************************************configure: WARNING: unrecognized options: --with-rsh
configure: WARNING:  -finline-functions has been added to CXXFLAGS
configure: WARNING:  -finline-functions has been added to CXXFLAGS
configure: WARNING: Open MPI now ignores the F77 and FFLAGS environment variables; only the FC and FCFLAGS environment variables are used.
configure: error: C and Fortran compilers are not link compatible.  Can not continue.
*******************************************************************************

As a consequence, I configured PETSc without PGI:

./configure -with-cc=gcc -with-cxx=g++ -with-fc=gfortran -download-fblaslapack -download-openmpi -download-scalapack -download-mumps -download-parmetis -download-metis

Which works on my system.

The next step was to combine my new application with the existing one. That worked, although I always had to use the mpirun from PETSc (note: it worked for ONE MPI process with OpenACC). The problem occurs when I try to run two MPI processes.
The program starts normally, but it does not copy the data to the device. It does launch (or I think it does) the OpenACC kernels on the device, which of course fails, because the data cannot be found. :-(


To compile my OpenACC application I use the command:

pgc++ -O3 -std=c++11 -DNDEBUG -DOPENACC -acc -ta=tesla:cc60,fastmath -DUSE_SR_NAMESPACE -lnuma  -Mvect -Mprefetch -DFAST_AMG -DNOSSE -DP2P_v1  -DMUMPS    -c toolbox_funcs.cpp -I. -I/home/rosenbs/src/petsc/arch-linux2-c-debug/include

Note that for other parts of the code (Fortran, Python, C) we of course have different commands.

The link line is a bit longer :-/ :

pgcc -o bench.petsc.pt -O3 -DNDEBUG -DOPENACC -acc -ta=tesla:cc60,fastmath -DUSE_SR_NAMESPACE -lnuma -Mvect -Mprefetch -DUSE_PETSc -DUSE_PT -DSOLVE_CURRENT -DWITH_BLAS -DWITH_PURK -D_GNU_SOURCE -I. -I../CARP -I../LIMPET -I../FEMLIB -I../NumComp -I../libredblack-1.3/include -I../PrM/include -I../PT_C -I/home/rosenbs/src/petsc/include -I/home/rosenbs/src/petsc/arch-linux2-c-debug/include cmdline.o clamp.o ap_analyzer.o restitute.o stretch.o bench.o -L../LIMPET -llimpet.petsc.pt -L../NumComp -lNumComp.petsc.pt -Wl,-rpath,/home/rosenbs/src/petsc/./arch-linux2-c-debug//lib -L/home/rosenbs/src/petsc/arch-linux2-c-debug/lib -Wl,-rpath,/home/rosenbs/src/petsc/arch-linux2-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/5 -L/usr/lib/gcc/x86_64-linux-gnu/5 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lpetsc -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lparmetis -lmetis -lscalapack -lflapack -lfblas -lhwloc -lX11 -lssl -lcrypto -lm -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lgfortran -lm -lgfortran -lm -lquadmath -lmpi_cxx -lstdc++ -lm -Wl,-rpath,/home/rosenbs/src/petsc/arch-linux2-c-debug/lib -L/home/rosenbs/src/petsc/arch-linux2-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/5 -L/usr/lib/gcc/x86_64-linux-gnu/5 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -ldl -Wl,-rpath,/home/rosenbs/src/petsc/arch-linux2-c-debug/lib -lmpi -lgcc_s -lpthread -ldl -lz -lm -ldl 

And for C++:

pgc++ -o carp.petsc.pt -O3 -std=c++11 -DNDEBUG -DOPENACC -acc -ta=tesla:cc60,fastmath -DUSE_SR_NAMESPACE -lnuma -Mvect -Mprefetch -DUSE_PETSc -DUSE_PT -DSOLVE_CURRENT -DWITH_BLAS -DWITH_PURK -D_GNU_SOURCE -I. -I../CARP -I../LIMPET -I../FEMLIB -I../NumComp -I../libredblack-1.3/include -I../PrM/include -I../PT_C -I/home/rosenbs/src/petsc/include -I/home/rosenbs/src/petsc/arch-linux2-c-debug/include carp.o mapping.o carp_p.o aux.o grid_output.o grid_info.o generic_grid_output.o cubic_hermite.o diffusion.o igb.o IOutils.o magnetics.o mesh.o Operators.o reformat_input.o Stimulation.o utils.o post_process.o electrics.o async-io.o volumes.o purkinje.o -L. -L../LIMPET -llimpet.petsc.pt -L../FEMLIB -lGlFEM.petsc.pt -L../NumComp -lNumComp.petsc.pt -L../libredblack-1.3/lib -lredblack -lm -lz -ldl -L../PT_C -lpt -Wl,-rpath,/home/rosenbs/src/petsc/./arch-linux2-c-debug//lib -L/home/rosenbs/src/petsc/arch-linux2-c-debug/lib -Wl,-rpath,/home/rosenbs/src/petsc/arch-linux2-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/5 -L/usr/lib/gcc/x86_64-linux-gnu/5 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lpetsc -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lparmetis -lmetis -lscalapack -lflapack -lfblas -lhwloc -lX11 -lssl -lcrypto -lm -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lgfortran -lm -lgfortran -lm -lquadmath -lmpi_cxx -lstdc++ -lm -Wl,-rpath,/home/rosenbs/src/petsc/arch-linux2-c-debug/lib -L/home/rosenbs/src/petsc/arch-linux2-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/5 -L/usr/lib/gcc/x86_64-linux-gnu/5 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -ldl -Wl,-rpath,/home/rosenbs/src/petsc/arch-linux2-c-debug/lib -lmpi -lgcc_s -lpthread -ldl -lparmetis -lmetis 

The answer is a bit long this time, sorry for that, but I’m not sure which information might help you understand my problem.

Thanks and greetings from Austria,
Stefan

Just to be clear, verify the following.

  1. You have been able to build PETSc with the PGI compilers without
    any OpenACC switches. We can.

  2. Your C++ OpenACC program builds and links with PETSc when you
    do NOT compile with any GPU/OpenACC switches, but use pgc++ to
    compile and link.

Note the following switches for pgc++ when linking objects built with
other PGI compilers and switches:

-acclibs Append Accelerator libraries to the link line
-cudalibs Link with CUDA-enabled libraries
-pgc++libs Append gnu compatible C++ libraries to the link line
-pgf77libs Append pgf77 libraries to the link line
-pgf90libs Append pgf90 libraries to the link line

These may help make things easier to link, without needing to be careful about library order.

Once you have determined that you can build and link your codes without
OpenACC using the PGI compilers, add -acc to your compilations. When that works, go back and try to build with other compilers.