Relocatable objects with pgf90 and mpich

My general problem is that I need to link my code with the FLEXlm libraries on my build machine, but I also need to link with the MPICH libraries on the customer’s machine (because they install everything in unusual locations). My goal is to never have source code or non-stripped object code on the customer’s machine, yet still deliver an MPI-capable executable. I have just learned about partial linking with the -r option, and I suspect that this is what I need to use.

I want to test this locally, so I build each of my source files with:

pgf90 -w -Mpreprocess -Bstatic -DIS_MPI -fast -tp px -c dynamic.f

plus more of the same, and also two that look like this:

gcc -w -I/home/guest/flexlm/v10.1/machind -I/home/guest/flexlm/v10.1/i86_r9 -c license.c

(I guess I could also use pgcc for those.)

I create the relocatable object with:

pgf90 -Mnostdlib -w -Bstatic -r -I/home/guest/flexlm/v10.1/machind -L/home/guest/flexlm/v10.1/i86_r9 -llmgr_nomt -lcrvs -lsb -o reloc.o dynamic.o [....lots more .o files]

And I make the executable with:

mpif90 -w -Bstatic -Wl,--allow-multiple-definition -o runme reloc.o

I use the “--allow-multiple-definition” because the linker complains about multiple “main” and “init” routines. I only have one main. But it segfaults when I “mpirun” it.

I tried building with debugging information (-g), but gdb returns:

Breakpoint 1 at 0x8176d40
 
Program received signal SIGSEGV, Segmentation fault.
0xbfffe02c in ?? ()

Am I just misunderstanding relocatable objects? I couldn’t find any information in the pgiug.pdf manual, and the online “compiling and linking” tutorials don’t seem to address this need.

I use the “--allow-multiple-definition” because the linker complains about multiple “main” and “init” routines. I only have one main. But it segfaults when I “mpirun” it.

Sounds like your “main” program is in C. Try linking with “-Mnomain” so that the F90 main is not added to the link line.

  • Mat

The main is in the Fortran code, and the C code contains no main. I then thought that the problem came from statically linking the (unspecified but nonetheless necessary) system libraries twice: once for each link step. That’s wrong, too. Even non-static partial linking produces errors:

$ pgf90 -Mlfs -r -s -fast -tp px -I/home/guest/flexlm/v10.1/machind -L/home/guest/flexlm/v10.1/i86_r9 -llmgr_nomt -lcrvs -lsb -o reloc.o runfmm.o setup.o legendre.o direct.o fmm1.o buildtree.o octree.o mod_octree.o boxcoev.o boxcoes.o interaction.o search.o  multipole.o fmm2v.o fmm2s.o influence.o kernel.o diffuse.o vrm.o cottet.o input.o matrix.o quadrature.o output.o dynamic.o lump_vortons.o split_vortons.o vary_core_size.o streamline.o timer.o parallel.o direct_par.o orb.o let_salmon.o octreeg.o boxcoeg.o license.o lm_new.o

$ mpif90 -Mlfs -s -fast -tp px -o runfmm_mpif90_any reloc.o

reloc.o(.rodata+0x0): multiple definition of `_fp_hw'
/usr/lib/crt1.o(.rodata+0x0):../sysdeps/i386/elf/start.S:47: first defined here
reloc.o(.data+0x4): In function `__data_start':
: multiple definition of `__dso_handle'
/usr/lib/gcc-lib/i386-redhat-linux/3.2.2/crtbegin.o(.data+0x0): first defined here
reloc.o(.init+0x0): In function `_init':
: multiple definition of `_init'
/usr/lib/crti.o(.init+0x0):/usr/src/build/324954-i386/BUILD/glibc-2.3.2-200304020432/build-i386-linux/csu/crti.S:12: first defined here
reloc.o(.text+0x0): In function `_start':
: multiple definition of `_start'
/usr/lib/crt1.o(.text+0x0):../sysdeps/i386/elf/start.S:47: first defined here
reloc.o(.text+0xc0): In function `main':
: multiple definition of `main'
/usr/pgi/linux86/5.1/lib/f90main.o(.text+0x10): first defined here
reloc.o(.fini+0x0): In function `_fini':
: multiple definition of `_fini'
/usr/lib/crti.o(.fini+0x0): first defined here
reloc.o(.rodata+0x4): multiple definition of `_IO_stdin_used'
/usr/lib/crt1.o(.rodata+0x4):../sysdeps/i386/elf/start.S:53: first defined here
reloc.o(.data+0x0): In function `__data_start':
: multiple definition of `__data_start'
/usr/lib/crt1.o(.data+0x0):../sysdeps/i386/elf/start.S:47: first defined here

Adding “-Mnomain” to either of the link commands only removes the multiple definition of “main.” Adding “-Mnostdlib” to the first link command changes nothing.
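
One way to see at a glance which symbols collide is to filter the linker output down to the symbol names. The following is a minimal shell sketch, not part of the original post: the function name dup_symbols is made up for illustration, and it assumes the error text uses the GNU ld wording shown above ("multiple definition of `name'").

```shell
# dup_symbols (hypothetical helper): read linker output on stdin and print
# each symbol reported as multiply defined, once per symbol.
dup_symbols() {
  sed -n "s/.*multiple definition of [\`']\([^']*\)'.*/\1/p" | LC_ALL=C sort -u
}

# Demo on three lines copied from the error output above.
dup_symbols <<'EOF'
reloc.o(.rodata+0x0): multiple definition of `_fp_hw'
: multiple definition of `_init'
: multiple definition of `main'
EOF

# Typical use (assumes mpif90 is on the PATH):
#   mpif90 -Mlfs -s -fast -tp px -o runfmm_mpif90_any reloc.o 2>&1 | dup_symbols
```

If the list consists only of startup symbols such as _start, _init and _fini, the collision is with the system startup files rather than with user code.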

The driver adds several system init files (such as crtbegin.o) and runtime libraries to the link line. Since your relocatable object already has these symbols defined, you get multiple definitions when the driver links a second time. To correct this, use ld directly for the second link and add only the new MPICH libraries. You can use “-v” to have the driver display the link line it uses.

A simple example:

% pgf90 -r hello.f90 -o hello.o
% /usr/bin/ld hello.o -o hello.out
% hello.out
 HELLO

Hope this helps,
Mat
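
As a rough illustration of the advice above: once the driver's link line has been captured with “-v”, the startup objects named in the error messages (crt1.o, crti.o, crtbegin.o, f90main.o) can be filtered out before the rest is handed to ld. This is only a sketch under assumptions. The function name strip_startup_objects, the sample link line, the /path/to/mpich/lib directory and the -lmpich library name are hypothetical placeholders, and a real link line will be much longer.

```shell
# strip_startup_objects (hypothetical helper): read a link line on stdin and
# drop the startup objects that are already contained in the relocatable
# object. Assumes no path on the line contains spaces.
strip_startup_objects() {
  tr -s ' ' '\n' |
    grep -Ev '^$|(^|/)(crt[^/]*\.o|f90main\.o)$' |
    paste -s -d ' ' -
}

# Hypothetical, shortened link line as the driver might print it with -v.
sample='/usr/bin/ld /usr/lib/crt1.o /usr/lib/crti.o /usr/pgi/linux86/5.1/lib/f90main.o reloc.o -L/path/to/mpich/lib -lmpich -o runme'

# Print the reduced command; review it before running it by hand.
printf '%s\n' "$sample" | strip_startup_objects
```

The sketch only prints the reduced command instead of running it, so the result can be checked against the real “-v” output first.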

Wow. Every time I think I know something, I realize I’m probably in over my head. That works for the local build. I will try it on the client machines now. That ld command-line is huge.

Thank you so much for your help. I will remember this in the future.