Duplicated OpenMP functions defined in several libraries

Hi, all:

I’m trying to use the MAGMA library in my mixed CUDA Fortran and OpenACC code on Windows 10. After some effort I got MAGMA 2.50 compiled; it uses Intel MKL. But I get the following link errors when linking the library into my code:
VCOMP.lib(VCOMP140.DLL) : error LNK2005: omp_get_num_threads already defined in libpgc.lib(omp.obj)
VCOMP.lib(VCOMP140.DLL) : error LNK2005: omp_set_num_threads already defined in libpgc.lib(lcpu.obj)
VCOMP.lib(VCOMP140.DLL) : error LNK2005: omp_get_thread_num already defined in libpgc.lib(omp.obj)

It seems that these OpenMP functions are defined in several libraries: Intel MKL’s libiomp5md.lib, PGI’s libpgc.lib, and VCOMP.lib. So my question is: how do I resolve these duplicate symbol definitions?

In the link options I have already deliberately removed Intel’s libiomp5md.lib, although it would be best to use the functions from that library. So the best option would be to tell the pgf90 linker to ignore libpgc.lib and VCOMP.lib. Is there a PGI compiler option similar to /NODEFAULTLIB in Visual Studio to skip a specific default library? There is -Mnostdlib, but it disables all default libraries, which is too much.

Thanks in advance.

Jianhua

You can pass linker options via the “-Wl,” or “-Xlinker” compiler options, so using “-Xlinker /FORCE:MULTIPLE” will tell the Microsoft linker to ignore the multiple definitions. However, each symbol will resolve to the first definition the linker encounters (i.e. those in libpgc.lib), which will most likely cause issues when MKL tries to use OpenMP. Our new NV OpenMP runtime is compatible, but it is currently only available on Linux. We are working on a Windows port, but it won’t be available for a while. I would recommend that you avoid mixing OpenMP runtimes on Windows until this support is available.

Is there a PGI compiler option similar to /NODEFAULTLIB in Visual Studio to skip a specific default library?

Yes, “-nodefaultlib=”. However, the Fortran runtime depends on the PGC runtime library, so you won’t be able to remove it without encountering undefined reference errors.

I would recommend that you use either the non-OpenMP-enabled (sequential) MKL library or the PGI-provided OpenBLAS (-lblas), which is comparable in performance.

Failing that, the only thing I can think of that might work (no guarantees, and I’ve never tried it myself) is to save and edit the linker script to reorder the libraries, then run the link from the command line with /FORCE:MULTIPLE added.

Adding “-v” (verbose) will show you all the commands the pgf90 driver runs, including the link line. “--keeplnk” will save the linker script (named “pgi.lnk”), which you can then edit.

For example something like:

pgf90 -v --keeplnk ... rest of link options ...
edit pgi.lnk to reorder the libraries so VCOMP.lib comes before libpgc.lib
then run (the exact location of link.exe may be different on your system)
    "C:/Program Files (x86)/Microsoft Visual Studio/2017/Community/VC/Tools/MSVC/14.16.27023/bin/Hostx64/x64\link.exe" /NOLOGO /FORCE:MULTIPLE @pgi.lnk

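If you would rather not reorder pgi.lnk by hand, the edit step above could be scripted roughly as follows. This is an untested sketch, not a PGI tool: it assumes a POSIX shell with awk is available (for example under Cygwin) and, more importantly, that pgi.lnk holds one entry per line, which you should check against your own file first. The helper name reorder_lnk is made up for illustration.

```shell
#!/bin/sh
# Sketch only: print a linker response file with every line that mentions
# MOVE_LIB relocated to just before the first line that mentions BEFORE_LIB.
# ASSUMPTION: the file holds one entry per line; inspect your pgi.lnk first.
# reorder_lnk is a hypothetical helper name, not a PGI tool.
reorder_lnk() {
    lnk_file=$1
    move_lib=$2
    before_lib=$3
    awk -v mv="$move_lib" -v bf="$before_lib" '
        { lines[NR] = $0 }                          # buffer the whole file
        END {
            placed = 0
            for (i = 1; i <= NR; i++) {
                if (index(lines[i], mv)) continue   # emitted elsewhere
                if (!placed && index(lines[i], bf)) {
                    for (j = 1; j <= NR; j++)
                        if (index(lines[j], mv)) print lines[j]
                    placed = 1
                }
                print lines[i]
            }
            # BEFORE_LIB never appeared: keep the moved lines at the end
            if (!placed)
                for (j = 1; j <= NR; j++)
                    if (index(lines[j], mv)) print lines[j]
        }
    ' "$lnk_file"
}
```

For example, "reorder_lnk pgi.lnk VCOMP.lib libpgc.lib > pgi_reordered.lnk" writes a reordered copy and leaves the original untouched; you would then pass @pgi_reordered.lnk to link.exe instead of @pgi.lnk.
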
Hi, Mat:

Thanks for your detailed explanation and suggestions. I tried “-Xlinker /FORCE:MULTIPLE” and, to my surprise, it worked very well: my code gives exactly the same results as the Intel-compiled sequential code. So it seems that Intel MKL can work with the OpenMP implementation in libpgc.lib.

Here is the output from the linker with the -Xlinker /FORCE:MULTIPLE option:
VCOMP.lib(VCOMP140.DLL) : warning LNK4006: omp_get_num_threads already defined in libpgc.lib(omp.obj); second definition ignored
VCOMP.lib(VCOMP140.DLL) : warning LNK4006: omp_set_num_threads already defined in libpgc.lib(lcpu.obj); second definition ignored
VCOMP.lib(VCOMP140.DLL) : warning LNK4006: omp_get_thread_num already defined in libpgc.lib(omp.obj); second definition ignored

Note that I didn’t link against Intel’s libiomp5md.lib, so I’m definitely using the OpenMP from libpgc.

Thanks again for your help.

Jianhua

OK, good, though is MKL actually running with multiple OpenMP threads? I’ve never tried this myself and rarely use Windows, so I don’t really know, but the old PGI OpenMP runtime uses inlined OpenMP regions while Intel’s runtime and our newer NVOMP use outlined regions. The two implementation methods aren’t compatible, which is why I’m suspicious. Though if it works, that’s great! Worst case, MKL runs but not in parallel, and any loss in performance there would be gained back by using OpenACC and a GPU.
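
One rough way to check is to time the same run with the thread count pinned to 1 and then to a larger value. The sketch below only sets the environment: OMP_NUM_THREADS is the standard OpenMP variable and MKL_NUM_THREADS is MKL’s own, but whether they are honored in this mixed-runtime setup is an assumption, and run_with_threads is a made-up helper name.

```shell
#!/bin/sh
# Sketch only: run a command with the OpenMP and MKL thread counts pinned.
# OMP_NUM_THREADS is the standard OpenMP variable and MKL_NUM_THREADS is
# the MKL one; ASSUMPTION: the runtime actually linked in honors them.
# run_with_threads is a hypothetical helper name.
run_with_threads() {
    nthreads=$1
    shift
    OMP_NUM_THREADS=$nthreads MKL_NUM_THREADS=$nthreads "$@"
}
```

For example, compare "time run_with_threads 1 ./app.exe" with "time run_with_threads 4 ./app.exe", where app.exe stands in for your own executable. If the wall-clock times are about the same, MKL is probably running on a single thread, though the comparison only means much when the MKL calls dominate the run time.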

Hi, Mat:

I’m also curious whether Intel MKL runs in parallel. I’m still debugging and haven’t done any performance testing yet; I will let you know once I find out. But as you mentioned, I have already noticed a significant speedup in my parallel GPU code.

Thanks,

Jianhua