Linking pgi (OpenACC) and icc (OpenMP) code

schnefia · April 20, 2015, 2:26pm

Hi,

I need to link a project in which some files contain OpenACC code compiled with the PGI compiler and others contain OpenMP code compiled with icc.
If I use the pgi compiler for linking, the following errors occur:

pgcc -fastsse -mp -acc -Minfo=accel -ta=nvidia,pin,cc20 -Mlarge_arrays  [...] -lm
[...] undefined reference to `__intel_new_feature_proc_init'
[...] undefined reference to `__kmpc_begin'
[...] undefined reference to `__kmpc_end'
[...] undefined reference to `_intel_fast_memset'
...
pgacclnk: child process exit status 1: /usr/bin/ld

Similarly, linking with icc leads to:

icc -O3 -ip -axCORE-AVX2,AVX,SSE4.2,SSE4.1 -fp-model fast=2  -openmp -g  -o [...] -lm
[...] undefined reference to `__pgi_uacc_dataenterstart'
[...] undefined reference to `__pgi_uacc_dataonb'
[...] undefined reference to `__pgi_uacc_dataonb'
[...] undefined reference to `__pgi_uacc_dataonb'
[...] undefined reference to `__pgi_uacc_dataenterdone'
...

Is there a trick to get this to work?

Thanks,
Fabian

MatColgrove · April 20, 2015, 3:46pm

Hi Fabian,

You need to add the runtime libraries for the other compiler. I don’t know what these would be for Intel, but for PGI they would be:

-L<path_to_pgi_lib> -laccapimp -laccgmp -laccnmp -laccg2mp -ldl -lcudadevice -lpgmp -lnuma -lpthread -lnspgc -lpgc

You can get this information by running:

pgcc -fastsse -mp -acc -Minfo=accel -ta=nvidia,pin,cc20,nordc -Mlarge_arrays -dryrun x.o

Note that I added “nordc” to the -ta sub-options. This disables the device linking step, which the Intel compiler does not perform. As a result, you’ll be unable to use some OpenACC features such as “routine”. I’ve only done a minimal amount of testing with Intel code calling PGI OpenACC code, and I have not tried it from Intel OpenMP code, but it should work.
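
To pull just the library flags out of the -dryrun output, something like the following may help. It is only a sketch: it assumes the linker invocation is available as one line of whitespace-separated tokens, and the helper name extract_link_flags, the sample line, and the path /path/to/pgi/lib are made up for illustration (real -dryrun output will look different and list more libraries).

```shell
# Hypothetical helper: keep only the -L<dir> and -l<lib> tokens of a linker line.
# The argument is split on whitespace, so quoted paths with spaces are not handled.
extract_link_flags() {
    out=""
    for tok in $1; do
        case "$tok" in
            -L?*|-l?*) out="${out:+$out }$tok" ;;
        esac
    done
    printf '%s\n' "$out"
}

# Made-up sample standing in for a real line of -dryrun output.
sample='/usr/bin/ld -o a.out x.o -L/path/to/pgi/lib -laccapimp -lpgmp -lpthread -lpgc'
flags=$(extract_link_flags "$sample")
echo "$flags"
```

The resulting flags can then be appended to the other compiler's link line.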

Hope this helps,
Mat

schnefia · April 22, 2015, 12:08pm

Hi Mat,

thank you for your fast answer!
With your tip, I got the code to link (with icc), but it crashed with a segmentation fault when executing an OpenACC kernel.
Finally, I got it to work by linking with pgcc (the -dryrun option also works with the Intel compiler).

Fabian

MatColgrove · April 22, 2015, 2:00pm

In the crashing case, did you compile with “-ta=tesla:nordc”?

Mat

schnefia · April 23, 2015, 6:15am

Hi Mat,

Yes, I did. If I don’t use nordc, I get the message:

 Accelerator Fatal Error: No CUDA device code available

But since linking with pgcc works fine, I will do the linking with the PGI compiler.

Thanks,
Fabian

MatColgrove · April 23, 2015, 2:04pm

Ok, thanks Fabian.