Hello,
I am testing the new 21.11 HPC compilers, and I am seeing some surprising behaviour with the -tp parameter.
The test program, test.c, just prints whether or not #ifdef __ABM__ evaluates true.
With 21.9, this macro was never automatically defined on our system, which has a Skylake Xeon Gold 6140:
$ nvc++ -V
nvc++ 21.9-0 64-bit target on x86-64 Linux -tp skylake
$ nvc -V21.9 test.c && ./a.out
__ABM__ not defined
$ nvc -V21.9 -tp host test.c && ./a.out
__ABM__ not defined
$ nvc -V21.9 -tp skylake-avx512 test.c && ./a.out
__ABM__ not defined
$ nvc -V21.9 -tp zen2 test.c && ./a.out
__ABM__ defined
With 21.11 the situation has changed. First, it seems this CPU is detected as skylake-avx512 instead of skylake (fair enough):
$ nvc -V
nvc 21.11-0 64-bit target on x86-64 Linux -tp skylake-avx512
but by default the __ABM__ macro is now defined:
$ nvc -V21.11 test.c && ./a.out
__ABM__ defined
$ nvc -V21.11 -tp host test.c && ./a.out
__ABM__ defined
but it is not defined if I explicitly specify the skylake-avx512 architecture that nvc -V tells me was detected:
$ nvc -V21.11 -tp skylake-avx512 test.c && ./a.out
__ABM__ not defined
Is this intentional? It seems surprising that -tp host behaves differently to -tp arch_that_is_apparently_detected_for_host, but maybe I am over-interpreting the -tp X part of the nvc -V output.
The background here is that one of our dependencies assumes that if the macro __ABM__ is defined then intrin.h exists, and that assumption now fails on our system with 21.11.
In the meantime, I suppose we can explicitly pass -tp skylake, or patch the dependency.
Best, Olli