21.7 Illegal instruction (core dumped)

When running any program compiled with the 21.7 compilers, I see:

/host/test> nvfortran test.f90
/host/test> ./a.out 
Illegal instruction (core dumped)
/host/test> cat test.f90
program main
end

There were no such problems with the 21.5 compilers. The compiler is installed in a container and run on a reasonably old machine. My guess is that some instruction (AVX?) is now required that previously was not, but I can’t see anything in the release notes. Any ideas? I can probably switch the container to a different machine if necessary, but it would be good to know which instruction is now required, and whether it’s available on the free runners on GitLab etc.

Many thanks,

Andy

Hi Andy,

Yes, typically this type of issue occurs when a binary built for a newer processor is run on an older processor that lacks support for a newer instruction set.

The compiler auto-detects the processor on which the binary is built and compiles the code accordingly. So I’m not sure whether the problem is that the processor is one we no longer support and the compiler is not detecting it, that the processor is somehow being misdetected due to the container, or that something in our runtime is not guarded so a new instruction is being used even on an older processor.

Can you provide the output from the command “nvcpuid” so we can see what the compiler is detecting as the processor? Is the info correct? If not, what’s the actual processor being used?

What happens if you manually set the target processor flag (-tp) as shown in the nvcpuid output, or if you use “-tp px” (target generic x86)?

-Mat

Thanks for your reply. The output of nvcpuid from the 21.7 install is:

> nvcpuid
vendor id       : GenuineIntel
model name      : Intel(R) Xeon(R) CPU           E5540  @ 2.53GHz
cpu family      : 6
model           : 26
name            : Nehalem 45nm
stepping        : 5
processors      : 16
threads         : 2
clflush size    : 8
L2 cache size   : 256KB
L3 cache size   : 8192KB
flags           : acpi apic cflush cmov cplds cx8 cx16 de dtes ferr fpu fxsr
flags           : ht lm mca mce mmx monitor msr mtrr nx pae pat pdcm pge
flags           : popcnt pse pseg36 selfsnoop speedstep sep sse sse2 sse3
flags           : ssse3 sse4.1 sse4.2 syscall tm tm2 tsc vme xtpr
default target  : -tp nehalem

This is identical to the output in the working 21.5 container. Setting the -tp option doesn’t help, I’m afraid:

/host/test> nvfortran test.f90
/host/test> ./a.out 
Illegal instruction (core dumped)
/host/test> nvfortran -tp px test.f90
/host/test> ./a.out 
Illegal instruction (core dumped)
/host/test> nvfortran -tp nehalem test.f90
/host/test> ./a.out 
Illegal instruction (core dumped)

All of those cases run fine with 21.5.

Andy

Hi Andy,

I talked with engineering, and it looks like they discontinued support for non-AVX-enabled x86_64 processors, so you won’t be able to use 21.7 on this system. They missed documenting this and putting the appropriate checks in the compiler drivers, for which we apologize; this will be corrected.
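
To check whether a particular machine or CI runner reports AVX before moving the container there, something like the following may help. It is only a minimal sketch, assuming a Linux host that exposes CPU flags in /proc/cpuinfo; the function name check_avx is made up for illustration and is not part of the HPC SDK.

```shell
#!/bin/sh
# check_avx is a hypothetical helper name, not an NVIDIA HPC SDK tool.
# It reads a cpuinfo-style file (default: /proc/cpuinfo, Linux only)
# and reports whether the "avx" CPU flag is listed.
check_avx() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # Look only at the "flags" lines; -w matches "avx" as a whole word,
    # so "avx2" or "avx512f" on their own do not count as a match.
    if grep -E '^flags' "$cpuinfo" | grep -qw avx; then
        echo "AVX supported"
    else
        echo "AVX not supported"
    fi
}

# Check the machine this script is running on.
check_avx
```

The nvcpuid flags list posted above for the E5540 has no avx entry, so on that machine the check would be expected to report that AVX is not supported.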

-Mat

Mat,

Thanks very much for looking into this for me. Now that I know failing on non-AVX processors is expected behaviour, I’ll look at moving the container to a different machine.

Andy

For the record, there seems to be a similar issue with at least the math builtins for C and C++. I’ll describe it here in the hope that it helps another user find this thread in a search.

I have seen the exact same issue with Fortran on a Nehalem system. In addition, a near-trivial (silly, in this example) use of the math library from C or C++ fails, only on such older systems, as follows:

$ cat badmath.c
#include <math.h>
int main(int argc) { return (int)sin((double)argc); }
$ nvc -lm badmath.c && ./a.out
Error during math dispatch processing...
__nvmath_abort:Math dispatch table is either misconfigured or corrupted.

I can (so far) work around this using -Mnobuiltin. However, based on Mat’s statement that CPUs of this age are no longer supported, I am not going to assume that this is a complete fix.