__ATOMIC_HLE_RELEASE undefined - Fail to build OpenMPI with nvhpc 23.11 and 24.1 (success with 23.7)

Hello everyone,

I am unable to build OpenMPI 4.1.5 with nvhpc 23.11 or 24.1. The make step fails with:

"../../opal/include/opal/sys/gcc_builtin/atomic.h", line 258: error: identifier "__ATOMIC_HLE_RELEASE" is undefined
                         __ATOMIC_RELEASE | __ATOMIC_HLE_RELEASE);
                                            ^

All three tests (two failing, one succeeding) were done in a CUDA 12 + GCC 12.3.0 environment (if this matters).

Has anyone seen this issue before?

Best regards,
Jens Henrik

Hi Jens,

Try passing -mno-hle to nvc via CFLAGS to see if it works around the issue for you.

I think we ultimately ended up patching the Open MPI source code in our builds to fix this issue, as it does not handle nvc correctly. (It assumes that when the __HLE__ macro is defined, __ATOMIC_HLE_ACQUIRE and __ATOMIC_HLE_RELEASE are defined as well, which is true for gcc but not for nvc or clang.)
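To see which of these macros a given compiler actually defines, a small probe along the following lines can help. This is only a sketch and not part of Open MPI; the function names hle_target_enabled, hle_atomic_flags_defined, and report_hle_support are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: 1 if the compiler defines __HLE__, else 0. */
static int hle_target_enabled(void)
{
#if defined(__HLE__)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: 1 only if both HLE memory-order flags
 * exist as preprocessor macros, else 0. */
static int hle_atomic_flags_defined(void)
{
#if defined(__ATOMIC_HLE_ACQUIRE) && defined(__ATOMIC_HLE_RELEASE)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: prints both results on one line. A "1" for
 * __HLE__ combined with a "0" for the flags is the combination that
 * breaks the unpatched header. */
static void report_hle_support(void)
{
    printf("__HLE__ defined: %d, __ATOMIC_HLE_* defined: %d\n",
           hle_target_enabled(), hle_atomic_flags_defined());
}
```

Calling report_hle_support() from a main() built once with each compiler (and with the same flags used for the Open MPI build) should show whether the two conditions disagree.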

Hope this helps.

+chris

Hello Chris,

Thank you for these very helpful hints.
I tried to build OpenMPI with -mno-hle and it succeeds. :)

While reading more about Hardware Lock Elision (for example, the LWN article "Lock elision in the GNU C library"), I came to the conclusion that disabling it could cause a significant performance decrease, especially for OpenMPI.

By any chance, can you share your patch?

Best regards,
Jens Henrik

Yes, -mno-hle is a pretty big hammer. However, nvc doesn’t support all of the HLE locking macros that gcc does. Our developer explained to me that this is due to the fact that our compilers are based on LLVM, and LLVM does not have support for all the HLE functionality at present.

Here is a patch you can try:

diff -ur a/opal/include/opal/sys/gcc_builtin/atomic.h b/opal/include/opal/sys/gcc_builtin/atomic.h
--- a/opal/include/opal/sys/gcc_builtin/atomic.h        2023-02-22 20:25:04.000000000 -0800
+++ b/opal/include/opal/sys/gcc_builtin/atomic.h        2024-02-07 14:58:56.913208249 -0800
@@ -219,7 +219,7 @@
 
 #endif
 
-#if defined(__HLE__)
+#if defined(__HLE__) && defined(__ATOMIC_HLE_ACQUIRE) && defined(__ATOMIC_HLE_RELEASE)
 
 #include <immintrin.h>
 

cd to the top of your openmpi-4.1.5 source directory and apply this patch via:

patch -p1 < (patch file)

In essence, this patch changes the macro test at line 222 of opal/include/opal/sys/gcc_builtin/atomic.h so that the HLE code path is only compiled when the macros it uses are actually defined. nvc does not currently support these macros, and neither does clang, so nvc essentially follows the same path as clang here.
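The effect of the stricter guard can be illustrated with a small standalone lock. This is a sketch of the pattern only, not the Open MPI code: the names SKETCH_HAVE_HLE, sketch_lock_t, sketch_trylock, and sketch_unlock are hypothetical, and the return convention is chosen for the example.

```c
#include <stdint.h>

/* Same condition as the patched test: use the HLE flags only when the
 * target enables HLE AND the compiler defines both flag macros. */
#if defined(__HLE__) && defined(__ATOMIC_HLE_ACQUIRE) && defined(__ATOMIC_HLE_RELEASE)
#define SKETCH_HAVE_HLE 1
#else
#define SKETCH_HAVE_HLE 0
#endif

/* Hypothetical lock type: 0 means unlocked, 1 means locked. */
typedef struct {
    int32_t value;
} sketch_lock_t;

/* Try to take the lock; returns 1 if acquired, 0 if already held. */
static inline int sketch_trylock(sketch_lock_t *lock)
{
#if SKETCH_HAVE_HLE
    int32_t prev = __atomic_exchange_n(&lock->value, 1,
                                       __ATOMIC_ACQUIRE | __ATOMIC_HLE_ACQUIRE);
#else
    /* Fallback when the HLE flags are unavailable: plain acquire. */
    int32_t prev = __atomic_exchange_n(&lock->value, 1, __ATOMIC_ACQUIRE);
#endif
    return prev == 0;
}

/* Release the lock. */
static inline void sketch_unlock(sketch_lock_t *lock)
{
#if SKETCH_HAVE_HLE
    __atomic_store_n(&lock->value, 0,
                     __ATOMIC_RELEASE | __ATOMIC_HLE_RELEASE);
#else
    /* Fallback when the HLE flags are unavailable: plain release. */
    __atomic_store_n(&lock->value, 0, __ATOMIC_RELEASE);
#endif
}
```

With a compiler that defines __HLE__ but not the two flag macros, the preprocessor selects the plain acquire/release branch, so the undefined identifier is never seen by the compiler.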

If you need the full HLE support, you may have to resort to configuring Open MPI as a hybrid build with CC=gcc and FC=nvfortran, and then fix up the wrapper configs afterward to make nvc the default C compiler for the MPI wrappers instead of gcc.

Hope this helps.

Good luck,

+chris

Great, thank you.
I will add your patch to our setup and check whether we need to switch to an OpenMPI build with GCC in the long term.

Best,
Jens Henrik