I’m trying to port some projects over to PGI, but I keep running into problems with atomics and thread-local storage. First off, TLS…
Is there any way to determine whether -c11 was passed to the compiler or, better yet, whether TLS is supported? It would also be acceptable to use a PGI-specific construct, if one exists (preferable, even, if it doesn’t require a special flag). Basically, I’m trying to port something like this to PGI:
#if defined(_Thread_local) || (defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 201112L))
# define THREAD_LOCAL _Thread_local
#elif defined(__GNUC__) || defined(__INTEL_COMPILER) || defined(__SUNPRO_CC) || defined(__IBMCPP__)
# define THREAD_LOCAL __thread
#elif defined(_WIN32)
# define THREAD_LOCAL __declspec(thread)
#else
# error No TLS implementation found.
#endif
_Thread_local seems to work with PGI, but ONLY if -c11 is passed.
__STDC_VERSION__ is defined as 199901L even if -c11 is passed, so my current code emits an error. I can understand that; PGI’s C11 support is still incomplete, so advertising it in __STDC_VERSION__ would be premature. Unfortunately, though, it puts me in a tough spot… AFAICT there is no way to tell in the preprocessor whether the compiler supports _Thread_local.
Normally I would use __PGIC__, __PGIC_MINOR__, and __PGIC_PATCHLEVEL__ to check for support (and ignore __STDC_VERSION__), but since TLS only works in C11 mode (unlike other compilers) that doesn’t do much good. I’ve also tried using C11 macros like __STDC_NO_THREADS__ and __STDC_NO_ATOMICS__ in hopes of detecting whether PGI is in C11 mode, but unlike _Thread_local they’re defined in PGI’s C99 mode, too.
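One stopgap, sketched below, is to let whoever builds the project supply the keyword themselves when the preprocessor cannot detect it. This is only an illustration under assumptions: MYLIB_THREAD_LOCAL, mylib_counter, and mylib_bump are hypothetical names, and the PGI invocation mentioned in the comment is a guess at how a user would opt in, not a documented recipe.

```c
/* Hypothetical override hook: if the user already defined MYLIB_THREAD_LOCAL
 * (for example by passing -DMYLIB_THREAD_LOCAL=_Thread_local together with
 * -c11 on a PGI build), trust it and skip detection entirely. */
#if defined(MYLIB_THREAD_LOCAL)
  /* user-supplied definition, nothing to do */
#elif defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 201112L)
# define MYLIB_THREAD_LOCAL _Thread_local
#elif defined(__GNUC__) || defined(__INTEL_COMPILER) || defined(__SUNPRO_CC) || defined(__IBMCPP__)
# define MYLIB_THREAD_LOCAL __thread
#elif defined(_WIN32)
# define MYLIB_THREAD_LOCAL __declspec(thread)
#else
# error No TLS implementation found; define MYLIB_THREAD_LOCAL manually.
#endif

/* Each thread gets its own copy of this counter. */
static MYLIB_THREAD_LOCAL int mylib_counter = 0;

/* Increment the calling thread's counter and return the new value. */
int mylib_bump(void)
{
    return ++mylib_counter;
}
```

This does not solve the detection problem; it only moves the decision to the build line, which conflicts with the "drop in the header and be done" goal.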
As for atomics, is there some variant of atomics which PGI supports that I’m missing? I already have support for
- Old GCC-style (__sync_*)
- New GCC-style (__atomic_*)
- clang-style (__c11_*)
- C11-style (stdatomic.h)
- MS-style (Interlocked*)
I’m happy to add another method, but I can’t seem to figure out how to do atomics in PGI. So far my best guess is to require OpenACC or maybe OpenMP, but that’s some pretty significant overhead and I’d strongly prefer something which doesn’t require a compiler flag; this is for a reusable header which you can currently just drop into any C project and be done with it.
Also, I have a PRNG which requires CAS (or a lock), but I don’t see a way to do an atomic compare and swap with OpenACC. This seems like an odd omission… am I missing something, or should I just fall back on a spinlock for OpenACC?
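For comparison, here is a minimal sketch of the lock-based fallback on the host side. It assumes OpenMP threads rather than OpenACC device code, and it emulates CAS with a named critical section. The helper names (prng_load_u64, prng_cas_u64, prng_next) and the xorshift64 step are illustrative assumptions, not the actual PRNG. If OpenMP is not enabled, the pragmas are ignored and the code is only safe single-threaded.

```c
#include <stdint.h>

/* Read the shared state under the same lock the CAS uses. */
static uint64_t prng_load_u64(uint64_t *ptr)
{
    uint64_t value;
    #pragma omp critical(prng_state)
    {
        value = *ptr;
    }
    return value;
}

/* Emulated compare-and-swap: returns 1 and stores `desired` if *ptr equals
 * *expected; otherwise returns 0 and writes the current value to *expected. */
static int prng_cas_u64(uint64_t *ptr, uint64_t *expected, uint64_t desired)
{
    int swapped = 0;
    #pragma omp critical(prng_state)
    {
        if (*ptr == *expected) {
            *ptr = desired;
            swapped = 1;
        } else {
            *expected = *ptr;
        }
    }
    return swapped;
}

/* Advance the shared state with a CAS retry loop (xorshift64 as a stand-in). */
static uint64_t prng_next(uint64_t *state)
{
    uint64_t old = prng_load_u64(state);
    uint64_t next;
    do {
        next = old;
        next ^= next << 13;
        next ^= next >> 7;
        next ^= next << 17;
    } while (!prng_cas_u64(state, &old, next));
    return next;
}
```

With a lock already in play, the whole update could simply run inside the critical section. The CAS form is kept here only so the PRNG code has the same shape as the other atomics back ends.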