Bug Report: __sync_val_compare_and_swap Erroneously Optimized Away

iandhenriksen · May 8, 2025, 7:58pm

Hi! I just ran across a bug in the nvc and nvc++ compilers. Some older C code uses an idiom where the legacy GCC builtin __sync_val_compare_and_swap performs an atomic store and the result is discarded. It’s an odd idiom, but technically valid. Unfortunately, with optimizations enabled in the latest nvc and nvc++ (25.3), the compiler removes the call entirely. The optimizer seems to treat __sync_val_compare_and_swap as a pure function or something similar. With C11 this is easily rewritten using atomic_store_explicit, and that is a viable workaround for me in this case, but I wanted to forward the bug report because the cause may not be easy to track down in other cases where this shows up.
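For reference, the C11 rewrite mentioned above might look roughly like the sketch below. The names flag, set_flag, and get_flag are hypothetical and only for illustration. The choice of memory_order_seq_cst is an assumption, picked because the legacy __sync builtins are documented as full barriers. Note also that a plain store is unconditional, whereas the compare-and-swap only writes when the current value is 0, so the two are interchangeable only where that difference does not matter.

```c
#include <stdatomic.h>

/* Hypothetical flag standing in for the variable the old code updated. */
static atomic_int flag;

/* Unconditional atomic store, replacing the discarded-result CAS idiom.
   seq_cst is an assumption chosen to mirror the full-barrier behavior
   of the __sync builtins; a weaker order may be enough in practice. */
void set_flag(void) {
    atomic_store_explicit(&flag, 1, memory_order_seq_cst);
}

/* Atomic read of the flag, so a caller can observe the store. */
int get_flag(void) {
    return atomic_load_explicit(&flag, memory_order_seq_cst);
}
```

Calling set_flag() and then get_flag() should yield 1 regardless of optimization level.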

Here’s a minimal example to reproduce the issue:

static volatile int val;

int main() {
    (void)__sync_val_compare_and_swap(&val, 0, 1);
}

Looking at the generated assembly, with -O1 set, the intrinsic is optimized out, but without optimization the correct code is generated. If you don’t want to look at the assembly directly you can also just read the value after the intrinsic and confirm whether the store actually happened.
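A rough sketch of that read-back check, for anyone who prefers not to read assembly. The helper name value_after_cas is hypothetical, and the explicit reset to 0 is only there so the check can be called more than once.

```c
static volatile int val;

/* Runs the discarded-result CAS, then reads val back.
   With a correct compiler this returns 1, because val is 0 beforehand
   and therefore matches the comparand. A return value of 0 means the
   intrinsic was dropped. */
int value_after_cas(void) {
    val = 0;
    (void)__sync_val_compare_and_swap(&val, 0, 1);
    return val;
}
```

Calling this from main and returning a nonzero exit status when the result is not 1 gives a quick pass/fail test to run at different optimization levels.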

Here are examples showing this behavior on the godbolt compiler explorer:
unoptimized (correct): Compiler Explorer
optimized (incorrect): Compiler Explorer

This bug is not present in earlier versions of nvc and nvc++, just 25.3.

MatColgrove · May 9, 2025, 3:37pm

Hi iandhenriksen and welcome,

I was able to track this down to a change where, for performance reasons, the compiler started inlining the compare and swap. Previously we’d call the libatomic version.

The problem here is that since “val” is volatile, the compiler shouldn’t be removing the instruction as part of its dead-code elimination. I filed a problem report, TPR #37379, and sent it to engineering for investigation.

Thanks for the report!
Mat

iandhenriksen · May 9, 2025, 3:48pm

Excellent. Thank you for forwarding this upstream!
