Does nvcc default to -fno-strict-aliasing behavior?

I’m trying to understand how nvcc handles the C++ strict aliasing rule.

From what I’ve observed:

  • In standard C++, accessing an object through a pointer or reference obtained by reinterpret_cast to an unrelated type (anything other than char, unsigned char, or std::byte) violates the strict aliasing rule and is undefined behavior.

  • However, in CUDA / nvcc code it seems to be common practice to type-pun through reinterpret_cast<uint4*> to get 128-bit vectorized memory accesses, and this pattern is widely used in performance-critical libraries like CUTLASS.

  • I’ve also noticed that some large projects (e.g., PyTorch) explicitly disable -Wstrict-aliasing when including CUDA headers, and third-party compiler projects (like SCALE) document that nvcc does not perform strict aliasing-based optimizations.
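
To make the pattern concrete, here is a minimal host-side C++ sketch of what I mean. Vec4u is a hypothetical stand-in I made up for CUDA's built-in uint4, and the function names are mine too; this is not code from CUTLASS or PyTorch. The first function is the cast-based load I see in CUDA code, the other two are the memcpy form that ISO C++ defines.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for CUDA's uint4: four 32-bit lanes, 16-byte aligned.
struct alignas(16) Vec4u {
    std::uint32_t x, y, z, w;
};
static_assert(sizeof(Vec4u) == 16, "Vec4u must be exactly 128 bits");

// The pattern in question: read four floats through a pointer to an
// unrelated type. Formally this violates strict aliasing in ISO C++.
inline Vec4u load128_punned(const float* src) {
    // The cast also assumes 16-byte alignment, so check it in debug builds.
    assert(reinterpret_cast<std::uintptr_t>(src) % alignof(Vec4u) == 0);
    return *reinterpret_cast<const Vec4u*>(src);
}

// Well-defined alternative: copy the object representation with memcpy.
inline Vec4u load128_memcpy(const float* src) {
    Vec4u v;
    std::memcpy(&v, src, sizeof v);
    return v;
}

// Matching well-defined store of 128 bits back into a float buffer.
inline void store128_memcpy(float* dst, const Vec4u& v) {
    std::memcpy(dst, &v, sizeof v);
}
```

The memcpy versions are valid under the standard regardless of compiler flags, while load128_punned only behaves as intended if the compiler does not optimize on type-based alias analysis. Whether nvcc turns the memcpy form into a single 128-bit load in device code is not something this sketch establishes.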

This makes me suspect that nvcc effectively defaults to -fno-strict-aliasing semantics for device code. That is, it does not optimize based on type-based alias analysis, so type punning via pointer casts works reliably.

Is this understanding correct? Does nvcc intentionally relax the strict aliasing rule for GPU code? Or is there a more nuanced explanation?

Thanks in advance for any clarification!