Is _mm_storeu_si128 generating aligned stores?

I came across this while trying to make a patch for Qt 5 to build with PGI. After a few source changes, I got qmake to build but it segfaults when run. I traced it back to a call to _mm_storeu_si128 unexpectedly raising a GPF from an aligned store instead of an unaligned store. I’ve got an isolated test case below:

#include <emmintrin.h>

int main(int argc, char **argv)
{
  // Fill register with dummy data
  __m128i xmm = _mm_set1_epi32(7);

  char buf[17];

  // At least one of these 2 calls is guaranteed to be unaligned

  _mm_storeu_si128(reinterpret_cast<__m128i*>(buf), xmm);
  _mm_storeu_si128(reinterpret_cast<__m128i*>(buf+1), xmm);

  return 0;
}
Since I’m using _mm_storeu_si128 and not _mm_store_si128, my understanding is that the store should succeed regardless of the destination's alignment. It does with the GCC, Clang, Intel, and Oracle compilers. The binary built with the PGI compiler, however, raises a General Protection Fault, which the debugger shows comes from a movapd, an instruction that requires a 16-byte-aligned memory operand.

Note: I tested with 11.3 on RHEL5 and 14.7 on RHEL6.
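
One possible stopgap while the intrinsic misbehaves is to route the store through memcpy, which places no alignment requirement on the destination. This is only a sketch: store_unaligned_128 and bytes_match are hypothetical helper names (not part of Qt, PGI, or emmintrin.h), and whether PGI compiles the memcpy without emitting an aligned move is an assumption that has not been verified here.

```cpp
#include <emmintrin.h>
#include <cassert>
#include <cstring>

// Hypothetical helper: copy the 16 bytes of `value` to `dst` using memcpy,
// which is valid for any destination alignment.
static inline void store_unaligned_128(void *dst, __m128i value)
{
    assert(dst != nullptr);
    std::memcpy(dst, &value, sizeof(value));
}

// Hypothetical helper: true if the 16 bytes at `p` equal the bytes of `expected`.
static inline bool bytes_match(const void *p, __m128i expected)
{
    return std::memcmp(p, &expected, sizeof(expected)) == 0;
}
```

Swapping store_unaligned_128(buf + 1, xmm) in for the _mm_storeu_si128 call in the test case above would show whether the fault is specific to the intrinsic's code generation.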

Hi Chuck,

It looks like you sent this to PGI Customer Support, who then reported it to our engineers as TPR#21212.

Best Regards,