I came across this while working on a patch to build Qt 5 with PGI. After a few source changes I got qmake to build, but it segfaults when run. I traced the crash to a call to _mm_storeu_si128 that raises a GPF because the compiler emitted an aligned store instead of an unaligned one. Here is an isolated test case:
#include <emmintrin.h>

int main(int argc, char **argv)
{
    // Fill register with dummy data
    __m128i xmm = _mm_set1_epi32(7);
    char buf[17];

    // buf and buf+1 cannot both be 16-byte aligned, so at least one
    // of these 2 calls is guaranteed to be unaligned
    _mm_storeu_si128(reinterpret_cast<__m128i*>(buf), xmm);
    _mm_storeu_si128(reinterpret_cast<__m128i*>(buf+1), xmm);
    return 0;
}
Since I'm using _mm_storeu_si128 rather than _mm_store_si128, my understanding is that the store should succeed regardless of alignment, and it does with the GCC, Clang, Intel, and Oracle compilers. The code generated by the PGI compiler, however, raises a General Protection Fault, which the debugger shows comes from a movapd instruction.
Note: I tested with PGI 11.3 on RHEL5 and PGI 14.7 on RHEL6.
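For anyone who wants to narrow this down further, here is a rough sketch of a slightly more talkative version of the test case. It reports whether each destination pointer happens to be 16-byte aligned before storing to it, then reads the bytes back to confirm the store wrote what was expected. The helper names (is_aligned16, store_and_check, run_store_checks) are made up for illustration and are not part of any library. I would expect it to report zero failures with a compiler that emits an unaligned store; I have not verified what it prints under PGI.

```cpp
#include <emmintrin.h>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Hypothetical helper: true if p sits on a 16-byte boundary.
static bool is_aligned16(const void *p)
{
    return (reinterpret_cast<std::uintptr_t>(p) & 15u) == 0;
}

// Hypothetical helper: store `value` at `p` with the unaligned-store
// intrinsic, then compare the 16 bytes at `p` with the register contents.
static bool store_and_check(char *p, __m128i value)
{
    char expected[16];
    std::memcpy(expected, &value, sizeof expected);
    _mm_storeu_si128(reinterpret_cast<__m128i*>(p), value);
    return std::memcmp(p, expected, sizeof expected) == 0;
}

// Hypothetical driver: runs the two stores from the test case above and
// returns how many of them wrote unexpected data. If the compiler emits an
// aligned store, the process faults instead of returning.
static int run_store_checks()
{
    __m128i xmm = _mm_set1_epi32(7);
    char buf[17];
    int failures = 0;

    for (std::size_t offset = 0; offset < 2; ++offset) {
        assert(offset + 16 <= sizeof buf);  // stay inside the buffer
        char *p = buf + offset;
        std::printf("offset %u: %s, ", static_cast<unsigned>(offset),
                    is_aligned16(p) ? "aligned" : "unaligned");
        std::fflush(stdout);  // flush first so the line survives a fault
        bool ok = store_and_check(p, xmm);
        std::printf("store %s\n", ok ? "ok" : "mismatch");
        if (!ok)
            ++failures;
    }
    return failures;
}
```

Calling run_store_checks() from main and returning its result as the exit code gives a quick pass/fail check. The flush before each store means the last line printed shows which offset was being written when the fault occurred.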