CUDA 3.1 crashes

Hello,

The CUDA 3.1 compiler crashes when trying to compile my project. What is worth checking?

/home/john/projects/CPM/include/WarpStandard.cuh(78): Error: Unaligned memory accesses not supported

*** glibc detected *** /usr/local/cuda/open64/lib//be: double free or corruption (out): 0x00000000015d05e0 ***

======= Backtrace: =========

/lib/libc.so.6[0x2ade921e62f6]

/lib/libc.so.6(cfree+0x6c)[0x2ade921eac6c]

/usr/local/cuda/open64/lib//be[0x5e3ef5]

/lib/libc.so.6(exit+0xe2)[0x2ade921a8c12]

/usr/local/cuda/open64/lib//be[0x6d546c]

/usr/local/cuda/open64/lib//be[0x6d5a1c]

/usr/local/cuda/open64/lib//be[0x54012d]

/usr/local/cuda/open64/lib//be[0x54142c]

/usr/local/cuda/open64/lib//be[0x571064]

/usr/local/cuda/open64/lib//be[0x571c15]

/usr/local/cuda/open64/lib//be[0x57251e]

/usr/local/cuda/open64/lib//be[0x57251e]

/usr/local/cuda/open64/lib//be[0x57163f]

/usr/local/cuda/open64/lib//be[0x57163f]

/usr/local/cuda/open64/lib//be[0x572ff1]

/usr/local/cuda/open64/lib//be[0x5720f2]

/usr/local/cuda/open64/lib//be[0x57679b]

/usr/local/cuda/open64/lib//be[0x5772c9]

/usr/local/cuda/open64/lib//be[0x5779ac]

/usr/local/cuda/open64/lib//be[0x553392]

/usr/local/cuda/open64/lib//be[0x405443]

/usr/local/cuda/open64/lib//be[0x4061f1]

/usr/local/cuda/open64/lib//be[0x40752d]

/lib/libc.so.6(__libc_start_main+0xfd)[0x2ade9218eabd]

/usr/local/cuda/open64/lib//be[0x4038da]

======= Memory map: ========

00400000-0084b000 r-xp 00000000 08:06 1715807							/usr/local/cuda/open64/lib/be

0094a000-0096e000 rw-p 0044a000 08:06 1715807							/usr/local/cuda/open64/lib/be

0096e000-00ecb000 rw-p 00000000 00:00 0 

00f93000-02385000 rw-p 00000000 00:00 0								  [heap]

2ade917a4000-2ade917c3000 r-xp 00000000 08:06 2405					   /lib/ld-2.10.1.so

2ade917c3000-2ade917c6000 rw-p 00000000 00:00 0 

2ade919c2000-2ade919c3000 r--p 0001e000 08:06 2405					   /lib/ld-2.10.1.so

2ade919c3000-2ade919c4000 rw-p 0001f000 08:06 2405					   /lib/ld-2.10.1.so

2ade919c4000-2ade91ab6000 r-xp 00000000 08:06 6145					   /usr/lib/libstdc++.so.6.0.13

2ade91ab6000-2ade91cb6000 ---p 000f2000 08:06 6145					   /usr/lib/libstdc++.so.6.0.13

2ade91cb6000-2ade91cbd000 r--p 000f2000 08:06 6145					   /usr/lib/libstdc++.so.6.0.13

2ade91cbd000-2ade91cbf000 rw-p 000f9000 08:06 6145					   /usr/lib/libstdc++.so.6.0.13

2ade91cbf000-2ade91cd4000 rw-p 00000000 00:00 0 

2ade91cd4000-2ade91d56000 r-xp 00000000 08:06 2416					   /lib/libm-2.10.1.so

2ade91d56000-2ade91f56000 ---p 00082000 08:06 2416					   /lib/libm-2.10.1.so

2ade91f56000-2ade91f57000 r--p 00082000 08:06 2416					   /lib/libm-2.10.1.so

2ade91f57000-2ade91f58000 rw-p 00083000 08:06 2416					   /lib/libm-2.10.1.so

2ade91f58000-2ade91f6e000 r-xp 00000000 08:06 1334					   /lib/libgcc_s.so.1

2ade91f6e000-2ade9216d000 ---p 00016000 08:06 1334					   /lib/libgcc_s.so.1

2ade9216d000-2ade9216e000 r--p 00015000 08:06 1334					   /lib/libgcc_s.so.1

2ade9216e000-2ade9216f000 rw-p 00016000 08:06 1334					   /lib/libgcc_s.so.1

2ade9216f000-2ade92170000 rw-p 00000000 00:00 0 

2ade92170000-2ade922d6000 r-xp 00000000 08:06 2410					   /lib/libc-2.10.1.so

2ade922d6000-2ade924d6000 ---p 00166000 08:06 2410					   /lib/libc-2.10.1.so

2ade924d6000-2ade924da000 r--p 00166000 08:06 2410					   /lib/libc-2.10.1.so

2ade924da000-2ade924db000 rw-p 0016a000 08:06 2410					   /lib/libc-2.10.1.so

2ade924db000-2ade924e2000 rw-p 00000000 00:00 0 

2ade92b79000-2ade92b7a000 rw-p 00000000 00:00 0 

2ade92b7a000-2ade92c1c000 rw-p 00000000 00:00 0 

2ade92c1c000-2ade938db000 rw-p 00000000 00:00 0 

2ade938db000-2ade938dc000 rw-p 00000000 00:00 0 

2ade94000000-2ade94021000 rw-p 00000000 00:00 0 

2ade94021000-2ade98000000 ---p 00000000 00:00 0 

7fffc98f4000-7fffc9909000 rw-p 00000000 00:00 0						  [stack]

7fffc9984000-7fffc9985000 r-xp 00000000 00:00 0						  [vdso]

ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0				  [vsyscall]

Signal: Aborted in Code_Expansion phase.

<input>(0): Error: Signal Aborted in phase Code_Expansion -- processing aborted

*** Internal stack backtrace:

	/usr/local/cuda/open64/lib//be [0x6d47bf]

	/usr/local/cuda/open64/lib//be [0x6d5409]

	/usr/local/cuda/open64/lib//be [0x6d4b5d]

	/usr/local/cuda/open64/lib//be [0x6d5da6]

	/lib/libc.so.6 [0x2ade921a3530]

	/lib/libc.so.6(gsignal+0x35) [0x2ade921a34b5]

	/lib/libc.so.6(abort+0x180) [0x2ade921a6f50]

	/lib/libc.so.6 [0x2ade921dc1b7]

	/lib/libc.so.6 [0x2ade921e62f6]

	/lib/libc.so.6(cfree+0x6c) [0x2ade921eac6c]

	/usr/local/cuda/open64/lib//be [0x5e3ef5]

	/lib/libc.so.6(exit+0xe2) [0x2ade921a8c12]

	/usr/local/cuda/open64/lib//be [0x6d546c]

	/usr/local/cuda/open64/lib//be [0x6d5a1c]

	/usr/local/cuda/open64/lib//be [0x54012d]

	/usr/local/cuda/open64/lib//be [0x54142c]

	/usr/local/cuda/open64/lib//be [0x571064]

	/usr/local/cuda/open64/lib//be [0x571c15]

	/usr/local/cuda/open64/lib//be [0x57251e]

	/usr/local/cuda/open64/lib//be [0x57251e]

	/usr/local/cuda/open64/lib//be [0x57163f]

	/usr/local/cuda/open64/lib//be [0x57163f]

	/usr/local/cuda/open64/lib//be [0x572ff1]

	/usr/local/cuda/open64/lib//be [0x5720f2]

	/usr/local/cuda/open64/lib//be [0x57679b]

	/usr/local/cuda/open64/lib//be [0x5772c9]

	/usr/local/cuda/open64/lib//be [0x5779ac]

	/usr/local/cuda/open64/lib//be [0x553392]

	/usr/local/cuda/open64/lib//be [0x405443]

	/usr/local/cuda/open64/lib//be [0x4061f1]

	/usr/local/cuda/open64/lib//be [0x40752d]

nvopencc INTERNAL ERROR: /usr/local/cuda/open64/lib//be died due to signal 4

CMake Error at CMakeFiles/potts_generated_potts.cu.o.cmake:258 (message):

  Error generating file /home/john/projects/CPM/./potts_generated_potts.cu.o

Can you post a repro? It shouldn’t crash.

We finally isolated one file that causes the compiler crash, but I can’t upload it because of an “Upload failed. Please ask the administrator to check the settings and permissions” error.

Thanks… we received the repro case and will investigate the crash issue.

In the meantime, it looks like the crash happens after the compiler reports “Error: Unaligned memory accesses not supported.” Since you’d need to fix that error before proceeding anyway, here is what I found in my testing: if you move your “extern __shared__ unsigned rngShmem” declaration out of the __global__ function and into file scope (referencing it in the __device__ function directly rather than passing a pointer to it into the __device__ function), then the unaligned memory access error, and consequently the crash, goes away.
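A minimal sketch of that rearrangement, under stated assumptions: only the name rngShmem comes from the thread. The array brackets on the declaration, the helper nextRandom, the kernel pottsKernel, the host function runSketch, and the simple linear congruential step are all hypothetical stand-ins, since the real code in WarpStandard.cuh was not posted.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// File-scope declaration of the dynamically sized shared buffer.
// (Assumption: the original was an unsized array; the thread shows only the name.)
extern __shared__ unsigned rngShmem[];

// Hypothetical device helper. It touches rngShmem directly instead of
// receiving a pointer to it as a parameter.
__device__ unsigned nextRandom()
{
    // Stand-in LCG step; the real generator is not shown in the thread.
    unsigned s = rngShmem[threadIdx.x];
    s = s * 1664525u + 1013904223u;
    rngShmem[threadIdx.x] = s;
    return s;
}

// Hypothetical kernel: seeds one shared slot per thread, then draws one value.
__global__ void pottsKernel(unsigned *out, unsigned seed)
{
    rngShmem[threadIdx.x] = seed + threadIdx.x;
    out[blockIdx.x * blockDim.x + threadIdx.x] = nextRandom();
}

// Launches one block and returns the number of mismatches against a host
// recomputation, or -1 if a CUDA call fails.
int runSketch()
{
    const int threads = 64;
    const unsigned seed = 12345u;
    unsigned hOut[threads];
    unsigned *dOut = 0;

    if (cudaMalloc((void **)&dOut, sizeof(hOut)) != cudaSuccess) return -1;

    // The third launch parameter sets the size of rngShmem in bytes.
    pottsKernel<<<1, threads, threads * sizeof(unsigned)>>>(dOut, seed);

    cudaError_t err = cudaMemcpy(hOut, dOut, sizeof(hOut), cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    if (err != cudaSuccess) return -1;

    int mismatches = 0;
    for (int i = 0; i < threads; ++i) {
        unsigned expected = (seed + i) * 1664525u + 1013904223u;
        if (hOut[i] != expected) ++mismatches;
    }
    return mismatches;
}

int main()
{
    int result = runSketch();
    printf("mismatches: %d\n", result);
    return result == 0 ? 0 : 1;
}
```

The point of the sketch is only the placement of the declaration: the shared buffer is visible to every function in the file, so no pointer to shared memory has to cross a function-call boundary.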

Hope this helps,

Cliff
