Hi,
I get a segmentation fault in pthread_mutex_lock() when I run my OpenACC code on the GPU; on the CPU it runs fine. The segmentation fault happens right at the first acc data copy directive. Do you have any idea what might be wrong? The gdb backtrace and the valgrind output are below. I can provide further information or access to the code if necessary.
Thank you and regards,
Thomas
Thread 1 "ftg_vdiff_up_te" received signal SIGSEGV, Segmentation fault.
0x00002aaabcce7714 in pthread_mutex_lock () from /lib64/libpthread.so.0
Missing separate debuginfos, use: zypper install libjasper1-debuginfo-1.900.14-195.3.1.x86_64 libjpeg62-debuginfo-62.1.0-30.1.x86_64 libjpeg8-debuginfo-8.0.2-30.3.x86_64 liblzma5-debuginfo-5.0.5-4.852.x86_64 libnuma1-debuginfo-2.0.9-9.1.x86_64 libpython2_7-1_0-debuginfo-2.7.13-27.1.x86_64 libxml2-2-debuginfo-2.9.4-46.3.2.x86_64 libz1-debuginfo-1.2.8-11.1.x86_64
(gdb) bt
#0 0x00002aaabcce7714 in pthread_mutex_lock () from /lib64/libpthread.so.0
#1 0x00002aaaacdaff88 in ?? ()
from /usr/lib64/gcc/x86_64-suse-linux/4.8/…/…/…/…/lib64/libcuda.so.1
#2 0x00002aaaace66471 in ?? ()
from /usr/lib64/gcc/x86_64-suse-linux/4.8/…/…/…/…/lib64/libcuda.so.1
#3 0x00002aaaace665e5 in ?? ()
from /usr/lib64/gcc/x86_64-suse-linux/4.8/…/…/…/…/lib64/libcuda.so.1
#4 0x00002aaaacdb5eb4 in ?? ()
from /usr/lib64/gcc/x86_64-suse-linux/4.8/…/…/…/…/lib64/libcuda.so.1
#5 0x00002aaaacdb7707 in ?? ()
from /usr/lib64/gcc/x86_64-suse-linux/4.8/…/…/…/…/lib64/libcuda.so.1
#6 0x00002aaaacd8a266 in ?? ()
from /usr/lib64/gcc/x86_64-suse-linux/4.8/…/…/…/…/lib64/libcuda.so.1
#7 0x00002aaaacdd79ed in cuInit ()
from /usr/lib64/gcc/x86_64-suse-linux/4.8/…/…/…/…/lib64/libcuda.so.1
#8 0x00002aaaac9a9dd5 in ?? ()
from /apps/common/UES/pgi/17.10/linux86-64/2017/cuda/8.0/lib64/libcudart.so.8.0
#9 0x00002aaaac9a9e31 in ?? ()
from /apps/common/UES/pgi/17.10/linux86-64/2017/cuda/8.0/lib64/libcudart.so.8.0
#10 0x00002aaabcce3c13 in __pthread_once_slow () from /lib64/libpthread.so.0
#11 0x00002aaaac9dc919 in ?? ()
from /apps/common/UES/pgi/17.10/linux86-64/2017/cuda/8.0/lib64/libcudart.so.8.0
#12 0x00002aaaac9a600a in ?? ()
from /apps/common/UES/pgi/17.10/linux86-64/2017/cuda/8.0/lib64/libcudart.so.8.0
#13 0x00002aaaac9a9ceb in ?? ()
from /apps/common/UES/pgi/17.10/linux86-64/2017/cuda/8.0/lib64/libcudart.so.8.0
#14 0x00002aaaac9cbd2a in cudaFree ()
from /apps/common/UES/pgi/17.10/linux86-64/2017/cuda/8.0/lib64/libcudart.so.8.0
#15 0x00002aaaae61aeaf in __pgi_uacc_cuda_initdev ()
from /apps/common/UES/pgi/17.10/linux86-64/17.10/lib/libaccncmp.so
#16 0x00002aaaae402eaa in __pgi_uacc_enumerate ()
from /apps/common/UES/pgi/17.10/linux86-64/17.10/lib/libaccgmp.so
#17 0x00002aaaae4033c3 in __pgi_uacc_initialize ()
from /apps/common/UES/pgi/17.10/linux86-64/17.10/lib/libaccgmp.so
#18 0x00002aaaae3f9e3b in __pgi_uacc_dataenterstart ()
from /apps/common/UES/pgi/17.10/linux86-64/17.10/lib/libaccgmp.so
#19 0x000000000041f0fd in mo_vdiff_upward_sweep::vdiff_up (
kproma=, kbdim=, klev=,
klevm1=, ktrac=, ksfc_type=,
idx_wtr=, pdtime=, pfrc=…, pcfm_tile=…,
aa=…, pcptgz=…, pum1=…, pvm1=…, ptm1=…, pmair=…, pmdry=…,
pqm1=…, pxlm1=…, pxim1=…, pxtm1=…, pgeom1=…, pztkevn=…,
bb=…, pzthvvar=…, pxvar=…, pz0m_tile=…, pkedisp=…, pute_vdf=…,
pvte_vdf=…, pq_vdf=…, pqte_vdf=…, pxlte_vdf=…, pxite_vdf=…,
pxtte_vdf=…, pz0m=…, pthvvar=…, ptke=…, psh_vdiff=…,
pqv_vdiff=…) at …/…/…/src/mo_vdiff_upward_sweep.f90:141
#20 0x000000000040d629 in ftg_test_vdiff_up ()
==24456== Invalid read of size 4
==24456== at 0x16E4E714: pthread_mutex_lock (in /lib64/libpthread-2.22.so)
==24456== by 0x6F16F87: ??? (in /usr/lib64/libcuda.so.375.74)
==24456== by 0x6FCD470: ??? (in /usr/lib64/libcuda.so.375.74)
==24456== by 0x6FCD5E4: ??? (in /usr/lib64/libcuda.so.375.74)
==24456== by 0x6F1CEB3: ??? (in /usr/lib64/libcuda.so.375.74)
==24456== by 0x6F1E706: ??? (in /usr/lib64/libcuda.so.375.74)
==24456== by 0x6EF1265: ??? (in /usr/lib64/libcuda.so.375.74)
==24456== by 0x6F3E9EC: cuInit (in /usr/lib64/libcuda.so.375.74)
==24456== by 0x6B10DD4: ??? (in /apps/common/UES/pgi/17.10/linux86-64/2017/cuda/8.0/lib64/libcudart.so.8.0.44)
==24456== by 0x6B10E30: ??? (in /apps/common/UES/pgi/17.10/linux86-64/2017/cuda/8.0/lib64/libcudart.so.8.0.44)
==24456== by 0x16E4AC12: __pthread_once_slow (in /lib64/libpthread-2.22.so)
==24456== by 0x6B43918: ??? (in /apps/common/UES/pgi/17.10/linux86-64/2017/cuda/8.0/lib64/libcudart.so.8.0.44)
==24456== Address 0x3038 is not stack'd, malloc'd or (recently) free'd
==24456==
==24456==
==24456== Process terminating with default action of signal 11 (SIGSEGV)
==24456== Access not within mapped region at address 0x3038
==24456== at 0x16E4E714: pthread_mutex_lock (in /lib64/libpthread-2.22.so)
==24456== by 0x6F16F87: ??? (in /usr/lib64/libcuda.so.375.74)
==24456== by 0x6FCD470: ??? (in /usr/lib64/libcuda.so.375.74)
==24456== by 0x6FCD5E4: ??? (in /usr/lib64/libcuda.so.375.74)
==24456== by 0x6F1CEB3: ??? (in /usr/lib64/libcuda.so.375.74)
==24456== by 0x6F1E706: ??? (in /usr/lib64/libcuda.so.375.74)
==24456== by 0x6EF1265: ??? (in /usr/lib64/libcuda.so.375.74)
==24456== by 0x6F3E9EC: cuInit (in /usr/lib64/libcuda.so.375.74)
==24456== by 0x6B10DD4: ??? (in /apps/common/UES/pgi/17.10/linux86-64/2017/cuda/8.0/lib64/libcudart.so.8.0.44)
==24456== by 0x6B10E30: ??? (in /apps/common/UES/pgi/17.10/linux86-64/2017/cuda/8.0/lib64/libcudart.so.8.0.44)
==24456== by 0x16E4AC12: __pthread_once_slow (in /lib64/libpthread-2.22.so)
==24456== by 0x6B43918: ??? (in /apps/common/UES/pgi/17.10/linux86-64/2017/cuda/8.0/lib64/libcudart.so.8.0.44)
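
Both traces end in cuInit, which is reached from __pgi_uacc_initialize via cudaFree, so the crash happens while the OpenACC runtime initializes the device for the first time, before any array is actually transferred. One way to check whether the problem is specific to vdiff_up or to the runtime/driver setup is a standalone test that does nothing but initialize the device and run one small data region. The following is only a minimal sketch under stated assumptions: it is written in C rather than Fortran for brevity, `acc_smoke_test` is a hypothetical helper name, and the explicit `acc_init(acc_device_nvidia)` call is added so that a failure in device setup shows up on its own line rather than inside the first data directive.

```c
#include <stdlib.h>
#ifdef _OPENACC
#include <openacc.h>   /* only available when built with an OpenACC compiler */
#endif

/* Hypothetical helper: doubles n values inside an OpenACC data region.
 * Returns 0 if every element came back correct, 1 otherwise. */
int acc_smoke_test(int n)
{
    double *a = malloc((size_t)n * sizeof *a);
    if (a == NULL)
        return 1;
    for (int i = 0; i < n; ++i)
        a[i] = (double)i;

#ifdef _OPENACC
    /* Explicit initialization: if the crash is in device/driver setup,
     * it should already happen here, independent of any data clause. */
    acc_init(acc_device_nvidia);
#endif

    /* Same kind of directive as the one that fails in the application. */
    #pragma acc data copy(a[0:n])
    {
        #pragma acc parallel loop
        for (int i = 0; i < n; ++i)
            a[i] = 2.0 * a[i];
    }

    /* Verify the result on the host. */
    int bad = 0;
    for (int i = 0; i < n; ++i)
        if (a[i] != 2.0 * (double)i)
            bad = 1;

    free(a);
    return bad;
}
```

Calling `acc_smoke_test(1000)` from a small `main` and building it with the same compiler and flags as the application (for PGI 17.10 presumably something like `pgcc -acc -ta=tesla`) would show whether this also crashes in cuInit. If it does, the cause is more likely in the environment (driver, loaded libcuda, CUDA runtime version) than in the application code. Without OpenACC enabled the pragmas are ignored and the function simply runs on the CPU.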