Hi,
I have a code that works perfectly fine with PGI and OpenACC when built with the target “-ta=tesla:cc70”.
I am now trying to port it to unified memory (“-ta=tesla:managed”), but I get a segmentation fault inside a PGI runtime function.
This is how I execute the code:
module load pgi/18.7 cuda/9.2
mpirun -n 1 cuda-memcheck P-Gadget3/P-Gadget3 param.par
These are my compiler options:
mpipgic++ -acc -ta=tesla:managed --c++11 -g -O2 -mp -Minfo -Mlarge_arrays ...
This is the output from my “cuda-gdb”:
05:07:26 b8p190ai@p10login1:~/testo/Gennaio2019_pragma/Box4mr/1Node/1MPI/1openMP_1GPU_um_2 cuda-gdb P-Gadget3/P-Gadget3 core.152478
NVIDIA (R) CUDA Debugger
9.2 release
Portions Copyright (C) 2007-2018 NVIDIA Corporation
GNU gdb (GDB) 7.12
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "ppc64le-elf-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from P-Gadget3/P-Gadget3...done.
warning: exec file is newer than core file.
[New LWP 152478]
[New LWP 152506]
[New LWP 152479]
[New LWP 152505]
[New LWP 152480]
warning: Cannot parse .gnu_debugdata section; LZMA support was disabled at compile time
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `P-Gadget3/P-Gadget3 param.par '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00002000011c7da4 in __c_mcopy4 () from /gpfs/gpfs_gl4_16mb/pgi/18.7/linuxpower/18.7/lib/libpgc.so
[Current thread is 1 (Thread 0x200001830000 (LWP 152478))]
(cuda-gdb) info threads
Id Target Id Frame
* 1 Thread 0x200001830000 (LWP 152478) 0x00002000011c7da4 in __c_mcopy4 () from /gpfs/gpfs_gl4_16mb/pgi/18.7/linuxpower/18.7/lib/libpgc.so
2 Thread 0x20001a36f050 (LWP 152506) 0x0000200001467ad8 in poll () from /lib64/libc.so.6
3 Thread 0x200001d0f050 (LWP 152479) 0x0000200001467ad8 in poll () from /lib64/libc.so.6
4 Thread 0x200019f5f050 (LWP 152505) 0x000020000147b31c in accept4 () from /lib64/libc.so.6
5 Thread 0x20000227f050 (LWP 152480) 0x0000200001479398 in epoll_wait () from /lib64/libc.so.6
(cuda-gdb) bt
#0 0x00002000011c7da4 in __c_mcopy4 () from /gpfs/gpfs_gl4_16mb/pgi/18.7/linuxpower/18.7/lib/libpgc.so
#1 0x0000200000cf7314 in __pgi_uacc_fill_buffer () from /gpfs/gpfs_gl4_16mb/pgi/18.7/linuxpower/18.7/lib/libaccg2mp.so
#2 0x0000200000ba9958 in __pgi_uacc_cuda_dataup1 () from /gpfs/gpfs_gl4_16mb/pgi/18.7/linuxpower/18.7/lib/libaccnmp.so
#3 0x0000200000b62cd0 in __pgi_uacc_dataup1 () from /gpfs/gpfs_gl4_16mb/pgi/18.7/linuxpower/18.7/lib/libaccgmp.so
#4 0x0000200000b6351c in __pgi_uacc_dataupx () from /gpfs/gpfs_gl4_16mb/pgi/18.7/linuxpower/18.7/lib/libaccgmp.so
#5 0x0000200000b61310 in __pgi_uacc_dataonb () from /gpfs/gpfs_gl4_16mb/pgi/18.7/linuxpower/18.7/lib/libaccgmp.so
#6 0x0000000010008048 in density () at Hydro/density.c:806
#7 0x000000001003ae68 in init () at CodeBase/init.c:1151
#8 0x000000001002e1d0 in begrun () at CodeBase/begrun.c:274
#9 0x00000000100cb4bc in main (argc=<optimized out>, argv=0x7fffd8ee14c8) at CodeBase/main.c:286
(cuda-gdb) f 6
#6 0x0000000010008048 in density () at Hydro/density.c:806
806 for(i=0;i<NDensityActivePart;i++){
(cuda-gdb) l
801 #ifdef ACC_DENSITY
802 ACC_PRAGMA_DATA( ACC_COPYIN( DensityActiveParticleList[0:NDensityActivePart], LiteDIn[0:NumPart],WalkLiteP[0:NumPart],WalkLiteSPH[0:NumPart] ) ACC_COPY(LiteDOut[0:NDensityActivePart]) if(DoPrimaryOnGPU))
803 { //this parenthesys is contains the density data region. It is closed at the end of the iterations.
804 if(DoPrimaryOnGPU){
805 #pragma acc parallel loop
806 for(i=0;i<NDensityActivePart;i++){
807 int MiniNGBList[ACC_GPU_NGB];
808 int p_target=DensityActiveParticleList[i];
809 int startnode = All.MaxPart;
810 int endnode = startnode;
(cuda-gdb)
As you can see, the “#pragma” statement in line 805 is pretty innocent. Additionally, the code works perfectly well without the “:managed” option in the target flag.
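In case it helps to reproduce this outside of P-Gadget3, the structure around lines 802-806 boils down to roughly the sketch below. It assumes that ACC_PRAGMA_DATA, ACC_COPYIN and ACC_COPY expand to a plain “#pragma acc data” with “copyin” and “copy” clauses. The function name run_region, the array names and the sizes are made up for illustration and are not the ones in the real code.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the density data region. All names and sizes
   here are illustrative. Built without -acc, the pragmas are ignored and
   the loop simply runs on the host.
   Returns 1 on success, 0 on a wrong result, -1 on bad input or
   allocation failure. */
int run_region(int n_active, int n_part, int on_gpu)
{
    int *active;
    double *in, *out;
    int i, ok = 1;

    if (n_active <= 0 || n_part <= 0)
        return -1;

    /* Heap allocations, so that a "managed" build can place them in
       unified memory. */
    active = (int *)malloc((size_t)n_active * sizeof *active);
    in = (double *)malloc((size_t)n_part * sizeof *in);
    out = (double *)malloc((size_t)n_active * sizeof *out);
    if (!active || !in || !out) {
        free(active); free(in); free(out);
        return -1;
    }

    for (i = 0; i < n_part; i++)
        in[i] = (double)i;
    for (i = 0; i < n_active; i++) {
        active[i] = (i * 7) % n_part;   /* always a valid index into in[] */
        out[i] = 0.0;
    }

    /* Same shape as the original: structured data region with an if()
       clause, and the compute construct nested inside it. */
    #pragma acc data copyin(active[0:n_active], in[0:n_part]) copy(out[0:n_active]) if(on_gpu)
    {
        if (on_gpu) {
            #pragma acc parallel loop
            for (i = 0; i < n_active; i++) {
                int p_target = active[i];
                out[i] = 2.0 * in[p_target];
            }
        }
    }

    /* Host-side check of the result when the loop was executed. */
    for (i = 0; i < n_active; i++)
        if (on_gpu && out[i] != 2.0 * in[active[i]])
            ok = 0;

    free(active); free(in); free(out);
    return ok;
}
```

Calling run_region(1000, 4000, 1) from a small main and building it once with “-ta=tesla:cc70” and once with “-ta=tesla:managed” would show whether this bare pattern already fails, or whether something specific to the real arrays is needed to trigger the crash.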
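What puzzles me is that the fault is raised inside the PGI routine “__c_mcopy4”, which the backtrace shows is reached from “__pgi_uacc_dataonb” through “__pgi_uacc_fill_buffer”, so it looks like the runtime is still copying some data to the device even with managed memory?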
How can I investigate further what’s happening and what is causing the error?
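One thing I am considering is to print the host address range of every array section named in the data clauses right before the region, so that the ranges can be compared with the address the copy faults on. A minimal sketch of such a helper follows. The name report_section is hypothetical and is not part of PGI or of the code base.

```c
#include <stdio.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical debugging helper: prints where an array section starts and
   ends on the host. Returns the size of the section in bytes, or 0 if the
   base pointer is NULL. */
size_t report_section(const char *name, const void *base,
                      size_t count, size_t elem_size)
{
    size_t bytes = count * elem_size;
    uintptr_t first = (uintptr_t)base;

    if (base == NULL) {
        fprintf(stderr, "%s: NULL base pointer, count=%zu\n", name, count);
        return 0;
    }

    /* "end" is one past the last byte of the section. */
    fprintf(stderr, "%s: base=0x%jx end=0x%jx count=%zu bytes=%zu\n",
            name, (uintmax_t)first, (uintmax_t)(first + bytes),
            count, bytes);
    return bytes;
}
```

It would be called just before line 802, once per array in the data clauses, for example report_section("LiteDIn", LiteDIn, NumPart, sizeof *LiteDIn) and report_section("LiteDOut", LiteDOut, NDensityActivePart, sizeof *LiteDOut).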
Thanks.