Hi! I have a numerical model written in “do concurrent” form. It runs fine with -stdpar=multicore. When I compile with -stdpar=gpu instead, a “Segmentation fault (core dumped)” occurs. I printed values at several points in the code and finally found that the array DZR(11) triggers the error. DZR is defined as below:
Unfortunately, there’s not enough information here to know what’s going on, so if possible, please provide a minimal reproducing example and I’ll take a look.
Now a seg fault typically occurs on the host and this is a regular DO loop, so my assumption is that this is running on the host.
If this were being executed on the device, then the problem would likely be due to “BD”. Allocated arrays are put into CUDA Managed Memory, so they are accessible on both the host and the device. However, static arrays, like “BD”, still need to be managed via OpenACC data directives. The exception is a system with HMM (like Grace Hopper), in which case Unified Memory makes all memory visible.
Though since this appears to be running on the host, my best guess is that the value of “KBM1” is getting corrupted. The arrays themselves seem to be OK since they print, so the only thing left would be an out-of-bounds access, meaning KBM1’s value could be bad.
Again, a reproducing example would be useful since I can see everything in context and give you a better answer.
I am deeply sorry: I made a terrible mistake when compiling this code. “-stdpar=gpu” was mistyped as “-mp=gpu”. After fixing this, the model immediately gives correct results and runs fast!
In my model code, there are only “do concurrent” loops and not a single OpenMP directive. So that segmentation fault might still be interesting, although it is no longer important.
Before I found my mistake, I was trying to use “-stdpar=gpu -acc=gpu -gpu=nomanaged” and manually control the data movement with “!$ACC ENTER DATA COPYIN()”, “!$ACC UPDATE HOST()”, and “!$ACC UPDATE DEVICE()”. This version does not give correct results yet because the data movement is complex. Fortunately, it was while compiling this version that I found my “-mp=gpu” mistake.
No worries! If a segv happens when compiling with -mp, then one possible cause is a stack overflow. One side effect of enabling OpenMP is that automatic arrays are allocated on the stack (i.e. the “-Mstack_arrays” behavior), which can increase the needed stack size.
I personally always set my environment’s stack size to unlimited to avoid these stack overflows, but you can also set the environment variable OMP_STACKSIZE to a large value.
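For reference, the two stack settings mentioned above could look roughly like this in a bash-like shell on Linux. This is only a sketch: the 512M value is an illustrative assumption, not a recommendation, and the right size depends on how large the model's automatic arrays are. Note that the shell limit governs the main thread, while OMP_STACKSIZE applies to threads created by the OpenMP runtime.

```shell
# Raise the stack limit for this shell session (affects the main thread).
# If "unlimited" is not permitted, fall back to the hard limit.
ulimit -s unlimited 2>/dev/null || ulimit -s "$(ulimit -H -s)"

# Stack size for threads created by the OpenMP runtime.
# 512M is an assumed, illustrative value; accepted unit suffixes are B, K, M, G.
export OMP_STACKSIZE=512M

# Show the settings in effect before launching the model from this same shell.
echo "stack limit: $(ulimit -s)"
echo "OMP_STACKSIZE: $OMP_STACKSIZE"
```

Both settings only apply to programs started from the same shell session afterwards, so they would need to go into the job script or shell startup file to persist.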