Complex OpenMP Failure

Gaetan · April 14, 2022, 4:38pm

Another bug to report. This one is actually relatively complex, and other compilers have had issues with this code as well, so it isn’t a surprise that there is an issue with nvfortran.

Basically, the code is like a parallel STL vector class for OpenMP: each thread appends to a thread-local array, and at the end of the parallel region the thread-local arrays are combined into a single output array.

Specifically, what appears to be happening is that the “t%sizes” array is erroneously deallocated before it should be, leading to a segfault at the line localSize = t%sizes(1, tid+1) - t%sizes(1, tid). Both ifort and gfortran run this code correctly. I was able to work around the issue by moving the allocate(outArr) directly into the OpenMP region and splitting the finalize command into two separate parts.
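
For readers without the attachments, the pattern described above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the code from codeMod.F90: the module name tl_append_sketch, the routine collect_even, and the 1-D offsets array (standing in for the role of “t%sizes”) are all hypothetical, and the sketch is not intended to reproduce the failure.

```fortran
! Hypothetical sketch of a thread-local append followed by a combine step.
! All names here are illustrative; this is not the code from codeMod.F90.
module tl_append_sketch
  !$ use omp_lib
  implicit none
contains
  subroutine collect_even(n, outArr)
    integer, intent(in) :: n
    integer, allocatable, intent(out) :: outArr(:)
    integer, allocatable :: localBuf(:), offsets(:)
    integer :: i, tid, nMax, localSize

    ! One slot per possible thread, plus a leading zero for the prefix sum.
    nMax = 1
    !$ nMax = omp_get_max_threads()
    allocate(offsets(nMax + 1))
    offsets = 0

    !$omp parallel default(shared) private(localBuf, i, tid, localSize)
    tid = 1
    !$ tid = omp_get_thread_num() + 1   ! 1-based thread index
    allocate(localBuf(0))

    ! Each thread appends only to its own private buffer.
    !$omp do schedule(static)
    do i = 1, n
      if (mod(i, 2) == 0) localBuf = [localBuf, i]
    end do
    !$omp end do

    ! Record this thread's count, then wait until every count is written.
    offsets(tid + 1) = size(localBuf)
    !$omp barrier

    ! One thread turns the counts into offsets and allocates the output.
    !$omp single
    do i = 1, nMax
      offsets(i + 1) = offsets(i + 1) + offsets(i)
    end do
    allocate(outArr(offsets(nMax + 1)))
    !$omp end single

    ! After the implicit barrier, each thread copies into its own slice.
    localSize = offsets(tid + 1) - offsets(tid)
    outArr(offsets(tid) + 1 : offsets(tid) + localSize) = localBuf
    deallocate(localBuf)
    !$omp end parallel
  end subroutine collect_even
end module tl_append_sketch
```

Built with OpenMP enabled (for example nvfortran -mp or gfortran -fopenmp), a call such as call collect_even(20, outArr) should leave the ten even numbers up to 20 in outArr. Growing localBuf by array concatenation only keeps the sketch short; a real container would more likely grow its buffer in chunks.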

Thanks
Gaetan

codeMod.F90 (4.3 KB)
main.F90 (75 Bytes)
Makefile (241 Bytes)

MatColgrove · April 15, 2022, 6:58pm

Hi Gaetan,

I don’t think the problem is due to a deallocation of “t%sizes” but rather to some issue with the allocation of “outArr”.

Here are some modifications I made to the code (see attached). I added a print statement that includes accesses to “t%sizes” before and after the allocation. At “-O0” the code segvs in the first access of “t%sizes”. However, at -O2 the segv is delayed until the first access of outArr. This is an indication that the stack may be getting corrupted, but I’ll need engineering to investigate. Filed as TPR #31662.

Like you, I found hoisting the allocation of outArr outside of the subroutine works around the error.

codeMod.F90 (4.6 KB)

% nvfortran -mp codeMod.F90 main.F90 -O0 ; a.out
codeMod.F90:
main.F90:
 t1:                         0                  1000000
Segmentation fault
% nvfortran -mp codeMod.F90 main.F90 -O1 ; a.out
codeMod.F90:
main.F90:
 t1:                         0                  1000000
 t2:                         0
 t3:                   1000000
Segmentation fault
% nvfortran -mp codeMod.F90 main.F90 -O1 -DWORKS; a.out
codeMod.F90:
main.F90:
 size of outArr:      1000000

-Mat