OpenMP/pthreads stack size?

Hi,

Is there any way to set the stack size for programs compiled with OpenMP?

I’ve got a global model that’s being compiled with pgf90 which I need to run faster. The code supports OpenMP and not MPI. It runs fine using a single CPU without OpenMP even with a stacksize limit of 10240 kbytes in the shell.

When I compile with OpenMP, it aborts with a segmentation violation due to stack overflows in subroutines that declare local arrays which are too large for the stack. Using -Mnorecursive (so local arrays aren’t put on the stack) makes no difference, nor does changing the shell’s stacksize limit to unlimited.

The code runs fine with OpenMP on other platforms.

I understand that the PGI runtime uses Linux pthreads when implementing OpenMP. Is there a way to communicate to the runtime the stack size to set in pthreads for each thread?

One obvious solution is to change the declaration of the large local arrays to allocatable, and indeed the conversion of a single subroutine eliminated its stack overflow. But a colleague familiar with the code said it would be easier to switch to another compiler than to change the declarations of the local arrays in all the subroutines.

(This code is being linked with -Bstatic to deal with the Linux memory issues.)

System details:
pgf90 6.1-6 32-bit target on x86 Linux
Red Hat Enterprise Linux AS release 3 (Taroon Update 8)

/usr/local/pgi/linux86/6.1/lib/libpgthread.a -> /usr/lib/libpthread.a
/usr/local/pgi/linux86/6.1/lib/libpgthread.so -> /lib/libpthread.so.0

Output from ‘locate libpthread’:
/usr/lib/libpthread_p.a
/usr/lib/nptl/libpthread.a
/usr/lib/nptl/libpthread_nonshared.a
/usr/lib/nptl/libpthread.so
/usr/lib/libpthread_nonshared.a
/usr/lib/libpthread.a
/usr/lib/libpthread.so
/usr/lib/i386-redhat-linux7/lib/libpthread.a
/usr/lib/i386-redhat-linux7/lib/libpthread.so
/lib/i686/libpthread-0.10.so
/lib/i686/libpthread.so.0
/lib/tls/libpthread-0.60.so
/lib/tls/libpthread.so.0
/lib/libpthread-0.10.so
/lib/libpthread.so.0

Update:

I’ve tried setting MPSTKZ to several values, to no avail.

I see exactly the same behavior on another system:
pgf90 5.2-4
SuSE Linux 9.1 (x86-64) (Opteron)

Hi Tod,

‘unlimited’ and MPSTKZ have an OS-dependent limit, and you are most likely reaching this maximum (typically 8MB). On 64-bit systems, you can override this maximum by explicitly setting the stacksize to a large value.

My recommendation for your 32-bit system would be to add the SAVE attribute to your shared arrays where it is safe to do so. This will have the effect of placing the array in the program’s BSS section instead of on the stack.

As a side note, with the 7.0 release we have added the proposed OpenMP 3.0 environment variable OMP_STACK_SIZE, which allows you to explicitly set the OpenMP stack size. However, you are still limited by the OS’s maximum stack size.

Hope this helps,
Mat