I am working on parallelizing a scientific simulation package with a rather unusual program setup. It is being run in 32-bit mode on a quad Opteron machine with 32GB RAM, running SuSE EL9 with the PGI v6.1 compilers. A Java-based user interface allows the user to select various types of simulation parameters, launch a series of calculations, and then view the results in various graphics formats. For each series of calculations, several different shared object libraries are loaded by Java as needed to perform the nitty-gritty simulation number crunching. Each of these shared object libraries is a mix of F77, F90 and C, with a few C++ wrapper routines for the Java interfacing.
I have successfully parallelized these shared object libraries in a standalone version (without the Java and C++).
The link statement for each of the shared object libraries can be condensed to this:
pgCC -shared -fPIC -L/usr/pgi/linux86/6.1/lib <long_list_of_-mp_-fPIC_compiled_object_files> -mp -lpgftnrtl -lm -lpgc -lgcc -lc -pgf90libs -o libSimDLL.so
Now when I try to run the calculations via the Java front end, with NCPUS and OMP_NUM_THREADS both set to 4, I get the
“Warning: OMP_NUM_THREADS or NCPUS (4) greater than available cpus (1)”
message. The code continues to execute, but only in single-threaded mode. A call within the code to ‘OMP_get_num_procs’ returns ‘1’.
When NCPUS and OMP_NUM_THREADS are both set to 1, the same code runs normally (no warning message), and the call to ‘OMP_get_num_procs’ returns ‘4’.
I have read elsewhere in this forum that I can avoid the warning by dropping the ‘-lpgc’ from the link statement. When I do so, running with OMP_NUM_THREADS and NCPUS=4 results in a crash, with the message
“**ERROR: in a parallel region there is a stack overflow: thread 0, max 8180KB, used 0KB, request 176B”
, displayed a total of three times. Although the stack does not appear to have actually overflowed, I changed the stack limit to ‘unlimited’, but this did not prevent the crash. Neither did setting MPSTKZ to 8M.
Is there a recommended ordering of the libraries in the link statement? Could there be something in the Java side that could be affecting this?
My current pseudo-workaround is to bootstrap a standalone version that runs in parallel mode, but this cuts off communication with the Java user interface.
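One way to narrow down whether the Java side is involved might be to check how many CPUs the process itself is allowed to use, independently of the OpenMP runtime. The following is only a minimal sketch, assuming Linux with a glibc that provides sched_getaffinity and the CPU_ISSET macros. The helper names (online_cpu_count, affinity_cpu_count, report_cpu_counts) are hypothetical and are not part of the PGI runtime.

```c
#define _GNU_SOURCE   /* needed for sched_getaffinity and cpu_set_t in glibc */
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

/* Number of CPUs the kernel reports as online, or -1 on failure. */
long online_cpu_count(void)
{
    return sysconf(_SC_NPROCESSORS_ONLN);
}

/* Number of CPUs in the calling thread's affinity mask, or -1 on failure. */
int affinity_cpu_count(void)
{
    cpu_set_t mask;
    int cpu, count = 0;

    CPU_ZERO(&mask);
    /* pid 0 means "the calling thread" */
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        return -1;

    /* Count the set bits by hand so this also builds on older glibc
       versions that lack the CPU_COUNT macro. */
    for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &mask))
            count++;
    }
    return count;
}

/* Print both counts to stderr, tagged with a caller-supplied label
   (hypothetical: e.g. "standalone" or "from Java"). */
void report_cpu_counts(const char *where)
{
    assert(where != NULL);
    fprintf(stderr, "[%s] online CPUs: %ld, CPUs in affinity mask: %d\n",
            where, online_cpu_count(), affinity_cpu_count());
}
```

Calling report_cpu_counts from one of the C++ wrapper routines when running under Java, and again from the standalone version, would show whether the two environments differ. If the affinity count is already 1 under Java, the restriction is presumably in place before the OpenMP runtime starts. If both runs report 4, the link order or runtime library initialization seems the more likely place to look.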