Nested parallelism in OpenMP

Hi All,

I am trying to implement nested parallelism in my Fortran 95 code, but the logical value returned by omp_get_nested is always False even though I call omp_set_nested(.true.) at the beginning of the code. Does this mean that the PGI Fortran compiler I have (version linux86-64 11.5) is too old to support nested parallelism? Or is it because the OpenMP implementation I have (1.4.3) is too old to support it?

Thanks,

Guangyu

Hi Guangyu,

We added support for nested OpenMP parallelism in the PGI 2012 compilers, so your version is a bit too old.

Also, nested isn’t enabled by default. To enable, set the environment variable “OMP_NESTED=1”.

Hope this helps,
Mat

Hi Mat,

Sorry for taking so long to reply. I tried the 2014 PGI compiler and got some confusing results. The new compiler does appear to support nested parallelism, which can be turned on and off by setting the environment variable you mentioned or by calling the OpenMP runtime function OMP_set_nested(.true.). However, when I ran a couple of test programs to see whether nesting actually works, the nested regions did not run in parallel, even though I had explicitly turned nesting on. I read in a document that nested parallelism is also implementation dependent: an OpenMP implementation is allowed to serialize nested parallel regions even when nested parallelism is enabled. I am therefore wondering whether the OpenMP version I have (1.4.3) does not support nested parallelism.

Thanks,

Guangyu

Hi Guangyu,

Apologies, I forgot to mention that you also need to set the environment variable “OMP_MAX_ACTIVE_LEVELS=<num_levels>” (or call “omp_set_max_active_levels”) to the number of nested levels you want.
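One way to check whether an inner region really runs in parallel, rather than being serialized, is to compare omp_get_level with omp_get_active_level inside it. The following is only a minimal sketch: the program name check_nesting is made up for illustration, and it assumes the compiler provides the OpenMP 3.0 runtime routines used here.

```fortran
program check_nesting
  use omp_lib
  implicit none

  ! Request nesting and allow two active levels, as described above.
  call omp_set_nested(.true.)
  call omp_set_max_active_levels(2)

  !$omp parallel num_threads(2)
  !$omp parallel num_threads(4)
  ! omp_get_level counts every enclosing parallel region;
  ! omp_get_active_level counts only those executed by more than one thread.
  !$omp critical
  print *, 'level =', omp_get_level(), &
           ' active level =', omp_get_active_level(), &
           ' inner team size =', omp_get_team_size(2)
  !$omp end critical
  !$omp end parallel
  !$omp end parallel
end program check_nesting
```

If the inner region is serialized, the active level should stay below the level and the inner team size should be 1. The runtime may also grant fewer threads than requested, so the exact numbers can vary.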

- Mat

Thanks a lot for the prompt reply, Mat. I just tried it, but nesting still does not seem to work. Below is the simple test code I am using. Note that there are two levels of OMP PARALLEL: I assigned two threads to the first level and four to the second. If nesting were working, I think the code should display 2 “Hello” and 8 “Hi” lines on the screen. However, I got 2 “Hello” and 2 “Hi” instead, which suggests the nested parallel regions are being serialized for some reason. I am hoping this is caused by an error in my code rather than a software or hardware limitation. Maybe you can spot the error?

program gxu_test1
  use omp_lib
  call OMP_set_num_threads(12)
  call OMP_set_nested(.true.)
  call OMP_set_max_active_levels(2)
  print *, omp_get_max_active_levels()
  print *, OMP_get_nested()
  !$OMP PARALLEL NUM_THREADS(2)
  print *, "Hello", OMP_get_thread_num()
  !$OMP PARALLEL NUM_THREADS(4)
  print *, "Hi ", OMP_get_thread_num()
  !$OMP END PARALLEL
  !$OMP END PARALLEL
end program gxu_test1

Hi xupeng66,

Ok, I see the difference now. We support nested parallelism when the parallel directives are in separate routines. For example:

% cat testnest.f90
subroutine foo ()
  use omp_lib
  !$OMP PARALLEL NUM_THREADS(4)
  print *, "Hi from foo ", OMP_get_thread_num()
  !$OMP END PARALLEL
end subroutine foo

program gxu_test1
  use omp_lib
  call OMP_set_num_threads(12)
  call OMP_set_nested(.true.)
  call OMP_set_max_active_levels(2)
  print *, omp_get_max_active_levels()
  print *, OMP_get_nested()
  !$OMP PARALLEL NUM_THREADS(2)
  print *, "Hello", OMP_get_thread_num()
  call foo()
  !$OMP PARALLEL NUM_THREADS(4)
  print *, "Hi ", OMP_get_thread_num()
  !$OMP END PARALLEL
  !$OMP END PARALLEL
end program gxu_test1
 
% pgfortran -mp testnest.f90; a.out
            2
  T
Hello            0
Hello            1
Hi from foo             0
Hi from foo             1
Hi from foo             2
Hi from foo             3
Hi from foo             0
Hi from foo             3
Hi             0
Hi from foo             2
Hi from foo             1
Hi             0
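The inner thread numbers 0 to 3 appear twice in this output because each outer thread creates its own inner team. To tell the two teams apart, each line can be tagged with the outer thread it descends from. This is only a sketch: foo_tagged and tagged_test are hypothetical names, and it assumes omp_get_ancestor_thread_num (OpenMP 3.0) is available.

```fortran
subroutine foo_tagged()
  use omp_lib
  implicit none
  !$omp parallel num_threads(4)
  ! omp_get_ancestor_thread_num(1) is the number of the level-1 (outer)
  ! thread that this inner thread descends from.
  print *, 'Hi from foo, outer thread', omp_get_ancestor_thread_num(1), &
           ' inner thread', omp_get_thread_num()
  !$omp end parallel
end subroutine foo_tagged

program tagged_test
  use omp_lib
  implicit none
  call omp_set_nested(.true.)
  call omp_set_max_active_levels(2)
  !$omp parallel num_threads(2)
  ! The inner parallel region lives in a separate routine, as in the
  ! example above.
  call foo_tagged()
  !$omp end parallel
end program tagged_test
```

The order of the printed lines is not deterministic, so the tags are what identify each team.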
- Mat

Thanks a lot, Mat. It is finally working! This is great. I will implement nesting in the model I am running and see how much faster it gets. Ha, so excited!