PGI not vectorizing openmp loops

jwaltz · October 17, 2012, 4:06pm

I have a piece of F90 code that looks something like this (this is representative code only):

    do i = 1, n1
       do j = 1, n2
          r(j,i) = r(j,i) + m(i)*s(j,i)
       enddo
    enddo

The value of n2 is generally 5, and the value of n1 is generally around 10^6. When I compile without OpenMP directives (v11.7), I get a compiler message that looks like this:

55, Generated vector sse code for the loop
Generated 2 prefetch instructions for the loop
Residual loop unrolled 1 times (completely unrolled)

So the compiler is vectorizing the outermost loop. However, when I add OpenMP directives around the outermost loop, I get the following message:

52, Parallel region activated
55, Parallel loop activated with static block schedule
57, Loop not vectorized: loop count too small
Loop unrolled 5 times (completely unrolled)

These messages lead me to believe that the compiler will vectorize OR parallelize the loop, but not both. In other words, it does not parallelize the loop and then vectorize what gets executed on each thread. If this is true, it would mean that I am losing the benefits of vectorization when running under OpenMP.

Can anyone confirm this and, if it is true, suggest a workaround?
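
For reference, a minimal self-contained sketch of the OpenMP variant described above could look like the following. It is not the original code: the program name, the array types, the initial values, and the combined `parallel do` form are all assumptions made for illustration, and the build line in the comment is just one plausible way to get the `-Minfo` messages.

```fortran
! Minimal sketch of the OpenMP variant of the loop (NOT the original code).
! Assumed build line for PGI: pgfortran -mp -fast -Minfo repro.f90
program repro
  implicit none
  integer, parameter :: n1 = 1000000   ! outer extent, "around 10^6"
  integer, parameter :: n2 = 5         ! inner extent
  real(kind=8), allocatable :: r(:,:), s(:,:), m(:)
  integer :: i, j

  allocate(r(n2,n1), s(n2,n1), m(n1))
  r = 1.0d0                            ! placeholder data
  s = 2.0d0
  m = 0.5d0

  ! Iterations of the outer loop are split across threads;
  ! the short inner loop (n2 = 5) runs entirely within one thread.
  !$omp parallel do private(j)
  do i = 1, n1
     do j = 1, n2
        r(j,i) = r(j,i) + m(i)*s(j,i)
     enddo
  enddo
  !$omp end parallel do

  ! With the placeholder data every element should be 1 + 0.5*2 = 2.
  if (any(abs(r - 2.0d0) > 1.0d-12)) then
     print *, 'FAILED'
  else
     print *, 'OK'
  endif

  deallocate(r, s, m)
end program repro
```

Compiling the same file with and without `-mp` should make it possible to compare the vectorization messages for the two cases directly.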

MatColgrove · October 23, 2012, 5:47pm

Hi jwaltz,

We’ve been unable to recreate your issue, so we don’t have an answer for you. Can you please either post a reproducing example or send one to PGI Customer Service (trs@pgroup.com)?

Thanks,
Mat
