Hi,

I’m trying to optimize the following code:

```
!$OMP TARGET TEAMS LOOP BIND(TEAMS)
do e=1,nelt
!$OMP LOOP COLLAPSE(3) BIND(PARALLEL) PRIVATE(tmpu3,l)
   do k=1,lz1
      do j=1,ly1
         do i=1,lx1
            tmpu3 = 0.0
            do l=1,lx1
               tmpu3 = tmpu3 + dxm1(k,l)*u(i,j,l,e)
            enddo

            wr = g5m1(i,j,k,e)*tmpu3
            ws = g6m1(i,j,k,e)*tmpu3
            wt = g3m1(i,j,k,e)*tmpu3

            dudr(i,j,k,e) = (dudr(i,j,k,e) + wr) * helm1(i,j,k,e)
            duds(i,j,k,e) = (duds(i,j,k,e) + ws) * helm1(i,j,k,e)
            dudt(i,j,k,e) = (dudt(i,j,k,e) + wt) * helm1(i,j,k,e)

         enddo
      enddo
   enddo
enddo
```

The initialization of tmpu3 prevents collapsing all 4 loops together, so my idea is the following:

```
tmpu3 = 0.0

!$OMP TARGET TEAMS LOOP BIND(TEAMS)
do e=1,nelt
!$OMP LOOP COLLAPSE(4) BIND(PARALLEL) FIRSTPRIVATE(tmpu3,l)
   do k=1,lz1
      do j=1,ly1
         do i=1,lx1
            do l=1,lx1
               tmpu3 = tmpu3 + dxm1(k,l)*u(i,j,l,e)
            enddo

            wr = g5m1(i,j,k,e)*tmpu3
            ws = g6m1(i,j,k,e)*tmpu3
            wt = g3m1(i,j,k,e)*tmpu3

            dudr(i,j,k,e) = (dudr(i,j,k,e) + wr) * helm1(i,j,k,e)
            duds(i,j,k,e) = (duds(i,j,k,e) + ws) * helm1(i,j,k,e)
            dudt(i,j,k,e) = (dudt(i,j,k,e) + wt) * helm1(i,j,k,e)

            tmpu3 = 0.0
         enddo
      enddo
   enddo
enddo
```

This fails with the following error:

NVFORTRAN-S-0533-Clause ‘FIRSTPRIVATE’ not allowed in OMP LOOP

Why is FIRSTPRIVATE not allowed? Is there another way to collapse all 4 loops together? Thanks.

Are you sure you want to collapse the 4 loops together? What is lx1 typically? Unless it is very large (> 64?), I would think you want each thread to run the “do l” loop sequentially. You will get better access to the u array.
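A minimal sketch of that suggestion, under stated assumptions: the four outer loops (e, k, j, i) are collapsed into one parallel iteration space, and each thread runs the “do l” loop sequentially with tmpu3 initialized inside the loop body, so PRIVATE is enough and FIRSTPRIVATE is not needed. The program name, the small array sizes, the random test data, the MAP clauses and the serial reference check are all illustrative additions, not part of the original code. The wr, ws and wt temporaries are folded into the updates so that only tmpu3 and l have to be privatized.

```fortran
program collapse_sketch
  implicit none
  ! Illustrative sizes (assumptions; nelt is kept small so the check runs quickly).
  integer, parameter :: lx1 = 8, ly1 = 8, lz1 = 8, nelt = 4
  real :: dxm1(lz1,lx1)
  real, dimension(lx1,ly1,lz1,nelt) :: u, g3m1, g5m1, g6m1, helm1
  real, dimension(lx1,ly1,lz1,nelt) :: dudr, duds, dudt, ref_r, ref_s, ref_t
  real :: tmpu3, err
  integer :: i, j, k, l, e

  ! Arbitrary test data (hypothetical; the real arrays come from the application).
  call random_number(dxm1)
  call random_number(u)
  call random_number(g3m1)
  call random_number(g5m1)
  call random_number(g6m1)
  call random_number(helm1)
  call random_number(dudr)
  call random_number(duds)
  call random_number(dudt)
  ref_r = dudr
  ref_s = duds
  ref_t = dudt

  ! Serial reference, same arithmetic as the original loop nest.
  do e = 1, nelt
    do k = 1, lz1
      do j = 1, ly1
        do i = 1, lx1
          tmpu3 = 0.0
          do l = 1, lx1
            tmpu3 = tmpu3 + dxm1(k,l)*u(i,j,l,e)
          end do
          ref_r(i,j,k,e) = (ref_r(i,j,k,e) + g5m1(i,j,k,e)*tmpu3) * helm1(i,j,k,e)
          ref_s(i,j,k,e) = (ref_s(i,j,k,e) + g6m1(i,j,k,e)*tmpu3) * helm1(i,j,k,e)
          ref_t(i,j,k,e) = (ref_t(i,j,k,e) + g3m1(i,j,k,e)*tmpu3) * helm1(i,j,k,e)
        end do
      end do
    end do
  end do

  ! Collapse the four perfectly nested outer loops; the l loop stays
  ! sequential inside each collapsed iteration.
  !$OMP TARGET TEAMS LOOP COLLAPSE(4) PRIVATE(tmpu3, l) &
  !$OMP & MAP(TO: dxm1, u, g3m1, g5m1, g6m1, helm1) MAP(TOFROM: dudr, duds, dudt)
  do e = 1, nelt
    do k = 1, lz1
      do j = 1, ly1
        do i = 1, lx1
          tmpu3 = 0.0
          do l = 1, lx1
            tmpu3 = tmpu3 + dxm1(k,l)*u(i,j,l,e)
          end do
          dudr(i,j,k,e) = (dudr(i,j,k,e) + g5m1(i,j,k,e)*tmpu3) * helm1(i,j,k,e)
          duds(i,j,k,e) = (duds(i,j,k,e) + g6m1(i,j,k,e)*tmpu3) * helm1(i,j,k,e)
          dudt(i,j,k,e) = (dudt(i,j,k,e) + g3m1(i,j,k,e)*tmpu3) * helm1(i,j,k,e)
        end do
      end do
    end do
  end do

  ! Compare against the serial reference with a loose single-precision tolerance.
  err = max(maxval(abs(dudr - ref_r)), maxval(abs(duds - ref_s)), &
            maxval(abs(dudt - ref_t)))
  if (err > 1.0e-4) error stop 'mismatch against serial reference'
  print *, 'max abs difference vs. serial reference:', err
end program collapse_sketch
```

With nvfortran, offload would typically be enabled with `-mp=gpu`; if OpenMP is not enabled, the directives are treated as comments and the program simply runs serially. Because i is the fastest-varying index of u, neighbouring values of i read neighbouring elements of u for each l, which is presumably the access pattern meant above; how iterations are actually mapped to threads is up to the compiler.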

Hi, these are the loop dimensions:

nelt: 9120
lz1: 8
ly1: 8
lx1: 8

And whether or not it is worthwhile, why is FIRSTPRIVATE not allowed?

I’ll have to dig into it. I am confused about what your intended behavior is, and maybe the compiler is confused as well. If you collapse all 4 loops, do you want to do a reduction on tmpu3? But there are only 3 "end do"s. What does firstprivate in such a structure even mean?