A few operations on the same variable

Hello,

I have a piece of test code like this:

      do i=1,10
        do j=1,10
          vvect(i,j)=i
        enddo
      enddo

!$acc region do local(ijk)
      do ijk=1,10
        vvect(ijk,2)=vvect(ijk,1)+2
        vvect(ijk,2)=vvect(ijk,2)+2
      enddo
!$acc end region

      do ijk=1,10
        write(*,*) vvect(ijk,2)
      enddo

And I don’t know why the result is:

3.0
4.0
5.0
6.0
7.0
8.0
9.0
10.0
11.0
12.0

instead of:

5.0
6.0
7.0

Further, when I add a copy() clause:

!$acc region do local(ijk) copy(vvect)

The results are:

1.0
2.0
3.0

So the values are unchanged…

What’s wrong?

An additional problem: I have a loop like this:

!$acc region do
      do ijk=imoj4,imoj5
        uhalfp=-dudp(ijk)*(vvect(ipjk,igfy)-vvect(ijk,igfy))
        vvect(ijk,igfyp1)=1.0
      enddo
!$acc end region

After executing this piece of code, all values in vvect(:,igfyp1) are zero. Why is that? If I comment out the line that assigns uhalfp, all values in vvect(:,igfyp1) are correctly 1.0.

Hi szczelba,

For your first post, I wrote the following example:

      program testme

      implicit none
      integer :: i,j,ijk
      real, dimension(10,10) :: vvect

      do i=1,10
        do j=1,10
          vvect(i,j)=i
        enddo
      enddo

!$acc region copy(vvect)
      do ijk=1,10
        vvect(ijk,2)=vvect(ijk,1)+2
        vvect(ijk,2)=vvect(ijk,2)+2
      enddo
!$acc end region
      do ijk=1,10
        write(*,*) vvect(ijk,2)
      enddo

      end program testme

I was not able to reproduce the initial error (i.e. without the copy(vvect) clause), but I did see the second error in version 10.8. Version 10.9 does seem to give the correct answers.

% pgf90 test.f90 -ta=nvidia -Minfo=accel -V10.8; a.out
testme:
     14, Generating copy(vvect(:,:))
         Generating compute capability 1.0 binary
         Generating compute capability 1.3 binary
     15, Loop is parallelizable
         Accelerator kernel generated
         15, !$acc do parallel, vector(10)
             Cached references to size [10x2] block of 'vvect'
             CC 1.0 : 3 registers; 104 shared, 20 constant, 0 local memory bytes; 33 occupancy
             CC 1.3 : 3 registers; 104 shared, 20 constant, 0 local memory bytes; 25 occupancy
    1.000000    
    2.000000    
    3.000000    
    4.000000    
    5.000000    
    6.000000    
    7.000000    
    8.000000    
    9.000000    
    10.00000    
% pgf90 test.f90 -ta=nvidia -Minfo=accel -V10.9 ; a.out
testme:
     14, Generating copy(vvect(:,:))
         Generating compute capability 1.0 binary
         Generating compute capability 1.3 binary
     15, Loop is parallelizable
         Accelerator kernel generated
         15, !$acc do parallel, vector(10)
             Cached references to size [10x2] block of 'vvect'
             CC 1.0 : 5 registers; 104 shared, 20 constant, 0 local memory bytes; 33 occupancy
             CC 1.3 : 5 registers; 104 shared, 20 constant, 0 local memory bytes; 25 occupancy
    5.000000    
    6.000000    
    7.000000    
    8.000000    
    9.000000    
    10.00000    
    11.00000    
    12.00000    
    13.00000    
    14.00000
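
To make a wrong answer like the 10.8 one easier to spot, the host can compute the expected values first and compare them with what comes back from the region. The following is only a sketch based on the example above; the program name checkme and the helper variables expected and nbad are made up for illustration. A compiler without accelerator support treats the !$acc lines as comments, so the program should then simply print PASS.

```fortran
program checkme
  implicit none
  integer :: i, j, ijk, nbad          ! nbad: hypothetical mismatch counter
  real, dimension(10,10) :: vvect
  real, dimension(10)    :: expected  ! hypothetical host-side reference

  ! Same initialization as in testme: every element of row i holds i.
  do i = 1, 10
    do j = 1, 10
      vvect(i,j) = i
    enddo
  enddo

  ! Host reference, computed before the region modifies vvect.
  do ijk = 1, 10
    expected(ijk) = vvect(ijk,1) + 2 + 2
  enddo

!$acc region copy(vvect)
  do ijk = 1, 10
    vvect(ijk,2) = vvect(ijk,1) + 2
    vvect(ijk,2) = vvect(ijk,2) + 2
  enddo
!$acc end region

  ! The values are small whole numbers, so an exact comparison is safe here.
  nbad = 0
  do ijk = 1, 10
    if (vvect(ijk,2) /= expected(ijk)) nbad = nbad + 1
  enddo

  if (nbad == 0) then
    write(*,*) 'PASS'
  else
    write(*,*) 'FAIL: ', nbad, ' mismatches'
  endif
end program checkme
```

Building it once with -V10.8 and once with -V10.9, as in the commands above, should show directly which version returns the wrong column.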

For your second post, would you mind writing up a small reproducing example?
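
Something along these lines would probably be enough. It is only a sketch of what a reproducer might look like: the array size n, the loop bounds, the column indices, the initial values, and the assumption that ipjk is a loop-invariant index are all placeholders, not taken from the real code, so they would need to be replaced with the actual values.

```fortran
program repro
  implicit none
  ! All sizes and index values below are placeholders for illustration.
  integer, parameter :: n = 100
  integer, parameter :: igfy = 1, igfyp1 = 2
  integer :: ijk, ipjk, imoj4, imoj5
  real :: uhalfp
  real, dimension(n,2) :: vvect
  real, dimension(n)   :: dudp

  imoj4 = 1
  imoj5 = n - 1
  ipjk  = n            ! assumed to be a loop-invariant neighbor index
  vvect = 0.0
  dudp  = 1.0
  do ijk = 1, n
    vvect(ijk,igfy) = real(ijk)
  enddo

  ! Same loop body and directives as in the second question.
!$acc region do
  do ijk = imoj4, imoj5
    uhalfp = -dudp(ijk)*(vvect(ipjk,igfy) - vvect(ijk,igfy))
    vvect(ijk,igfyp1) = 1.0
  enddo
!$acc end region

  ! Both values should be 1.0 if the loop ran correctly.
  write(*,*) 'min/max: ', minval(vvect(imoj4:imoj5,igfyp1)), &
                          maxval(vvect(imoj4:imoj5,igfyp1))
end program repro
```

If this still prints zeros when built with -ta=nvidia -Minfo=accel, the -Minfo output for the loop would be useful to post as well.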

Thanks,
Mat

I’ve upgraded to PGI 10.9 and it really did solve the first problem. Thanks, Mat. It is strange, though, that it didn’t work in 10.8.

As for my second post, the case is really weird. When I tried to run it again, the results were OK. Sometimes I have the feeling that some changes to the code have no effect on the result. For example, if I first run the code with the loop as in the second post, I get only zeros, and after commenting out the first line in the loop I still get zeros. But on another day, I first ran the code without this line and got 1.0s, and the result stayed correct even after I added the line back… To me it looks a bit nondeterministic.
I’ll try to investigate and report my results.

Regards