Hi,
The code below, run on a GTX480 with CC30, copies values unpredictably from copiedInArray (copied in from host memory) to a local temporary array that exists only on the GPU.
The arrays are real*4 and have the same dimensions: x=90, y=90, z=1500 (the z dimension is probably what matters here).
!$acc region
do k=2,z
  do j=1,y
    do i=1,x
      localGPUArray(i,j,k) = copiedInArray(i,j,k)
    end do
  end do
end do
!$acc end region
It appears that the compiler divides the work (90x90x1499 iterations) among the GPU's compute units in an odd manner.
A quick fix, which makes the values at matching indices in both arrays agree, was to make any one of these loops sequential. However, neither the compiler nor the profiler gave any hint that, without the !$acc do seq, these calculations might not work as intended.
!$acc region
do k=2,z
  do j=1,y
    !$acc do seq
    do i=1,x
      localGPUArray(i,j,k) = copiedInArray(i,j,k)
    end do
  end do
end do
!$acc end region
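To help with reproducing this, here is a minimal standalone sketch that checks the copy on the host. It makes some assumptions that are not in my original code: the destination array (called gpuCopy here, a hypothetical name) is copied back with a copy clause instead of being GPU-local so that the host can inspect it, and the source is filled with a value unique to each index. Moving the array back to the host may change how the compiler schedules the loops, so this is only an approximation of the failing case.

```fortran
program copy_check
  implicit none
  ! Dimensions taken from the post
  integer, parameter :: x = 90, y = 90, z = 1500
  real(4), allocatable :: copiedInArray(:,:,:)
  real(4), allocatable :: gpuCopy(:,:,:)   ! hypothetical stand-in for the local GPU array
  integer :: i, j, k, mismatches

  allocate(copiedInArray(x,y,z), gpuCopy(x,y,z))

  ! Fill the source with a value unique to each index.
  ! The largest value is 12,150,000, which real*4 represents exactly.
  do k = 1, z
    do j = 1, y
      do i = 1, x
        copiedInArray(i,j,k) = real(i + x*((j-1) + y*(k-1)), 4)
      end do
    end do
  end do
  gpuCopy = -1.0

  ! Assumption: explicit data clauses; gpuCopy is copied in and out
  ! so the result can be compared on the host.
  !$acc region copyin(copiedInArray) copy(gpuCopy)
  do k = 2, z
    do j = 1, y
      do i = 1, x
        gpuCopy(i,j,k) = copiedInArray(i,j,k)
      end do
    end do
  end do
  !$acc end region

  ! Count elements that differ from the source on the planes that were written
  mismatches = 0
  do k = 2, z
    do j = 1, y
      do i = 1, x
        if (gpuCopy(i,j,k) /= copiedInArray(i,j,k)) mismatches = mismatches + 1
      end do
    end do
  end do
  print *, 'mismatches:', mismatches
end program copy_check
```

Compiled without accelerator support, the directives are treated as comments and the count should be zero, which gives a baseline to compare the GPU build against.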
If you know a better way to fill a local GPU array with host-uploaded data, please let me know. I hope you will be able to reproduce this problem and address it with a fix :)
Regards,
Nicolas Dobski