I recently spent a few hours trying to figure out why my GPU code was running 20% slower than an equivalent older version. I eventually tracked it down to 2 new arrays which I had mistakenly not included in the data copy() statement. So could you remind me how to check whether any arrays are missing from data copy()? Is there a compiler option that does this for me?
On the compute regions themselves, you can add “default(present)”. This sets the default to check whether arrays are already present on the device (i.e. in your outer data region). If an array is not present, you get an error at runtime.
There’s also “default(none)”: if an array used in the compute region doesn’t appear in a structured data region or in a data clause on the compute region, you get a compile-time error.
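As a rough sketch of the “default(present)” case: the program, array names and sizes below are made up for illustration, and only the directives come from this thread. The array c stands in for a newly added array that is easy to forget in the data clause.

```fortran
program default_present_demo
  implicit none
  integer, parameter :: n = 1024
  real :: a(n), b(n), c(n)   ! c plays the role of a newly added array
  integer :: i

  a = 1.0
  b = 2.0
  c = 0.0

  ! Outer data region: every array used on the device is listed here.
  !$acc data copyin(a) copy(b, c)

  ! default(present) tells the compiler to assume the arrays are already
  ! on the device. If c were dropped from the data clause above, an
  ! OpenACC build should stop with a run-time error at this region
  ! instead of quietly copying c back and forth.
  !$acc parallel loop default(present)
  do i = 1, n
     b(i) = a(i) + b(i)
     c(i) = 2.0 * b(i)
  end do

  !$acc end data

  print *, b(1), c(1)
end program default_present_demo
```

Without OpenACC enabled the directives are treated as comments, so the same source still builds and runs on the host.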
perfect - thanks…
so just to clarify, by adding default(none) like below, this will tell me (at runtime) if there are any arrays used inside the parallel region that have NOT been included in the data copy() statement?
!$acc kernels loop independent default(none)
do k = 1,nk
   !$acc loop independent
   do j = 1,nj
      !$acc loop independent
      do i = 1,ni
With “default(none)” the error would occur at compile time. “default(present)” would be runtime.
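For illustration, here is a small sketch of “default(none)” when the data region lives in the caller. The module, subroutine and array names are hypothetical. Because the data region is not in the same scope as the compute region, the arrays are listed in a present() clause on the compute region itself; leaving one of them out should then be rejected at compile time. The sizes are named constants so the sketch does not depend on how a given compiler treats scalars under default(none).

```fortran
module scale_mod
  implicit none
  integer, parameter :: ni = 8, nj = 8, nk = 8
contains
  subroutine scale3d(u, v)
    real, intent(in)    :: u(ni, nj, nk)
    real, intent(inout) :: v(ni, nj, nk)
    integer :: i, j, k

    ! default(none): every array used below must appear in a data clause.
    ! Removing v from present() should turn into a compile-time error.
    !$acc kernels loop independent default(none) present(u, v)
    do k = 1, nk
       !$acc loop independent
       do j = 1, nj
          !$acc loop independent
          do i = 1, ni
             v(i, j, k) = 2.0 * u(i, j, k)
          end do
       end do
    end do
  end subroutine scale3d
end module scale_mod

program default_none_demo
  use scale_mod
  implicit none
  real :: u(ni, nj, nk), v(ni, nj, nk)

  u = 1.0
  v = 0.0

  ! The data region sits in the caller, as in the original question.
  !$acc data copyin(u) copyout(v)
  call scale3d(u, v)
  !$acc end data

  print *, v(1, 1, 1)
end program default_none_demo
```

Combining the two ideas is also possible: present() names the arrays for the compile-time check, and the run-time presence check still catches an array that was listed here but never placed in the caller's data region.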
that's very useful to know - thanks!