Hello everyone,

I have the following problem. I have a main subroutine, let us call it `main_function` (for 3D B-splines), which takes several tensors as input.

This subroutine contains only IF-conditions. If a condition is satisfied, another subroutine is called. Let us call these subroutines `function_a`, `function_b`, and `function_c`; they are parallelizable.

The structure is as follows

```
subroutine main_function(paras)
  if (1) then
    call function_a
  else if (2) then
    call function_b
  else if (3) then
    call function_c
  end if
end subroutine main_function
```

with

```
subroutine function_a(paras)
  !$acc parallel loop present(....)
  do
    ! heavy parallel calcs
  end do
  ! output: eta
end subroutine function_a

subroutine function_b(paras)
  !$acc parallel loop present(....)
  do
    ! heavy parallel calcs
  end do
  ! output: eta
end subroutine function_b

subroutine function_c(paras)
  !$acc parallel loop present(....)
  do
    ! heavy parallel calcs
  end do
  ! output: eta
end subroutine function_c
```

The subroutines `function_a`, `function_b`, and `function_c` output a B-spline tensor (`eta`) that is calculated on the GPU. I don't want to move this tensor to the host since it is not needed there. However, after `eta` has been calculated on the GPU by `main_function`, an interpolation subroutine `interpolate3D` is called to interpolate the function. The definition of `interpolate3D` is something like

```
subroutine interpolate3D(eta, x, y, z, fAtxyz)
  !$acc routine seq
  ! interpolate ...
end subroutine interpolate3D
```

To summarize, the pseudo-code is something like

```
call main_function(paras)
!$acc parallel loop present(x, y, z, eta, fAtxyz)
do i = 1, N
  call interpolate3D(eta, x(i), y(i), z(i), fAtxyz(i))
end do
```
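As a minimal, compilable sketch of this call sequence (an illustration under assumptions, not the attached code): the names `eta_pipeline`, `calc_eta`, `interp_nearest`, and `run_pipeline` are hypothetical, `eta` is one-dimensional, a nearest-neighbour lookup stands in for the 3D B-spline interpolation, and an enclosing `!$acc data` region is assumed to be what keeps `eta` on the device between the two steps.

```fortran
module eta_pipeline
  implicit none
contains

  ! Hypothetical stand-in for function_a / function_b / function_c:
  ! fills eta on the device; the formula depends on `mode`.
  subroutine calc_eta(mode, n, eta)
    integer, intent(in)    :: mode, n
    real(8), intent(inout) :: eta(n)
    integer :: i
    if (mode == 1) then
      !$acc parallel loop present(eta)
      do i = 1, n
        eta(i) = 2.0d0 * real(i, 8)
      end do
    else
      !$acc parallel loop present(eta)
      do i = 1, n
        eta(i) = real(i, 8)**2
      end do
    end if
  end subroutine calc_eta

  ! Nearest-neighbour lookup; `routine seq` makes it callable
  ! from inside a device loop (it does not launch anything itself).
  subroutine interp_nearest(n, eta, x, fx)
    !$acc routine seq
    integer, intent(in)  :: n
    real(8), intent(in)  :: eta(n), x
    real(8), intent(out) :: fx
    integer :: idx
    idx = min(max(nint(x), 1), n)   ! clamp to the valid index range
    fx  = eta(idx)
  end subroutine interp_nearest

  ! Assumed driver: one data region spans both steps, so eta is
  ! created on the device and never copied to or from the host.
  subroutine run_pipeline(mode, n, np, xp, f)
    integer, intent(in)  :: mode, n, np
    real(8), intent(in)  :: xp(np)
    real(8), intent(out) :: f(np)
    real(8), allocatable :: eta(:)
    integer :: i
    allocate(eta(n))
    !$acc data create(eta) copyin(xp) copyout(f)
    call calc_eta(mode, n, eta)
    !$acc parallel loop present(eta, xp, f)
    do i = 1, np
      call interp_nearest(n, eta, xp(i), f(i))
    end do
    !$acc end data
    deallocate(eta)
  end subroutine run_pipeline

end module eta_pipeline
```

With `mode = 1`, `n = 10`, and `xp = [1.2, 4.6, 20.0]`, `run_pipeline` should return `f = [2, 10, 20]` (the last point is clamped to index 10). Compiled without an OpenACC flag, the directives are treated as comments and the same code runs on the CPU, which gives a reference to compare the GPU results against.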

My problems and questions are:

1)- When I don't use `!$acc update self (eta)` before the loop, the results are completely wrong. Does this mean that the `present` clause doesn't correctly find `eta`, calculated by `main_function`, on the GPU, so that one needs to update the host and then copy it back to the GPU?

2)- How can I ensure that `interpolate3D` is running on the GPU? For example, if I don't have the above loop, does adding only `!$acc routine seq` ensure that it runs on the GPU and looks up the different quantities there?

3)- In fact, when there is no loop, adding `!$acc update self (eta)` is required to get correct results. Does this mean that in this case the subroutine is executed on the CPU?

4)- To summarize, if I have two subroutines, where the first chooses between different subroutines based on if-conditions to calculate a vector or tensor and keeps it on the GPU (I don't want to update the host), while the second uses this vector to perform some calculations on the GPU, how do I do this correctly with OpenACC?

I attached a very simple example concerning the questions above. The subroutine `calc_etaVec` calculates `eta` on the GPU, while the subroutine `calcFunAtx` interpolates at the position `xp` using `eta` (nearest-neighbor interpolation). I would like to know, if possible, how to allow `calcFunAtx` to work directly with the GPU data. Moreover, comments, corrections, and/or advice concerning the implementation are very welcome.

program.f90 (1.9 KB)

Sorry for the long post, and thank you very much for your help.