OpenACC "declare link" with routine called in target region

venetis · September 4, 2020, 3:01pm

Hello,

I have a larger program that I am trying to convert so that the computationally intensive part runs on an NVIDIA GPU using OpenACC. However, the program does not produce the expected results. The part that runs on the GPU calls subroutines that use variables declared in a separate module, and this seems to be causing the problem. I have reduced the issue to the attached files: test_link.zip (1.1 KB)

Please ignore the OpenMP directives. I tried to do the same thing with gcc 10.2 using OpenMP, but had other issues (and have sent a corresponding message to their mailing list).

I use PGI Community Edition 19.10 and compile with:
pgfortran -O4 -acc -ta=tesla,cc35 -Minfo=all,mp,accel -Mcuda=cuda10.0 test_link.f90 common_vars.f90 parameters.f90 -o test_link

When I run the program, every element of array IS2 has the value 0, whereas I was expecting 24.

If I remove (or comment out) the subroutine TEST(), its interface, and the call to it, the result is correct for all arrays.

Any hints on what could be going wrong?

Ioannis

venetis · September 9, 2020, 8:34am

Hello,

Just a note that I managed to solve the problem. Here is the solution, in case someone else faces the same issue:

Instead of the “!$ACC DECLARE TARGET LINK(NR)” directive in common_vars.f90, I used “!$ACC DECLARE TARGET TO(NR)”. In addition, in test_link.f90, just before the region that runs on the GPU, I added the directive “!$ACC UPDATE DEVICE(NR)” so that NR on the device receives the latest value from the host.
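
For anyone who wants to see the pattern in code, here is a minimal sketch. It is not the contents of test_link.zip. The module name work, the subroutine SET_ELEM, and the array size are made up for illustration. The directive spellings quoted above follow OpenMP's "declare target" form; the sketch assumes the plain OpenACC equivalent "declare create" to give NR a device copy, followed by "update device" to set its value.

```fortran
! Sketch only: hypothetical names, not the original test case.
module common_vars
  implicit none
  integer :: NR
  ! Assumption: "declare create" stands in for the directive used in the post.
  ! It gives NR a device copy but does not set its value.
  !$acc declare create(NR)
end module common_vars

module work
  use common_vars
  implicit none
contains
  ! Hypothetical device routine that reads the module variable NR.
  subroutine set_elem(x)
    !$acc routine seq
    integer, intent(out) :: x
    x = NR
  end subroutine set_elem
end module work

program declare_update_sketch
  use common_vars
  use work
  implicit none
  integer, parameter :: n = 8
  integer :: is2(n), i

  NR = 24
  ! Push the host value of NR to the device copy before the compute region.
  !$acc update device(NR)

  !$acc parallel loop copyout(is2)
  do i = 1, n
    call set_elem(is2(i))
  end do

  print *, is2
end program declare_update_sketch
```

Built without OpenACC enabled, the directives are treated as comments and the loop runs on the host, which is a quick way to check the expected values before comparing against the GPU build.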

Ioannis

MatColgrove · September 10, 2020, 3:43pm

Makes sense. Using a declare directive on a module variable causes the device copy to be created when the device is initialized. You could also have solved this by using “declare copyin” instead of “declare link”, so that the variable gets initialized, but that would only work if the device is initialized after NR is assigned. Using the “update” directive is the better way to go, since you can then be sure the variable’s value on the device is set.
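
A rough sketch of the “declare copyin” alternative, for comparison. The module and program names are hypothetical, and the comments restate the ordering caveat above rather than a guarantee about any particular compiler version.

```fortran
! Sketch only: hypothetical names, illustrating the "declare copyin" variant.
module common_vars_copyin
  implicit none
  integer :: NR
  ! The device copy is created and initialized from the host value
  ! when the device is initialized.
  !$acc declare copyin(NR)
end module common_vars_copyin

program declare_copyin_sketch
  use common_vars_copyin
  implicit none
  integer, parameter :: n = 8
  integer :: is2(n), i

  ! NR must be assigned before the device is initialized for the
  ! copyin to pick up this value; otherwise the device copy is stale.
  NR = 24

  ! Uncommenting the next line removes the ordering dependence:
  ! !$acc update device(NR)

  !$acc parallel loop copyout(is2)
  do i = 1, n
    is2(i) = NR
  end do

  print *, is2
end program declare_copyin_sketch
```

The only difference from the “update” approach is where the value comes from: here it depends on when the runtime initializes the device, whereas an explicit update sets it at a known point in the program.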