I have a larger program that I am trying to convert so that the computationally intensive part runs on an NVIDIA GPU using OpenACC.
However, the converted program does not produce the expected results. The part that runs on the GPU calls subroutines that use variables declared in a separate module, and this seems to be what causes the problem. I have reduced the problem to the attached files: test_link.zip (1.1 KB)
Please ignore the OpenMP directives. I tried to do the same thing with gcc 10.2 using OpenMP, but had other issues (and have sent a corresponding message to their mailing list).
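For readers who cannot open the attachment, the general pattern in question (a device routine that reads a module variable) can be sketched roughly as below. This is a hypothetical illustration, not the contents of test_link.zip: the names `shared_vars`, `kernels`, `factor` and `scale_value` are invented, and the use of `declare create` plus `update device` is an assumption about how module data is commonly made visible to device routines, not a confirmed diagnosis of the problem reported here.

```fortran
! Hypothetical sketch only: all names are illustrative and are not
! taken from test_link.zip.
module shared_vars
  implicit none
  ! Module variable that is read inside the device routine.
  integer :: factor
  ! Request a device copy of the module variable.
  !$acc declare create(factor)
end module shared_vars

module kernels
  implicit none
contains
  subroutine scale_value(x)
    ! Compile this routine for the device as a sequential routine.
    !$acc routine seq
    use shared_vars, only: factor
    integer, intent(inout) :: x
    x = x * factor
  end subroutine scale_value
end module kernels

program sketch
  use shared_vars, only: factor
  use kernels, only: scale_value
  implicit none
  integer, parameter :: n = 8
  integer :: a(n), i

  factor = 4
  ! Copy the host value of the module variable to its device copy.
  !$acc update device(factor)
  a = 6

  ! Each iteration calls the device routine on one array element.
  !$acc parallel loop copy(a)
  do i = 1, n
    call scale_value(a(i))
  end do

  print *, a
end program sketch
```

In this sketch, leaving out the `update device` line would leave the device copy of `factor` unset, which is one way a result could differ between host and device runs; whether that applies to the attached test case is not known.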
I use PGI Community Edition 19.10 and compile with:
pgfortran -O4 -acc -ta=tesla,cc35 -Minfo=all,mp,accel -Mcuda=cuda10.0 test_link.f90 common_vars.f90 parameters.f90 -o test_link
When I run the program, every element of array IS2 has the value 0, whereas I expected 24.
If I remove (or comment out) the subroutine TEST(), its interface, and its call, the results are correct for all arrays.
Any hints on what could be going wrong?