NVC++-W-0155-External and Static variables are not supported in acc routine, and break problem

My question has two parts.

Part one: my code uses a static variable, and when I compile it with either
#pragma acc kernels
or
#pragma acc parallel loop
I get this message:
NVC++-W-0155-External and Static variables are not supported in acc routine
What should I use instead of static? I use static because I want the variable to increase each time I call the function, and I don’t want to put a loop inside the function. Is there a way to keep a static variable?

Part two: I took this code from an online site and tried to run it with the OpenACC kernels pragma to see and understand the parallelization process.

#include <iostream>
using namespace std;
void PrintDiv(int a, int b, int n)
{
    cout << a << "/ " << b << "=";
    if (b == 0)
    {
        //if divide by 0
        cout << (a >= 0 ? "+INF" : "-INF") << endl;
        return;
    }
    if (a == 0)
    {
        cout << 0 << endl;
        return;
    }
    if (n <= 0)
    {
        //just the integer part
        cout << a / b << endl;
        return;
    }
    if (((a > 0) && (b < 0)) || ((a < 0) && (b > 0)))
    {
        //check the sign
        cout << "-";
        a = a > 0 ? a : -a;
        b = b > 0 ? b : -b;
    }
    int c = a / b;
    #pragma acc kernels
    for (int i = 0; i < n; i++) // iterated
    {
        printf("%d" , c);
        //cout << c;
        a -= b * c;
        if (a == 0)
        break; // full division no need to continue
        a *= 10;
        c = a / b;
        if (i == 0)
            //cout << ".";
        printf(".");
    }
    printf("\n");
    //cout << endl;
}
int main()
{
    cout << "Please give me three integers: " << endl;
    int a, b, n;

    #pragma acc kernels
    do
    {
        scanf("%d %d %d" , &a,&b,&n);
        //cin >> a >> b >> n;
        PrintDiv(a, b, n);
    } while (n >= 0);
    scanf("%d", &n);
    //cin >> n;
}

When I compile it with pgc++ -fast -acc -ta=tesla -Minfo=accel -o output test.cpp

two errors appear. The first one is:

line 38: error: branching into or out of a parallel region is not allowed
break; // full division no need to continue

When I remove the break, another error appears:

NVC++-S-0155-Unsupported nested compute construct in compute construct or acc routine (serial.cpp: 30)
NVC++-W-0155-Accelerator region ignored; see -Minfo messages (serial.cpp)
PrintDiv(int, int, int):
0, Accelerator region ignored
30, Accelerator restriction: invalid loop
NVC++/x86-64 Linux 20.7-0: compilation completed with severe errors

I don’t understand why these issues appear. I have already tried other examples and they work fine for me.
How can I overcome these issues?

Thank you in advance

I’d want to understand the specific case, but in general, global static variables need to be added to a “declare” directive. “declare” creates a data region with the same scope and lifetime as the scoping unit in which the variable is defined.

For example:

#include <iostream>
#include <cstdlib>

static int counter;
#pragma acc declare create(counter)

void addCount(int val) {
#pragma acc atomic update
counter += val;
}

int main() {

    counter = 0;
#pragma acc update device(counter)
#pragma acc parallel loop
    for (int i=0; i<1024;++i) {
        addCount(1);
    }
#pragma acc update host(counter)
    std::cout << "Count: " << counter << std::endl;
    exit(0);
}

Note the use of the “update” directive to synchronize the host and device copies of the “counter” variable. Also, the use of the “atomic” directive is necessary here to avoid a race condition.
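If the static in your code is a function-local counter that each call bumps, one possible replacement is to move it to file scope under “declare” and read-and-increment it with “atomic capture”, so every call gets its own value. This is only a sketch, since I haven’t seen your code: the names “callCount”, “nextTicket” and “runTickets” are made up for illustration. Note that calls arrive in no particular order when run in parallel, so only the set of values is predictable, not which iteration gets which.

```cpp
#include <vector>

// File-scope counter standing in for a function-local "static int" (hypothetical name).
static int callCount = 0;
#pragma acc declare create(callCount)

// Returns the counter value seen by this call, then increments the counter.
// "atomic capture" makes the read-and-increment one indivisible step.
#pragma acc routine seq
int nextTicket() {
    int mine;
#pragma acc atomic capture
    mine = callCount++;
    return mine;
}

// Calls nextTicket() n times from a parallel loop and records the value each
// iteration received. The assignment of values to iterations is unordered.
std::vector<int> runTickets(int n) {
    std::vector<int> tickets(n, -1);
    int *t = tickets.data();
    callCount = 0;
#pragma acc update device(callCount)
#pragma acc parallel loop copyout(t[0:n])
    for (int i = 0; i < n; ++i) {
        t[i] = nextTicket();
    }
#pragma acc update host(callCount)
    return tickets;
}
```

Built without -acc the pragmas are ignored and the same code runs sequentially on the host, which is a handy way to check the logic first.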

line 38: error: branching into or out of a parallel region is not allowed

The error is correct: it is illegal in OpenACC to jump out of a compute region. The problem is that the “break” creates a loop dependency. To parallelize the loop, all of its iterations must be able to execute concurrently, but the “break” prevents this since later iterations can’t know whether they are to be executed or not. The only way to find out is to execute all preceding iterations first (i.e. run sequentially).
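For comparison, when the iterations are otherwise independent and the “break” only marks where a search stops, the early exit can often be replaced by running every iteration and combining the results with a reduction. A minimal sketch of that idea, using a hypothetical “firstIndexOf” helper that is not taken from your code:

```cpp
// Returns the index of the first element equal to target, or n if there is none.
// Every iteration runs; a min reduction takes the place of the early exit.
int firstIndexOf(const int *data, int n, int target) {
    int first = n;
#pragma acc parallel loop copyin(data[0:n]) reduction(min:first)
    for (int i = 0; i < n; ++i) {
        if (data[i] == target && i < first) {
            first = i;
        }
    }
    return first;
}
```

The trade-off is that the loop always does all n iterations, which is usually fine on a GPU but does more work than the sequential version with “break”.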

Note that your loop also contains a race condition on the “a” and “c” variables, so even if you change the “break” to an “if” statement to fix the earlier error and force parallelization (i.e. use “parallel loop” instead of “kernels”), you’ll get incorrect answers. Unfortunately, using atomics won’t help here since you’re both updating and using the variables in the loop. You may still be able to offload the loop to the device by using a “serial loop”, but the loop is not parallelizable.
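To make the difference concrete, here is a small sketch contrasting a loop whose iterations are independent with one that carries a value from one iteration to the next, as the long division does. The function names are hypothetical, and “decimalDigits” assumes 0 <= a < b with b small enough that 10*b fits in an int.

```cpp
// Independent iterations: y[i] depends only on x[i], so "parallel loop" is safe.
void scaleByTwo(const int *x, int *y, int n) {
#pragma acc parallel loop copyin(x[0:n]) copyout(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = 2 * x[i];
    }
}

// Loop-carried dependency: iteration i needs the remainder left by iteration i-1,
// so the loop can be offloaded with "serial loop" but not run in parallel.
// Writes the first n digits after the decimal point of a/b into digits[].
void decimalDigits(int a, int b, int n, int *digits) {
    int rem = a;
#pragma acc serial loop copyout(digits[0:n])
    for (int i = 0; i < n; ++i) {
        rem *= 10;
        digits[i] = rem / b;
        rem %= b;
    }
}
```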

NVC++-S-0155-Unsupported nested compute construct in compute construct or acc routine

While the OpenACC standard allows for nested compute regions, we don’t support this feature. We’ve yet to find a compelling use case for dynamic parallelism (i.e. launching kernels from within another kernel), so we haven’t added it.

30, Accelerator restriction: invalid loop

This is the “do-while” loop. Parallel loops must be countable (i.e. the number of iterations of the loop is known before the loop is offloaded).

scanf("%d %d %d" , &a,&b,&n);

“scanf” can’t be used within device code since the device does not have access to the host’s I/O.

To successfully offload this code and get correct results, you’d need to do something like the following. Performance will be poor since it can’t be run in parallel.

% cat test.cpp
#include <iostream>
#include <cstdlib>
#include <stdio.h>
using namespace std;
void PrintDiv(int a, int b, int n)
{
    cout << a << "/ " << b << "=";
    if (b == 0)
    {
        //if divide by 0
        cout << (a >= 0 ? "+INF" : "-INF") << endl;
        return;
    }
    if (a == 0)
    {
        cout << 0 << endl;
        return;
    }
    if (n <= 0)
    {
        //just the integer part
        cout << a / b << endl;
        return;
    }
    if (((a > 0) && (b < 0)) || ((a < 0) && (b > 0)))
    {
        //check the sign
        cout << "-";
        a = a > 0 ? a : -a;
        b = b > 0 ? b : -b;
    }
    int c = a / b;
    #pragma acc serial loop
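    for (int i = 0; i < n; i++) // iterated
    {
        // Replaces the "break": once the remainder reaches 0 the division is
        // complete, so the remaining iterations do nothing.
        if (a != 0) {
            printf("%d" , c);
            a -= b * c;
            if (a != 0) {
                a *= 10;
                c = a / b;
                if (i == 0)
                    printf(".");
            }
        }
    }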
    printf("\n");
    //cout << endl;
}
int main()
{
    cout << "Please give me three integers: " << endl;
    int a, b, n;

    do
    {
        scanf("%d %d %d" , &a,&b,&n);
        //cin >> a >> b >> n;
        PrintDiv(a, b, n);
    } while (n >= 0);
    scanf("%d", &n);
    //cin >> n;
}
% nvc++ test.cpp -Minfo=accel -acc
PrintDiv(int, int, int):
     32, Accelerator serial kernel generated
         Generating Tesla code
         34, #pragma acc for seq
% a.out
Please give me three integers:
3456 9876 1024
3456/ 9876=0.349939246658566221142162818955042527339003645200486026731470230862697448359659781287970838396111786148238153098420413122721749696233292831105710814094775212636695018226002430133657351154313487241798298906439854191980558930741190765492102065613608748481166464155528554070473876063183475091130012150668286755771567436208991494532199270959902794653705953827460510328068043742405832320777642770352369380315917375455650060753341433778857837181044957472660996354799513973268529769137302551640340218712029161603888213851761846901579586877278250303766707168894289185905224787363304981773997569866342648845686512758201701093560145808019441069258809234507897934386391251518833535844471445929526123936816524908869987849331713244228432563791008505467800729040097205346294046172539489671931956257594167679222357229647630619684082624544349939246658566221142162818955042527339003645200486026731470230862697448359659781287970838396111786148238153098420413122721749696233292831105710814094775212636695018226002430133657351154313487241798298