I have this code example that triggers an info message that I do not understand. This is a reproducer for a similar message I am trying to understand from production code. Here is the code (acc-function-struct-test.c):
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int x[2][10];
    int y[2][10];
    int s[2][10];
} some_settings;

#pragma acc routine seq
static int lookup(int n, int x[], int y[], int s[]) {
    int p;
    for (p=0; p<n; ++p) {
        if (x[p] > y[p]) break;
    }
    return x[p] + y[p] + s[p];
}

void foo(int n, some_settings * settings, int outp[]) {
    #pragma acc parallel loop present(settings[0:1], outp[0:n])
    for (int i=0; i<n; ++i) {
        outp[i] = lookup(i, &settings->x[0][0], &settings->y[0][0], &settings->s[0][0]);
    }
}

int main() {
    some_settings * settings = malloc(sizeof(some_settings));
    for (int i=0; i<10; ++i) {
        settings->x[0][i] = i;
        settings->x[1][i] = 100+i;
        settings->y[0][i] = 20+i;
        settings->y[1][i] = 200+i;
        settings->s[0][i] = 40+i;
        settings->s[1][i] = 400+i;
    }
    #pragma acc enter data copyin(settings[0:1])
    int outp[10] = {0};
    #pragma acc enter data copyin(outp[0:10])
    foo(10, settings, outp);
    #pragma acc exit data copyout(outp[0:10])
    for (int i=0; i<10; ++i) {
        printf("outp[%d] is %d\n", i, outp[i]);
    }
    #pragma acc exit data delete(settings[0:1])
}
If I compile this with optimizations enabled (I see the same result with versions 24.3 and 24.5), I get this output:
$ nvc -Minfo=all -O3 -acc acc-function-struct-test.c -o acc-function-struct-test
foo:
19, Generating present(settings[:1],outp[:n])
Generating implicit firstprivate(i,n)
Generating NVIDIA GPU code
21, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
22, lookup inlined, size=8 (inline) file acc-function-struct-test.c (11)
13, Accelerator restriction: induction variable live-out from loop: .inl_p_5
Loop not vectorized/parallelized: potential early exits
15, Accelerator restriction: induction variable live-out from loop: .inl_p_5
main:
28, Loop not fused: function call before adjacent loop
Generated vector simd code for the loop
37, Generating enter data copyin(settings[:1])
40, Generating exit data copyout(outp[:])
Generating enter data copyin(outp[:])
42, Loop not vectorized/parallelized: contains call
46, Generating exit data delete(settings[:1])
Inlining the lookup function and its local variable p into the accelerator region in foo seems to trigger the message "Accelerator restriction: induction variable live-out from loop: .inl_p_5". The code seems to work correctly, but this is usually a serious message that should be addressed. Can I ignore it in this case? What is this message trying to tell me? With -O0, no inlining happens and the message is gone.
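For illustration, here is a minimal sketch of a variant of lookup in which the loop counter is not read after the loop. It assumes the message refers to p being used in the return statement once the loop has ended; that reading is an assumption, not something the compiler output confirms. The name lookup_no_liveout and the helper variable idx are hypothetical, and whether nvc still reports the restriction for this form is untested.

```c
/* Hypothetical variant of lookup(): returns the same value, but the loop
   counter p is scoped to the loop and never read after it. */
#pragma acc routine seq
static int lookup_no_liveout(int n, int x[], int y[], int s[]) {
    /* Value p would hold if the loop ran to completion
       (0 when the loop body never executes). */
    int idx = (n > 0) ? n : 0;
    for (int p = 0; p < n; ++p) {
        if (x[p] > y[p]) {
            idx = p;  /* remember the early-exit position */
            break;
        }
    }
    return x[idx] + y[idx] + s[idx];
}
```

Called with the same arguments as lookup, this should produce the same outp values as the reproducer, so it could be swapped in to check whether the message changes.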