Incorrect source file in ACC callback

I am trying out the OpenACC profiling layer, see e.g. https://www.openacc.org/sites/default/files/inline-images/Specification/OpenACC.3.0.pdf. I am using pgcc from the HPC SDK version 22.9. In this example code

#include <stdio.h>
#include "foo.h"

int main (int argc, char *argv[]) {
   int N = 10000;
   int a[N], b[N], c[N];

#pragma acc parallel loop async(1)
   for (int i = 0; i < N; i++) {
      a[i] = i;
   }

#pragma acc parallel loop async(2)
   for (int i = 0; i < N; i++) {
      b[i] = 2 * i;
   }

#pragma acc wait(1) async(2)
#pragma acc parallel loop async(2)
  for (int i = 0; i < N; i++) {
     c[i] = a[i] + b[i];
  }

#pragma update self(c[0:N]) async(2)
#pragma acc wait (1,2)
  printf ("c[0]: %d\n", c[0]);
  printf ("c[N-1]: %d\n", c[N-1]);
}

with foo.h

void foo (int *x, int N) {
#pragma acc parallel loop
   for (int i = 0; i < N; i++) {
      x[i]++;
   }
}  

I want to trace the wait pragma using

#include <stdio.h>
#include "acc_prof.h"
void wait_start (acc_prof_info *prof_info, acc_event_info *event_info, acc_api_info *api_info) {
   printf ("Wait start: %s %d %d\n", prof_info->src_file, prof_info->line_no, prof_info->end_line_no);
}

void acc_register_library (acc_prof_reg register_ev, acc_prof_reg unregister_ev, acc_prof_lookup_func lookup) {
   register_ev (acc_ev_wait_start, wait_start, 0);
}

which is compiled into a shared library and loaded via LD_PRELOAD. I get the following output:

Wait start: /home/cweiss/tmp/test_openacc_async_wait/foo.h 28 28
Wait start: /home/cweiss/tmp/test_openacc_async_wait/foo.h 28 28

The source file attributed to the wait region is not the main file, but the included foo.h. Note that the function implemented in foo.h is not used at all. Moreover, if I remove the pragma from the function foo, the output shows the main source file.
I guess this is a bug, or am I misunderstanding something? The documentation is not very clear about what the prof_info.src_file variable precisely indicates.

Thanks for the report, Christian, and nice example. I was able to recreate the issue here and have filed a problem report, TPR #32683.

Looks like the issue only occurs when the header contains the definition of foo. If you change the header to include only the prototype and put the definition in a separate source file, then it works as expected.
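
A minimal sketch of that workaround, assuming the definition moves to a new file that I am calling foo.c here (the file name is hypothetical, not something from the report). Both pieces are shown in one listing for brevity; in practice they would be two files:

```c
/* ---- foo.h: prototype only, no OpenACC directive in the header ---- */
#ifndef FOO_H
#define FOO_H
void foo (int *x, int N);
#endif

/* ---- foo.c (hypothetical file name): the definition, built as its
   own translation unit. In the real file this would start with
   #include "foo.h" ---- */
void foo (int *x, int N) {
#pragma acc parallel loop
   for (int i = 0; i < N; i++) {
      x[i]++;   /* increment every element, as in the original foo */
   }
}
```

The main file keeps its #include "foo.h" unchanged, and foo.c would then be added to the compile line next to the main source (for example, something along the lines of nvc -acc test.c foo.c).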

-Mat

Hi Christian,

Apologies for the late notification. TPR #32683 was fixed in our 23.1 release.

% nvc -acc test.c -fast trace.c -V23.1 ; a.out
test.c:
trace.c:
Wait start: <path>/fs32714/test.c 26 26
Wait start: <path>/fs32714/test.c 26 26
c[0]: 0
c[N-1]: 29997

-Mat