"enter data" ignored in switch statement

LSCH · October 27, 2015, 10:33pm

Hi,

I have some trouble using the “enter data” directive and I am hoping for some clarification.

The following code defines a simple struct containing a pointer to be used for a dynamic array:

struct my_struct {
  float* x;
};

I would like to use a function with a static array of this struct as argument to copy data from the host to the device:

void copy2device(my_struct s[6]){// or void copy2device(my_struct s[])
#pragma acc enter data copyin(s[0].x[0:10])
#pragma acc enter data copyin(s[1].x[0:4])
#pragma acc enter data copyin(s[2].x[5:12])
}

pgc++ 15.7 generates the following output:

copy2device(my_struct *):
  56, Accelerator clause: upper bound for dimension 0 of array 's' is unknown
         Generating enter data copyin(s[:1],s->x[:10],s->x[:4],s->x[5:12])

So it seems that the compiler doesn’t know the length of array s, fair enough, and decides to copy only the struct in the first element of the array (s[:1]). What do the following clauses, i.e. s->x[:10], s->x[:4], s->x[5:12], mean in this context? It would only make sense to allocate memory for x of the struct in the first array element, but the message only mentions s-> and no index for s. So there are three similar clauses for allocating memory for x, but which one will be used in the end?
If I put:

#pragma acc enter data copyin(s[0:6])

The compiler no longer complains about an unknown upper bound for s:

copy2device(my_struct *):
     56, Generating enter data copyin(s[:6],s->x[:10],s->x[:4],s->x[5:12])

But does it now allocate different amounts of memory for x of my_struct in the various elements of the array s?
To further complicate things I put the “enter data” into a switch statement. Some control structure might be needed to decide on when to copy what to the device:

void copy2device(my_struct s[6], int t){
#ifdef _OPENACC
#pragma acc enter data copyin(s[0:6])
  switch(t){
  case 0:
#pragma acc enter data copyin(s[0].x[0:10])
    break;
  case 1:
#pragma acc enter data copyin(s[1].x[0:4])
    break;
  case 2:
#pragma acc enter data copyin(s[2].x[5:12])
    break;
  default:
    break;
  }
#endif
}

In this case the compiler seems to ignore all “enter data” statements within the switch statement. Am I doing anything illegal wrt the OpenACC standard?

copy2device(my_struct *, int):
     29, Generating enter data copyin(s[:6])

Putting “enter data copyin(s)” within the switch statement, the compiler does not report anything, i.e. it is not generating any data copying.

I am a bit confused about this behavior and would be thankful for any kind of clarification.

Thanks,
LS

MatColgrove · October 29, 2015, 6:49pm

Hi LS,

The struct array does need to be copied (or at least created) on the GPU before you can start copying over each element’s data members. The “copyin(s[0:6])” is required before “copyin(s[0].x[0:10])” so that the compiler can then “attach” the “x” array (i.e. set the s[0].x device pointer).

Although the generated code looks correct, the compiler message “copyin(s->x[:10])” is incorrect and should be “copyin(s[0]->x[:10])”. I’ve added TPR#22043 to have this fixed.

For the switch statement, it does appear that pgc++ is removing empty cases even though there’s an OpenACC pragma there. I’ve added TPR#22045 and sent it to engineering. pgcc seems fine, so the workaround is to use pgcc or to add some host code so that pgc++ doesn’t remove the case.
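
The ordering requirement described above (parent struct array first, then each element's member array) can be sketched as a loop. This is only an illustration: the function name copy2device_all and the per-element length array len are hypothetical and do not appear in the original code. Without OpenACC enabled, the pragmas are ignored and the function just returns the total element count.

```c
#include <assert.h>
#include <stddef.h>

struct my_struct {
  float *x;
};

/* Hypothetical helper (not from the original post): copies an array of n
 * structs to the device, then each element's x array. len[i] is the assumed
 * number of floats in s[i].x. Returns the total number of floats requested. */
size_t copy2device_all(struct my_struct *s, const int *len, int n) {
  size_t total = 0;
  assert(s != NULL && len != NULL && n >= 0);

  /* Step 1: the parent array must be present on the device first. */
#pragma acc enter data copyin(s[0:n])

  for (int i = 0; i < n; ++i) {
    /* Step 2: copy the member array so that it can be attached, i.e. the
     * device copy of s[i].x is set to point at the device data. */
#pragma acc enter data copyin(s[i].x[0:len[i]])
    total += (size_t)len[i]; /* host statement following the directive */
  }
  return total;
}
```

Each element can have its own length here, since every s[i].x is copied by a separate directive with its own array section.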

Thanks!
Mat

% cat test.c
struct my_struct {
  float* x;
};

void copy2device(struct my_struct s[6], int t){
#ifdef _OPENACC
#pragma acc enter data copyin(s[0:6])
  switch(t){
  case 0:
#pragma acc enter data copyin(s[0].x[0:10])
    t=0;
    break;
  case 1:
#pragma acc enter data copyin(s[1].x[0:4])
    t=1;
    break;
  case 2:
#pragma acc enter data copyin(s[2].x[5:12])
    t=2;
    break;
  default:
    break;
  }
#endif
}
% pgc++ -c test.c -Minfo=accel -acc
copy2device(my_struct *, int):
      8, Generating enter data copyin(s[:6])
     11, Generating enter data copyin(s->x[:10])
     15, Generating enter data copyin(s->x[:4])
     19, Generating enter data copyin(s->x[5:12])

LSCH · October 29, 2015, 10:26pm

Hi Mat

thanks for the clarification. I am glad the generated code looks OK and it is just the generated message that is not accurate.

In the first code example I had forgotten to list the initial #pragma acc enter data copyin(s[0:6]), but it was present in the sample code that I compiled and that generated the quoted output. Sorry for the confusion; I had read about the necessity of it in the official PGI documentation.

Thanks,
Lutz
PS: In an earlier post you mentioned that you wanted to switch over to a ticket tracking system that is open to the public or PGI subscribers, but you couldn’t tell when at the time. Do you have any update on this?

MatColgrove · October 30, 2015, 4:28pm

PS: In an earlier post you mentioned that you wanted to switch over to a ticket tracking system that is open to the public or PGI subscribers, but you couldn’t tell when at the time. Do you have any update on this?

Our initial investigation into using NVIDIA’s external NVbugs reporting system didn’t pan out, so we’re still thinking about how to do this. There is no timeline, and no guarantee that it will happen, but it is something we want to do and we are trying to find a solution.

Mat

LSCH · November 2, 2015, 2:04pm

Hi Mat

Thanks for the update.

Regarding the workaround you suggested for the switch statement, it seems that it only works if one adds a statement AFTER the enter data directive. Otherwise the compiler does not report any “Generating enter data create …” for the arrays of the struct. So the problem does not seem to be limited to empty switch cases. A detail you may want to add to TPR#22045.

Thanks,
LS

MatColgrove · November 2, 2015, 6:55pm

Yes, it looks like the pragma may need to be put before an executable statement. I noted this in the TPR.

LSCH · November 12, 2015, 8:48am

Hi Mat

another workaround seems to be embedding the body of the case statement into curly braces:

  case 0:
    {
#pragma acc enter data copyin(s[0].x[0:10])
    }
    break;
  case 1:
...

So maybe this issue has something to do with scoping. At least this is what I noticed on another occasion.
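
For reference, a minimal sketch of the complete function with both workarounds from this thread combined: each case body is wrapped in braces, and a host statement follows each directive. The return value and the variable copied are assumptions added purely so the selected branch can be checked on the host; they are not part of the original code.

```c
#include <assert.h>
#include <stddef.h>

struct my_struct {
  float *x;
};

/* Returns the number of floats requested for the selected element, or 0
 * when t selects no element (return value is an illustrative addition). */
int copy2device(struct my_struct s[6], int t) {
  int copied = 0;
  assert(s != NULL);

  /* The struct array itself has to be on the device before its members. */
#pragma acc enter data copyin(s[0:6])

  switch (t) {
  case 0: {
#pragma acc enter data copyin(s[0].x[0:10])
    copied = 10; /* host statement after the directive */
    break;
  }
  case 1: {
#pragma acc enter data copyin(s[1].x[0:4])
    copied = 4;
    break;
  }
  case 2: {
    /* Array sections are [start:length]: 12 floats starting at index 5. */
#pragma acc enter data copyin(s[2].x[5:12])
    copied = 12;
    break;
  }
  default:
    break;
  }
  return copied;
}
```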

Thanks,
LS

brentl · April 2, 2016, 3:18am

This issue was addressed in PGI 16.1 and above. Let us know if you find otherwise.