OpenACC nested C structs

Hi,
I have run into a problem with OpenACC under pgi/12.10.0: it does not seem to allow C structs that contain more than one struct member. For example, the following code will not parallelize and gives an invalid loop error in the -Minfo output. However, if I have only one double3 struct inside the st struct, it seems to parallelize. Is this something that has been fixed in a newer version?

Thanks,
Adam

typedef struct ST st;
typedef struct TUPLE double3;

#include <stdio.h>
#include <stdlib.h>

struct TUPLE {
    double x;
    double y;
    double z;
};

struct ST {
    double3 pos;
    double3 vel;
};

int main(int argc, char *argv[])
{
    st *mystructs = (st*) malloc(sizeof(st)*1000);

    int i;
    st tmpST;
    #pragma acc kernels copy(mystructs[0:1000])
    for(i=0; i<1000; i++){
        tmpST = mystructs[i];
    }

    return 0;
}

Hi Adam,

The problem is with the implicit deep copy between the two structs. If you change this to an explicit, member-by-member copy, then you should be fine.

% cat struct.c
typedef struct ST st;
typedef struct TUPLE double3;

#include <stdio.h>
#include <stdlib.h>

struct TUPLE {
    double x;
    double y;
    double z;
};

struct ST {
    double3 pos;
    double3 vel;
};

int main(int argc, char *argv[])
{
    st *mystructs = (st*) malloc(sizeof(st)*1000);

    int i;
    st tmpST;
    #pragma acc kernels copy(mystructs[0:1000])
    for(i=0; i<1000; i++){
        tmpST.pos.x = mystructs[i].pos.x;
        tmpST.pos.y = mystructs[i].pos.y;
        tmpST.pos.z = mystructs[i].pos.z;
        tmpST.vel.x = mystructs[i].vel.x;
        tmpST.vel.y = mystructs[i].vel.y;
        tmpST.vel.z = mystructs[i].vel.z;
    }

    return 0;
}
% pgcc -acc -Minfo=accel struct.c
main:
     24, Generating present_or_copy(mystructs[0:1000])
         Generating NVIDIA code
         Generating compute capability 1.0 binary
         Generating compute capability 2.0 binary
         Generating compute capability 3.0 binary
     25, Loop is parallelizable
         Accelerator kernel generated
         25, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */

Hope this helps,
Mat
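
For anyone who wants to reuse the workaround, here is a minimal sketch of the same member-by-member copy wrapped in a function that writes into a second array, so the loop has a result you can check on the host. The name copy_structs, the dst array, and the copyin/copyout and loop independent clauses are my own assumptions for illustration and are not from the posts above; I have not verified this against PGI 12.10 specifically.

```c
#include <assert.h>
#include <stddef.h>

typedef struct TUPLE { double x, y, z; } double3;
typedef struct ST { double3 pos; double3 vel; } st;

/* Hypothetical helper (not from the thread): copies n structs from src to
 * dst one scalar member at a time, so no whole-struct assignment appears
 * inside the accelerated loop. */
void copy_structs(const st *src, st *dst, int n)
{
    int i;

    /* Host-side sanity check, done before entering the compute region. */
    assert(src != NULL && dst != NULL && n >= 0);

    /* If OpenACC is not enabled, the pragmas are ignored and the loop
     * simply runs on the host. */
    #pragma acc kernels copyin(src[0:n]) copyout(dst[0:n])
    {
        /* Assumption: src and dst do not overlap, so iterations are independent. */
        #pragma acc loop independent
        for (i = 0; i < n; i++) {
            dst[i].pos.x = src[i].pos.x;
            dst[i].pos.y = src[i].pos.y;
            dst[i].pos.z = src[i].pos.z;
            dst[i].vel.x = src[i].vel.x;
            dst[i].vel.y = src[i].vel.y;
            dst[i].vel.z = src[i].vel.z;
        }
    }
}
```

With the arrays from the example above, a call would look like copy_structs(mystructs, out, 1000), where out is a second st array of the same length. Compiling with the same pgcc -acc -Minfo=accel line should show whether the loop is accepted by your compiler version.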

Thanks Mat, that works.

Mat,
Just so I’m clear, is this considered a bug that may be fixed in the future?

No, it’s more of a limitation. Though, I put in a feature request (TPR#19218) and we’ll see what can be done. Your case is simpler than others I’ve seen, so it may be possible.

- Mat

TPR 19218 - OpenACC: request for support of C struct copy assignments has been corrected in the current 15.10 release.

thanks,
dave