Does the number of class hierarchy levels matter?

Hello,

My C++ and OpenACC code has a deep hierarchy of classes: the top-most class has pointer data members that point to lower-level class objects, which in turn contain pointers to even lower-level class objects. Here is an example:

#include <iostream>

class layer3{
        public:
        void createDeviceData()
        {
          std::cout << "layer3_create" << std::endl;
          #pragma acc enter data copyin(this)
        }
        protected:
        char a3;
};

class layer2{
        public:
        void createDeviceData()
        {
          std::cout << "layer2_create" << std::endl;
          #pragma acc enter data copyin(this)
          ptr->createDeviceData();
        }
        protected:
        char a2;
        layer3* ptr;
};

class layer1{
        public:
        void createDeviceData()
        {
          std::cout << "layer1_create" << std::endl;
          #pragma acc enter data copyin(this)
          ptr->createDeviceData();
        }
        protected:
        char a1;
        layer2* ptr;
};


int main()
{
        layer1* obj = new layer1;
        std::cout << "before create" << std::endl;
        obj->createDeviceData();
        std::cout << "after create" << std::endl;

}

The code compiles well with “pgc++ -acc -ta=nvidia -Minfo=accel -g test.C”, but crashes during execution:

before create
layer1_create
upload CUDA data  file=path_to_test.C function=_ZN6layer116createDeviceDataEv line=33 device=0 threadid=1 variable=_T37153480_2815 bytes=16
layer2_create
Segmentation fault (core dumped)

However, if I reduce the hierarchy level to 2, such as the one below, the code runs well.

#include <iostream>

class layer2{
        public:
        void createDeviceData()
        {
          std::cout << "layer2_create" << std::endl;
          #pragma acc enter data copyin(this)
        }
        protected:
        char a2;
};

class layer1{
        public:
        void createDeviceData()
        {
          std::cout << "layer1_create" << std::endl;
          #pragma acc enter data copyin(this)
          ptr->createDeviceData();
        }
        protected:
        char a1;
        layer2* ptr;
};


int main()
{
        layer1* obj = new layer1;
        std::cout << "before create" << std::endl;
        obj->createDeviceData();
        std::cout << "after create" << std::endl;

}

and the output is:

before create
layer1_create
upload CUDA data  file=path_to_test2.C function=_ZN6layer116createDeviceDataEv line=20 device=0 threadid=1 variable=_T27670088_2811 bytes=16
layer2_create
after create

Does this mean that the number of class hierarchy levels matters?

Thanks,
Shine

Hi Shine,

Try running the example without OpenACC and you’ll see the same seg fault. The problem is that “ptr” is never allocated, so the code seg faults when you dereference it. It’s unclear why the second example works; it should seg fault as well, but it could be luck that the initial value of ptr happens to be a valid address.

Besides allocating ptr, I’d also recommend using the “enter data attach” directive so the device address for ptr gets attached (i.e. the device copy of the class’ ptr gets filled in with the device address).

% cat test.cpp
#include <iostream>

class layer3{
        public:
        void createDeviceData()
        {
          std::cout << "layer3_create" << std::endl;
          #pragma acc enter data copyin(this)
        }
        protected:
        char a3;
};

class layer2{
        public:
        void createDeviceData()
        {
          std::cout << "layer2_create" << std::endl;
          #pragma acc enter data copyin(this)
          ptr=new layer3;
          ptr->createDeviceData();
          #pragma acc enter data attach(ptr)
        }
        protected:
        char a2;
        layer3* ptr;
};

class layer1{
        public:
        void createDeviceData()
        {
          std::cout << "layer1_create" << std::endl;
          #pragma acc enter data copyin(this)
          ptr=new layer2;
          ptr->createDeviceData();
          #pragma acc enter data attach(ptr)
        }
        protected:
        char a1;
        layer2* ptr;
};


int main()
{
        layer1* obj = new layer1;
        std::cout << "before create" << std::endl;
        obj->createDeviceData();
        std::cout << "after create" << std::endl;

}
% pgc++ test.cpp -ta=tesla -Minfo=accel
layer3::createDeviceData():
      9, Generating enter data copyin(this[:1])
layer2::createDeviceData():
     20, Generating enter data copyin(this[:1])
     23, Generating enter data attach(ptr)
layer1::createDeviceData():
     35, Generating enter data copyin(this[:1])
     38, Generating enter data attach(ptr)
% a.out
before create
layer1_create
layer2_create
layer3_create
after create
%

Hope this helps,
Mat

Hi Mat,

Thanks for your clear explanations. I am sorry I made a stupid mistake…

Two follow-up questions:

  1. With “#pragma acc enter data attach(ptr)” and PGI_ACC_NOTIFY=3, I see that there are additional lines as follows:
upload CUDA data  file=path_to_code function=_ZN6layer216createDeviceDataEv line=23 device=0 threadid=1 variable=.attach. bytes=8

In my example posted, these lines do not show up if the attach operation is not enforced. However, my actual code did not enforce the attachment, but I still saw “upload CUDA data file=path_to_code function=function_name line=line_number device=0 threadid=1 variable=.attach. bytes=8”. So presumably sometimes the compiler may enforce the attachment? I will explicitly attach device pointer to device address in my future coding as a good practice.

  2. I just added “#pragma acc enter data attach(ptr)” in my actual code, for all the ‘layers’. Then during compilation, at layer 1, the following error arose:
PGCC-S-0000-Internal compiler error.  c++ specific feature not expected      14  (path_to_layer1_location_of #pragma acc enter data attach(ptr) )
PGCC-S-0000-Internal compiler error. mkexpr: bad id    -480  (path_to_layer1_location_of #pragma acc enter data attach(ptr) )

The error went away when I removed the attach at layer1 while retaining it for all lower layers (the memory issue still persisted during execution, though). Is there a webpage that details these error codes (14 and -480), or could you briefly comment on them?

Thanks,
Shine

Hi Shine,

So presumably sometimes the compiler may enforce the attachment?

There is an implicit attach when the compiler knows that the variable is contained in an aggregate type. However, if the device data is created outside of a context where the compiler can determine that relationship, then the user needs to attach the variable explicitly.

For example:

#pragma acc enter data copyin(this, ptr[0:n])

If “ptr” is a member of the class, then it will be implicitly attached.

However, if the device data is created separately, then the explicit attach is needed.

ptr = new double[n];
#pragma acc enter data create(ptr[0:n])
classname foo(ptr);
...
// class constructor
classname(double * ptr) {
     myptr = ptr;
#pragma acc enter data copyin(this) attach(myptr)
}

In your layer example, you need the explicit attach because each layerN object’s device copy (its “this”) is created inside that class’ own method. In that context, the compiler cannot see that the object is the target of the outer layer class’ member ptr.

  2. Internal compiler error. c++ specific feature not expected

This is an internal compiler error (i.e. a compiler bug), so there isn’t a web page with the error codes.

Can you post, or send to PGI Customer Service (trs@pgroup.com), a reproducing example so we can diagnose the problem and hopefully get it fixed? Please also let us know which compiler version and platform (x86 or Power) you’re using.

One work around you might try is to use the API call to “acc_attach” instead of the pragma version.

#ifdef _OPENACC
#include <openacc.h>
#endif 
....
acc_attach((void**) &ptr);

-Mat

Thanks a lot, Mat, for the detailed explanations with examples!

The workaround of using OpenACC API acc_attach works, and fixes the problem in the code. The cause was indeed related to ‘attach’: the pointer member in the parent class needs to be explicitly attached/associated with the parent class, because its creation is outside of the parent class’s context (i.e. in a child class). This is a very valuable lesson.

Thanks again,
Shine

The workaround of using OpenACC API acc_attach works, and fixes the problem in the code.

That’s great, but if you could still send us a reproducing example, we’d appreciate it. I don’t see any similar bugs so this one is new and I’d like to get it fixed.

Thanks,
Mat

Hi Mat,

Sure, I spent some time on it, and I now have a simpler code that reproduces the error. To be safe (the code is proprietary), I will discuss this with my boss when he comes back from his vacation this coming Monday (May 20). I will be in touch.

Thanks again for your help,
Shine

Hi Mat,

I have sent an e-mail with the small reproducer code to trs@pgroup.com, after talking with my boss. The title of the e-mail is “Reporting a potential PGI compiler bug with pragma acc enter data attach”. Thanks again for the help!

Cheers,
Shine

Thanks Shine!

I was able to reproduce the error with the attach clause here and filed a problem report (TPR#27182).

-Mat