Using OpenACC with C++ class member functions that have been compiled to static or shared libraries

I have a library of functions declared in a .h file and defined in a .cpp file, with some functions that have been decorated with #pragma acc routine seq information. When I compile the .cpp into a static (.a) or shared library (.so), then if I want to use these device library functions in a different .cpp file, I can simply include the .h file and forward declare the functions at the top of the .cpp file with #pragma acc routine seq information.
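As a rough illustration of the free-function pattern described above, here is a sketch that collapses the two translation units into a single file so it can be compiled on its own. The names `square` and `square_all` are hypothetical and not taken from the uploaded files. In the real layout, the definition would live in the library's .cpp file and the forward declaration in the calling .cpp file.

```cpp
// Caller side: forward declaration carrying the routine directive, so the
// compiler knows a device version of square() exists somewhere.
#pragma acc routine seq
float square(float x);

// Caller side: a compute region that calls the library routine.
void square_all(float* data, int n) {
#pragma acc parallel loop copy(data[0:n])
    for (int i = 0; i < n; ++i) {
        data[i] = square(data[i]);
    }
}

// Library side: the definition, also decorated so that a device version
// is generated when the library is built.
#pragma acc routine seq
float square(float x) { return x * x; }
```

If OpenACC is not enabled, the pragmas are ignored and the loop simply runs on the host, which makes the sketch easy to sanity-check before building for the GPU.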

However, if I instead have a class, with various member functions, that is compiled to a static or shared library, then I cannot use the above method (i.e. forward declaring functions with #pragma acc routine seq information), since I then get an error saying that I cannot redefine functions. Thus, the only way I’ve currently been able to work around this problem is to not compile the class to a static or shared library, and instead simply define all of its functions in the header itself.

Is there a better solution to this problem? Ideally I’d like to be able to have a class compiled to a .a or .so file on disk, and still be able to link and use its #pragma acc routine seq functions in another .cpp file. Any help would be great, thanks!

Maybe I’m not understanding, but I’m not sure how you’re getting this to work with shared objects. CUDA doesn’t currently have a dynamic loader, so it’s not possible to have a compute region call a device routine contained in a shared object.

Can you provide more detail or an example of what you’re doing?

However, if I instead have a class, with various member functions, that is compiled to a static or shared library, then I cannot use the above method (i.e. forward declaring functions with #pragma acc routine seq information), since I then get an error saying that I cannot redefine functions. Thus, the only way I’ve currently been able to work around this problem is to not compile the class to a static or shared library, and instead simply define all of its functions in the header itself.

Again, I think I need more details and an example to understand what you’re doing here.

-Mat

Hi Mat,

Apologies, I’ve actually only tried with static libraries, so my shared library assumptions are incorrect.

I’ve made the following 3 test scenarios that I hope will clear things up a little. Cases 1 and 3 should compile and run without any errors, whereas case 2 should fail to compile due to missing acc routine information:

  1. Compile test_functions.cpp to a static library, and use its functions in test_executable.cpp (that is compiled to an executable), by forward declaring the functions defined in test_functions.cpp at the top of test_executable.cpp

The files for test scenario 1 have been uploaded here:
CMakeLists.txt (288 Bytes)
test_executable.cpp (730 Bytes)
test_functions.cpp (268 Bytes)
test_functions.h (134 Bytes)

  2. Compile test_class.cpp to a static library, and try to use the member functions defined inside that class in test_executable.cpp (that is compiled to an executable), by forward declaring the class member functions at the top of test_executable.cpp

The files for test scenario 2 have been uploaded here:
CMakeLists.txt (276 Bytes)
test_class.cpp (383 Bytes)
test_class.h (259 Bytes)
test_executable.cpp (895 Bytes)

  3. Declare and define TestClass inside a .h file (i.e. don’t compile it to a static library), and include this into test_executable.cpp (that is compiled to an executable).

The files for test scenario 3 have been uploaded here:
CMakeLists.txt (168 Bytes)
test_class.h (617 Bytes)
test_executable.cpp (781 Bytes)

My file structure for each test scenario was as follows:

/test_1
    /build
    CMakeLists.txt
    test_executable.cpp
    test_functions.cpp
    test_functions.h

/test_2
    /build
    CMakeLists.txt
    test_executable.cpp
    test_class.cpp
    test_class.h
    
/test_3
    /build
    CMakeLists.txt
    test_executable.cpp
    test_class.h

In each case, the following commands were run inside each respective test_x folder:

cmake -S . -B build/ -DCMAKE_CXX_COMPILER=nvc++
cd build
make
./test_executable

Let me know if you need more info and I’ll be happy to supply it!

Yes, that’s what I’m seeing, but case 2 only fails because you need to add the routine directive to the prototypes in test_class.h.

Basically, the ‘routine’ directive needs to be visible from both the callee and the caller. The callee needs it so the compiler knows to create a device routine, and the caller needs it so it knows that there’s a device routine that it can call.

While you can have routine in both the prototype and definition (though I seem to recall that g++ doesn’t like this in their OpenACC implementation), I find it easier to just put it with the prototype since it’s visible to both.
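To make the prototype-only placement concrete, here is a minimal single-file sketch. The class `Scaler` and the function `sum_scaled` are hypothetical names made up for illustration. In a real project, the class declaration would sit in the header, the member definition in the library's .cpp file, and the caller in the executable's .cpp file.

```cpp
// Header part: the directive sits on the member prototype, so every file
// that includes the header sees it (both the callee and the caller).
class Scaler {
public:
    explicit Scaler(float factor) : factor_(factor) {}
#pragma acc routine seq
    float apply(float x) const;

    float factor_;
};

// Library part: the definition needs no second directive, because the
// prototype above is already visible here.
float Scaler::apply(float x) const { return factor_ * x; }

// Caller part: a compute region that calls the member function.
float sum_scaled(Scaler s, const float* in, int n) {
    float total = 0.0f;
#pragma acc parallel loop reduction(+:total) copyin(s, in[0:n])
    for (int i = 0; i < n; ++i) {
        total += s.apply(in[i]);
    }
    return total;
}
```

The object is passed by value and copied in explicitly here to keep the data movement obvious. A class that manages its own device copy, as the test case appears to do with enter data and exit data, would handle that differently.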

I should note that the compiler is smart enough to implicitly add ‘routine seq’ to methods called within device code. The caveat is that the definition needs to be visible, which is not the case in test 2. This feature is especially useful for templated code, since it would be very difficult to add ‘routine’ everywhere.
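A small sketch of the implicit case, assuming the nvc++ behaviour described above: the template below has no routine directive at all, but its definition is visible at the call site. The names `clamp_to` and `clamp_all` are hypothetical, and other OpenACC compilers may still require an explicit directive.

```cpp
// No routine directive here. The definition is visible wherever the
// template is used, which is what the implicit 'routine seq' relies on.
template <typename T>
inline T clamp_to(T value, T lo, T hi) {
    return value < lo ? lo : (value > hi ? hi : value);
}

// Compute region calling the template. The compiler is expected to
// generate a device version of clamp_to<float> on its own.
void clamp_all(float* data, int n, float lo, float hi) {
#pragma acc parallel loop copy(data[0:n])
    for (int i = 0; i < n; ++i) {
        data[i] = clamp_to(data[i], lo, hi);
    }
}
```

Moving the body of `clamp_to` into a separately compiled library would hide the definition from the caller, and the explicit directive on the prototype would be needed again.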

% cat ../test_class.h
#pragma once

#include <stddef.h>

class TestClass {
    public:
        TestClass();
        TestClass(const float test_num);
        ~TestClass();
#pragma acc routine
        void set_test_num(const float test_num);
#pragma acc routine
        void print_test_num();

        float test_num_;
};
% make
Scanning dependencies of target test_class
[ 25%] Building CXX object CMakeFiles/test_class.dir/test_class.cpp.o
TestClass::set_test_num(float):
     14, Generating acc routine seq
         Generating Tesla code
TestClass::print_test_num():
     19, Generating acc routine seq
         Generating Tesla code
[ 50%] Linking CXX static library libtest_class.a
[ 50%] Built target test_class
Scanning dependencies of target test_executable
[ 75%] Building CXX object CMakeFiles/test_executable.dir/test_executable.cpp.o
"/scratch/mcolgrove/jeffr1992/test2/test_executable.cpp", line 42: warning: expression has no effect
      test_executable;
      ^

main:
     45, Generating Tesla code
         Generating implicit copy(test_executable) [if not already present]
TestExecutable::TestExecutable():
     27, Generating enter data copyin(this[:1])
TestExecutable::~TestExecutable():
     31, Generating exit data delete(this[:1])
TestExecutable::do_something():
     34, Generating acc routine seq
         Generating Tesla code
[100%] Linking CXX executable test_executable
[100%] Built target test_executable
% ./test_executable
test_num_: 20.000000

Aha, that worked like a charm, thanks very much Mat.
