Our main software suite uses g++. I want to compile some code with nvc++ -stdpar=gpu as a shared library and have the main software suite link against the accelerated code. But that doesn’t seem to work when the stdpar shared library is built via an object file.
Everything works fine if I compile the shared library with a single command:
$ nvc++ -o libfor_each.so for_each_lib.cpp -shared -fPIC -stdpar=gpu
$ g++ for_each_main.cpp -L. -lfor_each -o for_each.exe
$ LD_LIBRARY_PATH=. ./for_each.exe
But if I compile via an object file, it doesn’t link properly:
$ nvc++ --gcc-toolchain=/cvmfs/icecube.opensciencegrid.org/py3-v4.4.0/RHEL_7_x86_64_v2 -stdpar=gpu -fPIC -o for_each.cpp.o -c for_each.cpp
$ nvc++ -o libfor_each.so for_each.cpp.o -shared -fPIC --gcc-toolchain=/cvmfs/icecube.opensciencegrid.org/py3-v4.4.0/RHEL_7_x86_64_v2
$ g++ for_each_main.cpp -L. -lfor_each -o for_each.exe
ld: ./libfor_each.so: undefined reference to `cudaGetDeviceCount'
ld: ./libfor_each.so: undefined reference to `cudaFree'
ld: ./libfor_each.so: undefined reference to `cudaPeekAtLastError'
ld: ./libfor_each.so: undefined reference to `Mcuda_compiled'
ld: ./libfor_each.so: undefined reference to `cudaGetDevice'
ld: ./libfor_each.so: undefined reference to `__cudaRegisterFunction'
ld: ./libfor_each.so: undefined reference to `cudaDeviceGetAttribute'
ld: ./libfor_each.so: undefined reference to `cudaFuncGetAttributes'
ld: ./libfor_each.so: undefined reference to `__pgiLaunchKernelFromStub'
ld: ./libfor_each.so: undefined reference to `cudaGetErrorName'
ld: ./libfor_each.so: undefined reference to `__pgi_cuda_register_fat_binaryA'
ld: ./libfor_each.so: undefined reference to `cudaSetDevice'
ld: ./libfor_each.so: undefined reference to `cudaStreamSynchronize'
ld: ./libfor_each.so: undefined reference to `cudaMalloc'
ld: ./libfor_each.so: undefined reference to `__cudaRegisterVar'
ld: ./libfor_each.so: undefined reference to `__cudaPushCallConfiguration'
ld: ./libfor_each.so: undefined reference to `cudaGetLastError'
ld: ./libfor_each.so: undefined reference to `cudaGetErrorString'
collect2: error: ld returned 1 exit status
Are object files not allowed for stdpar? I don’t necessarily need to use them, but my build system assumes everything is compiled via object files, and working around that is annoying. I don’t think it matters, but these are the files I am using:
for_each_lib.cpp:
#include <algorithm>
#include <execution>
#include <vector>
using namespace std;

struct mul {
    void operator()(float& x) const {
        for (size_t i=0; i<0xFFFFF; i++){
            x=x*1.0000001f;
        }
    }
};

void multiply_all(std::vector<float>& v) {
    // copy into a vector allocated on the heap
    vector<float> v1(v);
    // execute kernel
    std::for_each(
        std::execution::par_unseq,
        v1.begin(), v1.end(), mul{});
    // copy result back into the caller's vector
    std::copy(v1.begin(), v1.end(), v.begin());
}
for_each_main.cpp:
#include <vector>
#include <iostream>
using namespace std;

void multiply_all(std::vector<float>& v);

int main(){
    size_t N = 1000;
    vector<float> v1(N);
    for (size_t i=0; i<N; i++){
        v1[i]=float(i)/2;
    }
    cout << "\n";
    multiply_all(v1);
    for (size_t i=0; i<N; i+=N/10){
        cout << i << " " << v1[i] << "\n";
    }
    cout << "\n";
}
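
One difference between the two builds above is that the single-command build passes -stdpar=gpu to the step that produces libfor_each.so, while the separate link step does not. The following is a minimal sketch of the two-step build with the flag on both steps. It rests on an assumption that is not confirmed here: that nvc++ only adds the CUDA and NVHPC runtime libraries (the ones providing symbols such as cudaMalloc and __pgiLaunchKernelFromStub) at link time when it sees -stdpar=gpu. The variable names are hypothetical, and the --gcc-toolchain option from the original commands is left out for brevity; it would be passed unchanged on both lines.

```shell
# Assumption: the link step needs -stdpar=gpu as well as the compile step,
# so that nvc++ pulls in the CUDA runtime when creating the shared library.
STDPAR_FLAGS="-stdpar=gpu -fPIC"   # hypothetical variable name
COMPILE_CMD="nvc++ $STDPAR_FLAGS -c for_each.cpp -o for_each.cpp.o"
LINK_CMD="nvc++ $STDPAR_FLAGS -shared for_each.cpp.o -o libfor_each.so"

if command -v nvc++ >/dev/null 2>&1; then
    # Run the two steps; the link only runs if the compile succeeded.
    $COMPILE_CMD && $LINK_CMD
else
    # Without the NVIDIA HPC SDK on PATH, just show what would be run.
    echo "nvc++ not found; commands would be:"
    echo "$COMPILE_CMD"
    echo "$LINK_CMD"
fi
```

If that assumption holds, the g++ link of for_each_main.cpp against libfor_each.so would stay the same as in the working single-command case, and a build system that always goes through object files would only need the flag added to its link flags for this library.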