As described in "Creating a shared library that utilises OpenMP offloading", there was a bug in NVHPC that did not allow the use of offloading in shared libraries. The last post from @MatColgrove said that this should now work with NVHPC 22.5 and asked for feedback in case I still see errors.
Unfortunately, my simple test did not work:
$ nvc++ --version
nvc++ 22.5-0 64-bit target on x86-64 Linux -tp zen2
NVIDIA Compilers and Tools
Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
$ nvc++ -g -O3 -std=c++17 -fpic -mp=gpu -shared test_compute.cpp -o libtest_compute.so
$ nvc++ -std=c++17 test.cpp -o test -L${PWD} -ltest_compute
$ LD_LIBRARY_PATH=${PWD}:$LD_LIBRARY_PATH OMP_TARGET_OFFLOAD=MANDATORY ./test
Fatal error: Could not run target region on device 0, execution terminated.
Aborted
Moreover, if I add -mp=gpu to the compilation of test.cpp, which used to work before, I now get a weird error:
$ nvc++ -std=c++17 -mp=gpu test.cpp -o test -L${PWD} -ltest_compute
$ LD_LIBRARY_PATH=${PWD}:$LD_LIBRARY_PATH OMP_TARGET_OFFLOAD=MANDATORY ./test
Fatal error: expression 'HX_CU_CALL_CHECK(p_cuStreamSynchronize(stream[dev]))' (value 1) is not equal to expression 'HX_SUCCESS' (value 0)
Aborted
I am unsure if this is an error on our side (we are running driver version 470.57.02, CUDA 11.4, on A100 SXM GPUs) or if this should work. Please find my source code attached.
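The attachment is not reproduced here, but the test was roughly along the following lines. This is a minimal sketch rather than the exact attached code: it assumes the library exposes a single function, compute_sum, that runs an offloaded reduction.

// test_compute.cpp -- hypothetical minimal sketch of the shared library source, built with:
//   nvc++ -g -O3 -std=c++17 -fpic -mp=gpu -shared test_compute.cpp -o libtest_compute.so
#include <vector>

double compute_sum(int n) {
    std::vector<double> a(n, 1.0);
    double* pa = a.data();
    double sum = 0.0;
    // Offload the reduction to the GPU; with OMP_TARGET_OFFLOAD=MANDATORY the
    // run aborts if this target region cannot be executed on the device.
    #pragma omp target teams distribute parallel for map(to: pa[0:n]) map(tofrom: sum) reduction(+: sum)
    for (int i = 0; i < n; ++i)
        sum += pa[i];
    return sum;
}

// test.cpp -- hypothetical driver that only calls into libtest_compute.so
#include <cstdio>

double compute_sum(int n);  // defined in the shared library

int main() {
    std::printf("sum = %f\n", compute_sum(1 << 20));  // expect 1048576.0
    return 0;
}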
Yes, that’s correct when using nvc++ to link. The problem is that without -mp=gpu, the binary's initialization is set to not use the GPU, overriding the GPU initialization in the shared object.
If you link with g++ (and presumably when loading from Python), the shared object's GPU initialization will kick in when the library is loaded.
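For example, the following is a sketch of the g++ link, assuming the NVHPC OpenMP runtime libraries can be found at run time (e.g. via LD_LIBRARY_PATH or an rpath recorded in libtest_compute.so):
$ g++ -std=c++17 test.cpp -o test -L${PWD} -ltest_compute
$ LD_LIBRARY_PATH=${PWD}:$LD_LIBRARY_PATH OMP_TARGET_OFFLOAD=MANDATORY ./test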
> Yes, that’s correct when using nvc++ to link. The problem is that without -mp=gpu, the binary's initialization is set to not use the GPU, overriding the GPU initialization in the shared object.
That makes sense, though it was not obvious to me.
> If you link with g++ (and presumably when loading from Python), the shared object's GPU initialization will kick in when the library is loaded.
I just double-checked and it works well. Thank you a lot 😄.