I’m trying to get a very simple pstl example working with GPU acceleration:
#include <stdio.h>
#include <stdlib.h>
#include <vector>
#include <execution>
#include <algorithm>
int main(void) {
    std::vector<int> data(10000000);
    std::fill_n(std::execution::par_unseq, data.begin(), data.size(), -1);
    puts("Hello World!!!");
    return EXIT_SUCCESS;
}
After looking around, I was able to find the specific set of compilation and linking arguments to get this compiling:
nvc++ -fast -g -Wall -stdpar -c -o nvidia_pstl_test.o nvidia_pstl_test.cpp
nvc++ -o nvidia_pstl_test nvidia_pstl_test.o -cuda -lcudanvhpc101
There are a couple of problems here:
- The resulting binary segfaults.
user@user-linux:~/eclipse-workspace/nvidia_pstl_test$ ./nvidia_pstl_test
Segmentation fault (core dumped)
I’m using a 1080Ti with proprietary NVIDIA drivers on Ubuntu 20.04. I have no idea what’s wrong, but here’s a stacktrace thanks to Eclipse:
user@user-linux:~/eclipse-workspace/nvidia_pstl_test$ nvidia-smi
Wed Mar 17 11:54:53 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.39       Driver Version: 460.39       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:01:00.0 Off |                  N/A |
|  0%   34C    P8    13W / 300W |    205MiB / 11170MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       981      G   /usr/lib/xorg/Xorg                 60MiB |
|    0   N/A  N/A      1510      G   /usr/lib/xorg/Xorg                125MiB |
|    0   N/A  N/A      1639      G   /usr/bin/gnome-shell                9MiB |
+-----------------------------------------------------------------------------+
- I’m not sure if I missed it, but nowhere in the stdpar docs is there any mention of the -cuda flag or of having to link against cudanvhpc101. I spent a lot of time googling and grepping just to get to the point where the linker no longer reports an error.
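For the segfault in the first item, one way to narrow things down is to run the same fill without an execution policy and check the result. The following is only a minimal sketch: the helper name fill_and_check is hypothetical and not part of the original program, and it deliberately leaves out <execution>, so it says nothing about the GPU path itself.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not in the original program): fills a vector
// sequentially, with no execution policy, and reports whether every
// element ended up equal to `value`.
bool fill_and_check(std::size_t count, int value) {
    std::vector<int> data(count);
    // Same call as in the original, minus std::execution::par_unseq.
    std::fill_n(data.begin(), data.size(), value);
    return std::all_of(data.begin(), data.end(),
                       [value](int x) { return x == value; });
}
```

Calling fill_and_check(10000000, -1) from main mirrors the size and value used above. If that version runs cleanly with the same build, the crash is more likely tied to the parallel algorithm or to how the binary is linked than to the vector allocation; that is an inference, not something confirmed here.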