I further modified my example from "Lexicographic comparison of std::tuple on GPU fails with C++20" to use 3D coordinates. To avoid the previous problem, I now use C++17, with which the previous example worked on the GPU without a problem. Now, when executing, I get:
terminate called after throwing an instance of 'thrust::system::system_error'
what(): merge_sort: failed to synchronize: cudaErrorMisalignedAddress: misaligned address
I’m on “nvc++ 22.3-0 64-bit target on x86-64 Linux” with “gcc (Ubuntu 11.2.0-19ubuntu1) 11.2.0” and a GTX 1070. I compile with, e.g.:
nvc++ -O3 -std=c++17 -stdpar=gpu -gpu=cc61,cuda11.6
weld_vertices3D.cpp (3.4 KB)
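For readers without the attachment, here is a minimal sketch of the kind of operation involved. It is not the attached file: it assumes the vertices are stored as std::tuple<float, float, float>, sorted lexicographically, and then deduplicated. The names Vertex and weld_vertices are hypothetical, chosen only for illustration.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the attached program: a 3D vertex as a tuple,
// compared lexicographically by std::tuple's built-in operator<.
using Vertex = std::tuple<float, float, float>;

// Sort the vertices and drop exact duplicates ("welding").
// Assumption: with nvc++ -stdpar=gpu one would pass std::execution::par
// (from <execution>) as the first argument of std::sort and std::unique
// to offload them. The sequential overloads are used here so the sketch
// builds with any C++17 compiler.
std::vector<Vertex> weld_vertices(std::vector<Vertex> vertices) {
    std::sort(vertices.begin(), vertices.end());
    vertices.erase(std::unique(vertices.begin(), vertices.end()),
                   vertices.end());
    return vertices;
}
```

Calling weld_vertices on a list with repeated coordinates returns the unique vertices in lexicographic order. The merge_sort in the error message above suggests the failure occurs in the sort step when it runs on the GPU.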