Hi,
Please consider the following program:
// scan.cpp
#include <iostream>
#include <execution>
#include <numeric>
#include <ranges>
#include <vector>
#include <experimental/mdspan>
using std::experimental::mdspan;
int main()
{
    constexpr int N = 1'000'000;
    // in/out data
    std::vector<double> u(N);
    std::vector<double> v(N);
    // initialise input data
    std::iota(u.begin(), u.end(), 0);
    // working OK
    {
        std::transform_inclusive_scan(
            std::execution::par_unseq,
            u.begin(), u.end(),
            v.begin(),
            std::plus<double>{},
            [](double x) { return x - 1.0; }
        );
        std::cout << v.back() << std::endl;
    }
    // do the same with ranges (crashes)
    {
        // input (transform) range
        auto r = std::views::iota(0, N) | std::views::transform([mu = mdspan{u.data(), N}](int i) { return mu[i] - 1.0; });
        std::inclusive_scan(
            std::execution::par_unseq,
            r.begin(), r.end(),
            v.begin(),
            std::plus<double>{}
        );
        std::cout << v.back() << std::endl;
    }
    return 0;
}
I compile the program like this:
nvc++ -std=c++20 -stdpar -Minfo=stdpar scan.cpp -o scan
The second scan emulates the first by passing a transform range to inclusive_scan; it crashes (segmentation fault) when I compile it with nvc++ 23.9-0 (I’ve also tried older versions).
Initially, I thought the problem was the mdspan, as it’s a host memory view; however, the program still crashes when I replace std::vector with thrust::device_vector:
thrust::device_vector<double> u(N);
// ...
auto r = std::views::iota(0, N) | std::views::transform([mu = mdspan{thrust::raw_pointer_cast(u.data()), N}](int i) { return mu[i] - 1.0; });
std::inclusive_scan(/*as before*/);
I’ve seen the crash on a Linux HPC system with NVIDIA A100 and on WSL2 on a workstation with NVIDIA GeForce RTX 3080 Ti.
Please note the following:
- The program also crashes when the output iterator is a transform view iterator.
- transform_inclusive_scan also crashes when given transform view iterators.
Can you please tell me why that is? I thought C++ ranges could be freely used with stdpar algorithms.
Thanks,
Christos