std::transform_reduce() overloads without std::execution are missing

t3840 · December 21, 2019, 8:26am

In PGI 19.10 Community Edition, some overloads of std::transform_reduce() are missing, even though PGI claims that C++17 is fully supported.
More specifically, overloads (1), (2), and (3) listed on the following page are missing:
https://en.cppreference.com/w/cpp/algorithm/transform_reduce

The program I compiled is:

#include <iostream>
#include <vector>
#include <numeric>

int main()
{
  const std::vector<int> v1 = {1, 2, 3, 4, 5};
  const std::vector<int> v2 = {2, 3, 4, 5, 6};

  // (1) : reduce two lists
  // sum1 = 1*2 + 2*3 + 3*4 + 4*5 + 5*6
  int sum1 = std::transform_reduce(v1.begin(), v1.end(), v2.begin(), 0);
  std::cout << "sum1 : " << sum1 << std::endl;

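  // (2) : reduce two lists.
  // Specify the binary operation that reduces the list and the binary operation that multiplies the elements of the two lists
  int sum2 = std::transform_reduce(v1.begin(), v1.end(), v2.begin(), 0,
                                  [](int a, int b) { return a + b; },  // reduction function
                                  [](int a, int b) { return a * b; }); // function that combines two elements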
  std::cout << "sum2 : " << sum2 << std::endl;

  // (3) : reduce the list while transforming each element
  // 1*2 + 2*2 + 3*2 + 4*2 + 5*2
  int sum3 = std::transform_reduce(v1.begin(), v1.end(), 0,
                                   [](int acc, int i) { return acc + i; }, // reduction function
                                   [](int x) { return x * 2; });           // transform function
  std::cout << "sum3 : " << sum3 << std::endl;
}

The PGI compiler then reports the following error:

"./cpprefjp_transform_reduce.cpp", line 14: error: no instance of overloaded
          function "std::transform_reduce" matches the argument list
            argument types are: (__gnu_cxx::__normal_iterator<const int *,
                      std::vector<int, std::allocator<int>>>,
                      __gnu_cxx::__normal_iterator<const int *,
                      std::vector<int, std::allocator<int>>>,
                      __gnu_cxx::__normal_iterator<const int *,
                      std::vector<int, std::allocator<int>>>, int)
        int sum1 = std::transform_reduce(v1.begin(), v1.end(), v2.begin(), 0);

Note that when I pass an execution policy (std::execution::seq or par) as the first argument, the functions work as I expected.

Best,
Miya

MatColgrove · December 27, 2019, 9:52pm

Hi Miya,

Not unexpected. In order to gain object compatibility with g++, we need to use their STL. Given that g++ 9.2 doesn’t support this in their STL, we unfortunately don’t either. The easy workaround is to pass a seq execution policy as the first argument. For example:

% cat transform_reduce.cpp
#include <iostream>
#include <vector>
#include <numeric>
#include <execution>

int main()
{
  const std::vector<int> v1 = {1, 2, 3, 4, 5};
  const std::vector<int> v2 = {2, 3, 4, 5, 6};

  int sum1 = std::transform_reduce(std::execution::seq, v1.begin(), v1.end(), v2.begin(), 0);
  std::cout << "sum1 : " << sum1 << std::endl;

  int sum2 = std::transform_reduce(std::execution::seq, v1.begin(), v1.end(), v2.begin(), 0,
                                  [](int a, int b) { return a + b; },
                                  [](int a, int b) { return a * b; });
  std::cout << "sum2 : " << sum2 << std::endl;

  int sum3 = std::transform_reduce(std::execution::seq, v1.begin(), v1.end(), 0,
                                   [](int acc, int i) { return acc + i; },
                                   [](int x) { return x * 2; });
  std::cout << "sum3 : " << sum3 << std::endl;
}
% pgc++ -std=c++17 transform_reduce.cpp; a.out
sum1 : 70
sum2 : 70
sum3 : 30
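
If passing an execution policy is not an option either, the same three sums can also be computed with the older <numeric> algorithms. This is only a minimal sketch, assuming strictly sequential evaluation is acceptable; the helper names are hypothetical and not part of any library.

```cpp
#include <numeric>
#include <vector>

// Hypothetical helpers (names are illustrative only).

// Same value as overload (1): sum of element-wise products.
int dot_product(const std::vector<int>& a, const std::vector<int>& b) {
    return std::inner_product(a.begin(), a.end(), b.begin(), 0);
}

// Same value as overload (2): explicit reduce and combine operations.
int dot_product_custom(const std::vector<int>& a, const std::vector<int>& b) {
    return std::inner_product(a.begin(), a.end(), b.begin(), 0,
                              [](int x, int y) { return x + y; },   // reduce
                              [](int x, int y) { return x * y; });  // combine
}

// Same value as overload (3): transform each element, then sum.
int sum_of_doubled(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0,
                           [](int acc, int x) { return acc + x * 2; });
}
```

With v1 and v2 from the example above, dot_product(v1, v2) and dot_product_custom(v1, v2) give 70 and sum_of_doubled(v1) gives 30. Unlike std::transform_reduce, these algorithms always process the elements in order, so they cannot be parallelized later by swapping in a policy.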

Hope this helps,
Mat

zvyagin · December 28, 2019, 12:11pm

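I don’t see any difference between execution policies with the PGI compiler. Below is the output of a slightly modified version of the example from this thread, followed by the code. Each row is one policy (seq, par, par_unseq, in that order); the columns are times in microseconds for the initialization and for each of the three transform_reduce calls.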

255566 513997 462758 466801
247111 513928 462873 467035
252264 518074 462592 466330

#include <iostream>
#include <vector>
#include <numeric>
#include <execution>
#include <chrono>

template<typename Policy>
float run (size_t N,Policy policy) {

    auto t0 = std::chrono::high_resolution_clock::now();
    std::vector<float> v1(N), v2(N);
    for( unsigned i=0; i<N; i++ )
        v1[i] = v2[i] = i;
    
    auto t1 = std::chrono::high_resolution_clock::now();
    auto sum1 = std::transform_reduce(policy, v1.begin(), v1.end(), v2.begin(), 0);

    auto t2 = std::chrono::high_resolution_clock::now();
    auto sum2 = std::transform_reduce(policy, v1.begin(), v1.end(), v2.begin(), 0,
                                  [](auto a, auto b) { return a + b; },
                                  [](auto a, auto b) { return a * b; });

    auto t3 = std::chrono::high_resolution_clock::now();
    auto sum3 = std::transform_reduce(policy, v1.begin(), v1.end(), 0,
                                   [](auto acc, auto i) { return acc + i; },
                                   [](auto x) { return x * 2; });

    auto t4 = std::chrono::high_resolution_clock::now();

    auto d = [] (auto a,auto b) {return std::chrono::duration_cast<std::chrono::microseconds>(b-a).count();};
    std::cout << d(t0,t1) << " " << d(t1,t2) << " " << d(t2,t3) << " " << d(t3,t4) << "\n";
    return sum1+sum2+sum3;
}

int main(void) {
    constexpr unsigned N = 1024*1024*100;
    run(N,std::execution::seq);
    run(N,std::execution::par);
    run(N,std::execution::par_unseq);
    return 0;
}
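
One thing to watch in a benchmark like this: std::transform_reduce returns the type of its init argument, so passing the literal 0 makes the result an int even when the elements are float. The sketch below illustrates the same rule with std::inner_product, which follows the same convention and is used here only so the example builds on toolchains that lack the non-policy overloads. The function names are hypothetical.

```cpp
#include <numeric>
#include <vector>

// Hypothetical helpers (names are illustrative only).

// init is the int literal 0, so the accumulator is an int and every
// partial sum is converted back to int after each step.
int dot_int_init(const std::vector<float>& a, const std::vector<float>& b) {
    return std::inner_product(a.begin(), a.end(), b.begin(), 0);
}

// init is 0.0f, so the accumulation stays in float.
float dot_float_init(const std::vector<float>& a, const std::vector<float>& b) {
    return std::inner_product(a.begin(), a.end(), b.begin(), 0.0f);
}
```

For a = {0.5f, 0.5f} and b = {1.0f, 1.0f}, dot_int_init returns 0 while dot_float_init returns 1.0f. In the benchmark above, writing 0.0f instead of 0 would keep the reduction in float.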

MatColgrove · December 30, 2019, 4:49pm

Correct, the parallel execution model is still in development and will be released as a beta feature early next year, with the production release later. It will also include implicit offload to the GPU. Multicore CPU will use TBB (like g++ does).

For a preview of the GPU enabled C++ standard language parallelism, please see David Olsen’s talk from SC19:

-Mat

t3840 · January 6, 2020, 8:11am

Hi mkcolg,

Thank you for your kind reply and explanations :)
I also found the same thing on g++ as you have mentioned.

Although the cause lies in g++, I hope that PGI will eventually support std::transform_reduce() correctly.

Best,
Miya
