I tried to use thrust::transform_output_iterator in a simple case:
#include <thrust/functional.h>
#include <thrust/transform.h>
#include <thrust/copy.h>
#include <thrust/iterator/transform_output_iterator.h>
#include <thrust/system/cuda/vector.h>
#include <thrust/device_vector.h>

#include <cstdio>

template< typename T >
#if 1
using container = thrust::device_vector< T >;
#else
using container = thrust::cuda::vector< T >;
#endif

int main()
{
    // size constructor: parentheses, not braces, so that the
    // initializer_list constructors added in newer Thrust versions
    // are not selected by mistake
    container< int > input(5);
    input[0] = 1;
    input[1] = 2;
    input[2] = 3;
    input[3] = 4;
    input[4] = 5;
    container< int > stencil(input.size());
    stencil[0] = 0;
    stencil[1] = 1;
    stencil[2] = 0;
    stencil[3] = 1;
    stencil[4] = 0;
    // __host__ __device__ lambdas require nvcc --expt-extended-lambda
    auto f = [] __host__ __device__ (int i) -> int
    {
        printf("%i\n", i);
        return i * 100;
    };
    container< int > output(input.size());
    auto output_begin = thrust::make_transform_output_iterator(output.begin(), f);
    using namespace thrust::placeholders;
    thrust::copy_if(input.cbegin(), input.cend(), stencil.cbegin(), output_begin, _1 == 0);
}
but it does not compile; the build fails with this error:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\include\thrust/system/cuda/detail/copy_if.h(836): error : function "thrust::transform_output_iterator<UnaryFunction, OutputIterator>::operator=(const thrust::transform_output_iterator<lambda [](int)->int, thrust::detail::normal_iterator<thrust::cuda_cub::pointer<int>>> &) [with UnaryFunction=lambda [](int)->int, OutputIterator=thrust::detail::normal_iterator<thrust::cuda_cub::pointer<int>>]" (declared implicitly) cannot be referenced -- it is a deleted function
Is there something wrong with my code? The compiler is VS2017 with CUDA v10.1.
If you use device_vector instead of cuda::vector and switch to a functor instead of a lambda, your code compiles cleanly. The deleted function in the error message is the iterator's copy-assignment operator: a lambda's closure type has a deleted operator=, so a transform_output_iterator storing the lambda cannot be assigned, and the CUDA copy_if implementation needs to assign it. A plain functor is copy-assignable, which avoids the problem.
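A minimal sketch of the functor route, keeping the rest of the posted code unchanged (the name scale_by_100 is mine, not from your code):

// same body as the lambda, but as a class type with a usable
// copy-assignment operator
struct scale_by_100
{
    __host__ __device__ int operator()(int i) const
    {
        printf("%i\n", i);
        return i * 100;
    }
};

// inside main(), replace the lambda with the functor:
auto output_begin = thrust::make_transform_output_iterator(output.begin(), scale_by_100{});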
I find it annoying to have to hack together whatever include files are needed, and build an application around what you have shown.
My suggestion would be to provide a complete application that others can copy, paste, compile, and run, without having to add anything or change anything.
You’re welcome to do as you wish, of course. I suspect that by not providing code that makes it easy for others to get started working with what you have shown, you are making it more difficult to get help. Just my opinion.
I’m not sure why you are using thrust::cuda::vector. Direct system access shouldn’t generally be needed in Thrust; the vast majority of Thrust code out there uses things like thrust::host_vector and thrust::device_vector.
The code in the original message has been updated. A git repository containing the code to reproduce the issue is here: https://github.com/tomilov/mcve/blob/446d6c3e4bf05d28a4f3721398d19495a72f688f/thrust2/main.cu.
I specifically need thrust::cuda::vector because I want to benchmark different backends for an implementation of an algorithm: https://gist.github.com/tomilov/2738dc6c1a1afb7ebd2d9892b8a08321#file-sah_kd_tree-cu-L26-L40.
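For reference, a sketch of how that kind of compile-time backend switch can look, assuming the OpenMP system is also available in the build (the macro name is hypothetical):

#include <thrust/system/cuda/vector.h>
#include <thrust/system/omp/vector.h>

// pick a system-specific container at compile time so the same
// algorithm code can be timed against different backends
#if defined(BENCHMARK_CUDA_BACKEND)
template< typename T >
using container = thrust::cuda::vector< T >;
#else
template< typename T >
using container = thrust::omp::vector< T >;
#endif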
That is not an option: the body of operator() depends on types deduced in local scope. If I define the functor in local scope, I get an error:
error : A type defined inside a __host__ function ("to_splitted_event_pair") cannot be used in the template argument type of a __global__ function template instantiation
Technically it is possible to move the functor definition to an outer scope and parametrize it with plenty of template parameters, but that requires far too much boilerplate; a lambda is much terser.
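To illustrate the boilerplate, a hypothetical sketch (not the actual sah_kd_tree code; all type names, members, and the body are invented):

#include <thrust/pair.h>

// every locally deduced type becomes a template parameter, and every
// lambda capture becomes an explicitly declared member
template< typename Event, typename Position >
struct to_splitted_event_pair
{
    Position split_position; // was a lambda capture

    __host__ __device__
    thrust::pair< Event, Event > operator()(Event event) const
    {
        return thrust::make_pair(event, event); // placeholder body
    }
};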
Is there a theoretical cause for the inconsistency between transform_iterator and transform_output_iterator? The functor parameter is named AdaptableUnaryFunction for the first and UnaryFunction for the second; is there some concept behind the names?