I was able to build your code successfully as a Release x64 project on Windows with CUDA 7.5, targeting cc 5.2, with relocatable device code. Here is my full VS console output:
1>------ Rebuild All started: Project: t17, Configuration: Release x64 ------
1>
1> c:\Users\bob-tosh\documents\visual studio 2013\Projects\t17\t17>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\bin\nvcc.exe" -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include" --keep-dir x64\Release -maxrregcount=0 --machine 64 --compile -DWIN32 -DWIN64 -DNDEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /O2 /Zi /MD " -o x64\Release\kernel.cu.obj "c:\Users\bob-tosh\documents\visual studio 2013\Projects\t17\t17\kernel.cu" -clean
1> kernel.cu
1> Compiling CUDA source file kernel.cu...
1>
1> c:\Users\bob-tosh\documents\visual studio 2013\Projects\t17\t17>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\bin\nvcc.exe" -gencode=arch=compute_52,code=\"sm_52,compute_52\" --use-local-env --cl-version 2013 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include" --keep-dir x64\Release -maxrregcount=0 --machine 64 --compile -cudart static -DWIN32 -DWIN64 -DNDEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /O2 /Zi /MD " -o x64\Release\kernel.cu.obj "c:\Users\bob-tosh\documents\visual studio 2013\Projects\t17\t17\kernel.cu"
1> kernel.cu
1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/execution_policy.hpp(241): warning C4267: 'argument' : conversion from 'size_t' to 'thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::agent<0x01>,0x00>::size_type', possible loss of data
1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/execution_policy.hpp(303): warning C4267: 'argument' : conversion from 'size_t' to 'thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::agent<0x01>,0x00>::size_type', possible loss of data
1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/execution_policy.hpp(319): warning C4267: 'argument' : conversion from 'size_t' to 'thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::agent<0x01>,0x00>::size_type', possible loss of data
1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/execution_policy.hpp(439): warning C4267: 'argument' : conversion from 'size_t' to 'thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x01>,0x00>::size_type', possible loss of data
1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/execution_policy.hpp(250): warning C4267: 'argument' : conversion from 'size_t' to 'thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x01>,0x00>,0x00>::size_type', possible loss of data
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/execution_policy.hpp(625) : see reference to function template instantiation 'thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x01>,0x00>,0x00> thrust::system::cuda::detail::bulk_::par<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x01>,0x00>>(ExecutionAgent,size_t)' being compiled
1> with
1> [
1> ExecutionAgent=thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x01>,0x00>
1> ]
1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/detail/cuda_launcher/cuda_launcher.hpp(155): warning C4267: 'return' : conversion from 'size_t' to 'int', possible loss of data
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/detail/cuda_launcher/cuda_launcher.hpp(148) : while compiling class template member function 'int thrust::system::cuda::detail::bulk_::detail::cuda_launcher_base<0,thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x01>,0x00>,0x00>,Closure>::choose_group_size(int)'
1> with
1> [
1> Closure=thrust::system::cuda::detail::bulk_::detail::closure<thrust::system::cuda::detail::for_each_n_detail::for_each_kernel,thrust::tuple<thrust::system::cuda::detail::bulk_::detail::cursor<0>,thrust::device_ptr<unsigned __int64>,thrust::detail::wrapped_function<thrust::detail::device_generate_functor<thrust::detail::fill_functor<unsigned __int64>>,void>,unsigned int,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type>>
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/detail/cuda_launcher/cuda_launcher.hpp(305) : see reference to function template instantiation 'int thrust::system::cuda::detail::bulk_::detail::cuda_launcher_base<0,thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x01>,0x00>,0x00>,Closure>::choose_group_size(int)' being compiled
1> with
1> [
1> Closure=thrust::system::cuda::detail::bulk_::detail::closure<thrust::system::cuda::detail::for_each_n_detail::for_each_kernel,thrust::tuple<thrust::system::cuda::detail::bulk_::detail::cursor<0>,thrust::device_ptr<unsigned __int64>,thrust::detail::wrapped_function<thrust::detail::device_generate_functor<thrust::detail::fill_functor<unsigned __int64>>,void>,unsigned int,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type>>
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/detail/cuda_launcher/cuda_launcher.hpp(228) : see reference to class template instantiation 'thrust::system::cuda::detail::bulk_::detail::cuda_launcher_base<0,thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x01>,0x00>,0x00>,Closure>' being compiled
1> with
1> [
1> Closure=thrust::system::cuda::detail::bulk_::detail::closure<thrust::system::cuda::detail::for_each_n_detail::for_each_kernel,thrust::tuple<thrust::system::cuda::detail::bulk_::detail::cursor<0>,thrust::device_ptr<unsigned __int64>,thrust::detail::wrapped_function<thrust::detail::device_generate_functor<thrust::detail::fill_functor<unsigned __int64>>,void>,unsigned int,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type>>
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/detail/choose_sizes.inl(41) : see reference to class template instantiation 'thrust::system::cuda::detail::bulk_::detail::cuda_launcher<thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x01>,0x00>,0x00>,Closure>' being compiled
1> with
1> [
1> Closure=thrust::system::cuda::detail::bulk_::detail::closure<thrust::system::cuda::detail::for_each_n_detail::for_each_kernel,thrust::tuple<thrust::system::cuda::detail::bulk_::detail::cursor<0>,thrust::device_ptr<unsigned __int64>,thrust::detail::wrapped_function<thrust::detail::device_generate_functor<thrust::detail::fill_functor<unsigned __int64>>,void>,unsigned int,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type>>
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/detail/choose_sizes.inl(96) : see reference to function template instantiation 'thrust::pair<int,int> thrust::system::cuda::detail::bulk_::detail::choose_sizes<thrust::system::cuda::detail::bulk_::detail::closure<thrust::system::cuda::detail::for_each_n_detail::for_each_kernel,thrust::tuple<thrust::system::cuda::detail::bulk_::detail::cursor<0>,thrust::device_ptr<T>,thrust::detail::wrapped_function<thrust::detail::device_generate_functor<thrust::detail::fill_functor<T>>,void>,unsigned int,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type>>>(thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x01>,0x00>,0x00>,Closure)' being compiled
1> with
1> [
1> T=unsigned __int64
1> , Closure=thrust::system::cuda::detail::bulk_::detail::closure<thrust::system::cuda::detail::for_each_n_detail::for_each_kernel,thrust::tuple<thrust::system::cuda::detail::bulk_::detail::cursor<0>,thrust::device_ptr<unsigned __int64>,thrust::detail::wrapped_function<thrust::detail::device_generate_functor<thrust::detail::fill_functor<unsigned __int64>>,void>,unsigned int,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type>>
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/for_each.inl(125) : see reference to function template instantiation 'thrust::pair<int,int> thrust::system::cuda::detail::bulk_::choose_sizes<thrust::system::cuda::detail::for_each_n_detail::for_each_kernel,thrust::system::cuda::detail::bulk_::detail::cursor<0>,RandomAccessIterator,thrust::detail::wrapped_function<thrust::detail::device_generate_functor<thrust::detail::fill_functor<unsigned __int64>>,void>,unsigned int>(thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x01>,0x00>,0x00>,Function,Arg1,Arg2,Arg3,Arg4)' being compiled
1> with
1> [
1> RandomAccessIterator=thrust::device_ptr<unsigned __int64>
1> , Function=thrust::system::cuda::detail::for_each_n_detail::for_each_kernel
1> , Arg1=thrust::system::cuda::detail::bulk_::detail::cursor<0>
1> , Arg2=thrust::device_ptr<unsigned __int64>
1> , Arg3=thrust::detail::wrapped_function<thrust::detail::device_generate_functor<thrust::detail::fill_functor<unsigned __int64>>,void>
1> , Arg4=unsigned int
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/detail/allocator/allocator_traits.inl(249) : while compiling class template member function 'void thrust::detail::allocator_traits<Alloc>::deallocate(thrust::detail::no_throw_allocator<thrust::detail::temporary_allocator<T,System>> &,thrust::pointer<unsigned __int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>,unsigned __int64)'
1> with
1> [
1> Alloc=thrust::detail::no_throw_allocator<thrust::detail::temporary_allocator<unsigned __int64,thrust::system::cuda::detail::tag>>
1> , T=unsigned __int64
1> , System=thrust::system::cuda::detail::tag
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/detail/contiguous_storage.inl(172) : see reference to function template instantiation 'void thrust::detail::allocator_traits<Alloc>::deallocate(thrust::detail::no_throw_allocator<thrust::detail::temporary_allocator<T,System>> &,thrust::pointer<unsigned __int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>,unsigned __int64)' being compiled
1> with
1> [
1> Alloc=thrust::detail::no_throw_allocator<thrust::detail::temporary_allocator<unsigned __int64,thrust::system::cuda::detail::tag>>
1> , T=unsigned __int64
1> , System=thrust::system::cuda::detail::tag
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/detail/contiguous_storage.inl(169) : while compiling class template member function 'void thrust::detail::contiguous_storage<T,thrust::detail::no_throw_allocator<thrust::detail::temporary_allocator<T,System>>>::deallocate(void)'
1> with
1> [
1> T=unsigned __int64
1> , System=thrust::system::cuda::detail::tag
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/detail/contiguous_storage.inl(64) : see reference to function template instantiation 'void thrust::detail::contiguous_storage<T,thrust::detail::no_throw_allocator<thrust::detail::temporary_allocator<T,System>>>::deallocate(void)' being compiled
1> with
1> [
1> T=unsigned __int64
1> , System=thrust::system::cuda::detail::tag
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/detail/contiguous_storage.inl(38) : while compiling class template member function 'thrust::detail::contiguous_storage<T,thrust::detail::no_throw_allocator<thrust::detail::temporary_allocator<T,System>>>::contiguous_storage(const thrust::detail::no_throw_allocator<thrust::detail::temporary_allocator<T,System>> &)'
1> with
1> [
1> T=unsigned __int64
1> , System=thrust::system::cuda::detail::tag
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/detail/temporary_array.inl(131) : see reference to function template instantiation 'thrust::detail::contiguous_storage<T,thrust::detail::no_throw_allocator<thrust::detail::temporary_allocator<T,System>>>::contiguous_storage(const thrust::detail::no_throw_allocator<thrust::detail::temporary_allocator<T,System>> &)' being compiled
1> with
1> [
1> T=unsigned __int64
1> , System=thrust::system::cuda::detail::tag
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/detail/contiguous_storage.inl(90) : while compiling class template member function 'thrust::detail::normal_iterator<thrust::pointer<unsigned __int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>> thrust::detail::contiguous_storage<T,thrust::detail::no_throw_allocator<thrust::detail::temporary_allocator<T,System>>>::begin(void)'
1> with
1> [
1> T=unsigned __int64
1> , System=thrust::system::cuda::detail::tag
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/detail/generic/remove.inl(89) : see reference to function template instantiation 'thrust::detail::normal_iterator<thrust::pointer<unsigned __int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>> thrust::detail::contiguous_storage<T,thrust::detail::no_throw_allocator<thrust::detail::temporary_allocator<T,System>>>::begin(void)' being compiled
1> with
1> [
1> T=unsigned __int64
1> , System=thrust::system::cuda::detail::tag
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/detail/temporary_array.h(38) : see reference to class template instantiation 'thrust::detail::contiguous_storage<T,thrust::detail::no_throw_allocator<thrust::detail::temporary_allocator<T,System>>>' being compiled
1> with
1> [
1> T=unsigned __int64
1> , System=thrust::system::cuda::detail::tag
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/detail/generic/remove.inl(86) : see reference to class template instantiation 'thrust::detail::temporary_array<unsigned __int64,DerivedPolicy>' being compiled
1> with
1> [
1> DerivedPolicy=thrust::system::cuda::detail::tag
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/detail/remove.inl(76) : see reference to function template instantiation 'ForwardIterator thrust::system::detail::generic::remove_if<thrust::system::cuda::detail::tag,ForwardIterator,Predicate>(thrust::execution_policy<thrust::system::cuda::detail::tag> &,ForwardIterator,ForwardIterator,Predicate)' being compiled
1> with
1> [
1> ForwardIterator=thrust::device_ptr<unsigned __int64>
1> , Predicate=is2to63
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/detail/remove.inl(181) : see reference to function template instantiation 'ForwardIterator thrust::remove_if<DerivedPolicy,ForwardIterator,Predicate>(const thrust::detail::execution_policy_base<DerivedPolicy> &,ForwardIterator,ForwardIterator,Predicate)' being compiled
1> with
1> [
1> ForwardIterator=thrust::device_ptr<unsigned __int64>
1> , DerivedPolicy=thrust::system::cuda::detail::tag
1> , Predicate=is2to63
1> ]
1> c:/Users/bob-tosh/documents/visual studio 2013/Projects/t17/t17/kernel.cu(26) : see reference to function template instantiation 'ForwardIterator thrust::remove_if<thrust::device_ptr<T>,is2to63>(ForwardIterator,ForwardIterator,Predicate)' being compiled
1> with
1> [
1> ForwardIterator=thrust::device_ptr<unsigned __int64>
1> , T=unsigned __int64
1> , Predicate=is2to63
1> ]
1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/detail/cuda_launcher/cuda_launcher.hpp(85): warning C4267: 'return' : conversion from 'size_t' to 'int', possible loss of data
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/detail/cuda_launcher/cuda_launcher.hpp(84) : while compiling class template member function 'int thrust::system::cuda::detail::bulk_::detail::cuda_launcher_base<0,thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x01>,0x00>,0x00>,Closure>::max_active_blocks_per_multiprocessor(const thrust::system::cuda::detail::bulk_::detail::device_properties_t &,const thrust::system::cuda::detail::bulk_::detail::function_attributes_t &,int,int)'
1> with
1> [
1> Closure=thrust::system::cuda::detail::bulk_::detail::closure<thrust::system::cuda::detail::for_each_n_detail::for_each_kernel,thrust::tuple<thrust::system::cuda::detail::bulk_::detail::cursor<0>,thrust::device_ptr<unsigned __int64>,thrust::detail::wrapped_function<thrust::detail::device_generate_functor<thrust::detail::fill_functor<unsigned __int64>>,void>,unsigned int,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type>>
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/detail/cuda_launcher/cuda_launcher.hpp(96) : see reference to function template instantiation 'int thrust::system::cuda::detail::bulk_::detail::cuda_launcher_base<0,thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x01>,0x00>,0x00>,Closure>::max_active_blocks_per_multiprocessor(const thrust::system::cuda::detail::bulk_::detail::device_properties_t &,const thrust::system::cuda::detail::bulk_::detail::function_attributes_t &,int,int)' being compiled
1> with
1> [
1> Closure=thrust::system::cuda::detail::bulk_::detail::closure<thrust::system::cuda::detail::for_each_n_detail::for_each_kernel,thrust::tuple<thrust::system::cuda::detail::bulk_::detail::cursor<0>,thrust::device_ptr<unsigned __int64>,thrust::detail::wrapped_function<thrust::detail::device_generate_functor<thrust::detail::fill_functor<unsigned __int64>>,void>,unsigned int,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type>>
1> ]
1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/detail/pair.inl(46): warning C4267: 'initializing' : conversion from 'size_t' to 'int', possible loss of data
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/detail/cuda_launcher/cuda_launcher.hpp(101) : see reference to function template instantiation 'thrust::pair<int,int>::pair<size_t,int>(const thrust::pair<size_t,int> &)' being compiled
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/detail/cuda_launcher/cuda_launcher.hpp(101) : see reference to function template instantiation 'thrust::pair<int,int>::pair<size_t,int>(const thrust::pair<size_t,int> &)' being compiled
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/detail/cuda_launcher/cuda_launcher.hpp(94) : while compiling class template member function 'thrust::pair<int,int> thrust::system::cuda::detail::bulk_::detail::cuda_launcher_base<0,thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x01>,0x00>,0x00>,Closure>::dynamic_smem_occupancy_limit(const thrust::system::cuda::detail::bulk_::detail::device_properties_t &,const thrust::system::cuda::detail::bulk_::detail::function_attributes_t &,int,int)'
1> with
1> [
1> Closure=thrust::system::cuda::detail::bulk_::detail::closure<thrust::system::cuda::detail::for_each_n_detail::for_each_kernel,thrust::tuple<thrust::system::cuda::detail::bulk_::detail::cursor<0>,thrust::device_ptr<unsigned __int64>,thrust::detail::wrapped_function<thrust::detail::device_generate_functor<thrust::detail::fill_functor<unsigned __int64>>,void>,unsigned int,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type>>
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/detail/cuda_launcher/cuda_launcher.hpp(119) : see reference to function template instantiation 'thrust::pair<int,int> thrust::system::cuda::detail::bulk_::detail::cuda_launcher_base<0,thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x01>,0x00>,0x00>,Closure>::dynamic_smem_occupancy_limit(const thrust::system::cuda::detail::bulk_::detail::device_properties_t &,const thrust::system::cuda::detail::bulk_::detail::function_attributes_t &,int,int)' being compiled
1> with
1> [
1> Closure=thrust::system::cuda::detail::bulk_::detail::closure<thrust::system::cuda::detail::for_each_n_detail::for_each_kernel,thrust::tuple<thrust::system::cuda::detail::bulk_::detail::cursor<0>,thrust::device_ptr<unsigned __int64>,thrust::detail::wrapped_function<thrust::detail::device_generate_functor<thrust::detail::fill_functor<unsigned __int64>>,void>,unsigned int,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type>>
1> ]
1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/execution_policy.hpp(458): warning C4267: 'argument' : conversion from 'size_t' to 'thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x03>,0x0200>::size_type', possible loss of data
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/execution_policy.hpp(667) : see reference to function template instantiation 'thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x03>,0x0200> thrust::system::cuda::detail::bulk_::con<0x0200,0x03>(size_t)' being compiled
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/scan.inl(235) : see reference to function template instantiation 'thrust::system::cuda::detail::bulk_::async_launch<thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x03>,0x0200>,0x00>> thrust::system::cuda::detail::bulk_::grid<0x0200,3>(size_t,size_t,cudaStream_t)' being compiled
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/scan.inl(406) : see reference to function template instantiation 'OutputIterator thrust::system::cuda::detail::scan_detail::inclusive_scan<thrust::system::cuda::detail::tag,InputIterator,OutputIterator,AssociativeOperator>(thrust::system::cuda::detail::execution_policy<thrust::system::cuda::detail::tag> &,InputIterator,InputIterator,OutputIterator,AssociativeOperator)' being compiled
1> with
1> [
1> OutputIterator=thrust::detail::normal_iterator<thrust::pointer<__int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>>
1> , InputIterator=thrust::detail::normal_iterator<thrust::pointer<__int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>>
1> , AssociativeOperator=thrust::plus<__int64>
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/iterator/iterator_facade.h(309) : while compiling class template member function 'thrust::reference<Element,thrust::pointer<Element,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>,thrust::use_default> thrust::iterator_facade<Derived,__int64,thrust::system::cuda::detail::tag,thrust::random_access_traversal_tag,thrust::reference<Element,thrust::pointer<Element,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>,thrust::use_default>,__int64>::operator *(void) const'
1> with
1> [
1> Element=__int64
1> , Derived=thrust::detail::normal_iterator<thrust::pointer<__int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>>
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/iterator/iterator_facade.h(328) : see reference to function template instantiation 'thrust::reference<Element,thrust::pointer<Element,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>,thrust::use_default> thrust::iterator_facade<Derived,__int64,thrust::system::cuda::detail::tag,thrust::random_access_traversal_tag,thrust::reference<Element,thrust::pointer<Element,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>,thrust::use_default>,__int64>::operator *(void) const' being compiled
1> with
1> [
1> Element=__int64
1> , Derived=thrust::detail::normal_iterator<thrust::pointer<__int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>>
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/iterator/iterator_adaptor.h(121) : see reference to class template instantiation 'thrust::iterator_facade<Derived,__int64,thrust::system::cuda::detail::tag,thrust::random_access_traversal_tag,thrust::reference<Element,thrust::pointer<Element,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>,thrust::use_default>,__int64>' being compiled
1> with
1> [
1> Derived=thrust::detail::normal_iterator<thrust::pointer<__int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>>
1> , Element=__int64
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/iterator/detail/normal_iterator.h(36) : see reference to class template instantiation 'thrust::iterator_adaptor<thrust::detail::normal_iterator<thrust::pointer<__int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>>,Pointer,thrust::use_default,thrust::use_default,thrust::use_default,thrust::use_default,thrust::use_default>' being compiled
1> with
1> [
1> Pointer=thrust::pointer<__int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/detail/tuple.inl(256) : see reference to class template instantiation 'thrust::detail::normal_iterator<thrust::pointer<__int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>>' being compiled
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/detail/tuple.inl(257) : see reference to class template instantiation 'thrust::detail::cons<T0,thrust::detail::cons<__int64,thrust::detail::cons<T0,thrust::detail::cons<thrust::plus<__int64>,thrust::detail::map_tuple_to_cons<thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type>::type>>>>' being compiled
1> with
1> [
1> T0=thrust::detail::normal_iterator<thrust::pointer<__int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>>
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/tuple.h(197) : see reference to class template instantiation 'thrust::detail::cons<T0,thrust::detail::cons<thrust::detail::normal_iterator<thrust::pointer<__int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>>,thrust::detail::cons<__int64,thrust::detail::cons<thrust::detail::normal_iterator<thrust::pointer<__int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>>,thrust::detail::cons<thrust::plus<__int64>,thrust::detail::map_tuple_to_cons<thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type>::type>>>>>' being compiled
1> with
1> [
1> T0=thrust::system::cuda::detail::bulk_::detail::cursor<1>
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/detail/closure.hpp(70) : see reference to class template instantiation 'thrust::tuple<thrust::system::cuda::detail::bulk_::detail::cursor<1>,thrust::detail::normal_iterator<thrust::pointer<__int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>>,__int64,thrust::detail::normal_iterator<thrust::pointer<__int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>>,thrust::plus<__int64>,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type>' being compiled
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/detail/cuda_task.hpp(61) : see reference to class template instantiation 'thrust::system::cuda::detail::bulk_::detail::closure<thrust::system::cuda::detail::scan_detail::inclusive_scan_n,thrust::tuple<thrust::system::cuda::detail::bulk_::detail::cursor<1>,thrust::detail::normal_iterator<thrust::pointer<__int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>>,__int64,thrust::detail::normal_iterator<thrust::pointer<__int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>>,thrust::plus<__int64>,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type>>' being compiled
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/detail/cuda_task.hpp(202) : see reference to class template instantiation 'thrust::system::cuda::detail::bulk_::detail::task_base<thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x03>,0x0200>,0x00>,Closure>' being compiled
1> with
1> [
1> Closure=thrust::system::cuda::detail::bulk_::detail::closure<thrust::system::cuda::detail::scan_detail::inclusive_scan_n,thrust::tuple<thrust::system::cuda::detail::bulk_::detail::cursor<1>,thrust::detail::normal_iterator<thrust::pointer<__int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>>,__int64,thrust::detail::normal_iterator<thrust::pointer<__int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>>,thrust::plus<__int64>,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type>>
1> ]
1> c:\users\bob-tosh\appdata\local\temp\tmpxft_00000bf0_00000000-2_kernel.cudafe1.stub.c(119) : see reference to class template instantiation 'thrust::system::cuda::detail::bulk_::detail::cuda_task<thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x03>,0x0200>,0x00>,thrust::system::cuda::detail::bulk_::detail::closure<thrust::system::cuda::detail::scan_detail::inclusive_scan_n,thrust::tuple<thrust::system::cuda::detail::bulk_::detail::cursor<1>,thrust::detail::normal_iterator<thrust::pointer<__int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>>,__int64,thrust::detail::normal_iterator<thrust::pointer<__int64,thrust::system::cuda::detail::tag,thrust::use_default,thrust::use_default>>,thrust::plus<__int64>,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type,thrust::null_type>>>' being compiled
1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/execution_policy.hpp(458): warning C4267: 'argument' : conversion from 'size_t' to 'thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x09>,0x080>::size_type', possible loss of data
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/execution_policy.hpp(667) : see reference to function template instantiation 'thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x09>,0x080> thrust::system::cuda::detail::bulk_::con<0x080,0x09>(size_t)' being compiled
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/scan.inl(259) : see reference to function template instantiation 'thrust::system::cuda::detail::bulk_::async_launch<thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x09>,0x080>,0x00>> thrust::system::cuda::detail::bulk_::grid<0x080,0x09>(size_t,size_t,cudaStream_t)' being compiled
1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/execution_policy.hpp(458): warning C4267: 'argument' : conversion from 'size_t' to 'thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x03>,0x0100>::size_type', possible loss of data
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/execution_policy.hpp(667) : see reference to function template instantiation 'thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x03>,0x0100> thrust::system::cuda::detail::bulk_::con<0x0100,0x03>(size_t)' being compiled
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/scan.inl(267) : see reference to function template instantiation 'thrust::system::cuda::detail::bulk_::async_launch<thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x03>,0x0100>,0x00>> thrust::system::cuda::detail::bulk_::grid<0x0100,0x03>(size_t,size_t,cudaStream_t)' being compiled
1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/execution_policy.hpp(250): warning C4267: 'argument' : conversion from 'size_t' to 'thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x03>,0x0200>,0x00>::size_type', possible loss of data
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/execution_policy.hpp(311) : see reference to function template instantiation 'thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x03>,0x0200>,0x00> thrust::system::cuda::detail::bulk_::par<ExecutionAgent>(ExecutionAgent,size_t)' being compiled
1> with
1> [
1> ExecutionAgent=thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x03>,0x0200>
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/execution_policy.hpp(667) : see reference to function template instantiation 'thrust::system::cuda::detail::bulk_::async_launch<thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x03>,0x0200>,0x00>> thrust::system::cuda::detail::bulk_::par<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x03>,0x0200>>(cudaStream_t,ExecutionAgent,size_t)' being compiled
1> with
1> [
1> ExecutionAgent=thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x03>,0x0200>
1> ]
1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/execution_policy.hpp(250): warning C4267: 'argument' : conversion from 'size_t' to 'thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x09>,0x080>,0x00>::size_type', possible loss of data
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/execution_policy.hpp(311) : see reference to function template instantiation 'thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x09>,0x080>,0x00> thrust::system::cuda::detail::bulk_::par<ExecutionAgent>(ExecutionAgent,size_t)' being compiled
1> with
1> [
1> ExecutionAgent=thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x09>,0x080>
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/execution_policy.hpp(667) : see reference to function template instantiation 'thrust::system::cuda::detail::bulk_::async_launch<thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x09>,0x080>,0x00>> thrust::system::cuda::detail::bulk_::par<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x09>,0x080>>(cudaStream_t,ExecutionAgent,size_t)' being compiled
1> with
1> [
1> ExecutionAgent=thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x09>,0x080>
1> ]
1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/execution_policy.hpp(250): warning C4267: 'argument' : conversion from 'size_t' to 'thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x03>,0x0100>,0x00>::size_type', possible loss of data
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/execution_policy.hpp(311) : see reference to function template instantiation 'thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x03>,0x0100>,0x00> thrust::system::cuda::detail::bulk_::par<ExecutionAgent>(ExecutionAgent,size_t)' being compiled
1> with
1> [
1> ExecutionAgent=thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x03>,0x0100>
1> ]
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\thrust/system/cuda/detail/bulk/execution_policy.hpp(667) : see reference to function template instantiation 'thrust::system::cuda::detail::bulk_::async_launch<thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x03>,0x0100>,0x00>> thrust::system::cuda::detail::bulk_::par<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x03>,0x0100>>(cudaStream_t,ExecutionAgent,size_t)' being compiled
1> with
1> [
1> ExecutionAgent=thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<0x03>,0x0100>
1> ]
1> LINK : /LTCG specified but no code generation required; remove /LTCG from the link command line to improve linker performance
1> t17.vcxproj -> c:\Users\bob-tosh\documents\visual studio 2013\Projects\t17\x64\Release\t17.exe
1> copy "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\bin\cudart*.dll" "c:\Users\bob-tosh\documents\visual studio 2013\Projects\t17\x64\Release\"
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\bin\cudart32_75.dll
1> C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\bin\cudart64_75.dll
1> 2 file(s) copied.
========== Rebuild All: 1 succeeded, 0 failed, 0 skipped ==========
To get this result, I created a new CUDA 7.5 runtime project, replaced the contents of the generated kernel.cu with your code, and then changed the project settings:
- Change the configuration from debug/win32 to release/x64.
- In the project properties, turn on “generate relocatable device code”, and change the code generation setting from the project default of compute_20,sm_20 to compute_52,sm_52 (with “inherit from project default” turned off).
Then I rebuilt the project.
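
If you want to rule out the Visual Studio project settings, roughly the same build can be tried from a VS x64 command prompt. This is only a sketch of what I would expect to be equivalent, not the exact command VS generates: it assumes a single-file project with the source in kernel.cu and uses t17.exe as the output name (both taken from the log above), and it leaves out the host compiler flags.

```
nvcc -m64 -rdc=true -gencode arch=compute_52,code=sm_52 -o t17.exe kernel.cu
```

Here -rdc=true corresponds to “generate relocatable device code”, -gencode arch=compute_52,code=sm_52 corresponds to the compute_52,sm_52 code generation setting, and -m64 selects a 64-bit build.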