I am receiving the following error when trying to compile code using Thrust's inclusive_scan algorithm:
/Developer/NVIDIA/CUDA-9.2/include/thrust/system/cuda/detail/cub/block/specializations/../../warp/specializations/warp_reduce_shfl.cuh(528): error: calling a __host__ function("std::__1::__clz") from a __device__ function("thrust::cuda_cub::cub::TilePrefixCallbackOp<int, ::thrust::plus<int> , ::thrust::cuda_cub::cub::ScanTileState<int, (bool)1> , (int)300> ::operator ()") is not allowed
/Developer/NVIDIA/CUDA-9.2/include/thrust/system/cuda/detail/cub/block/specializations/../../warp/specializations/warp_reduce_shfl.cuh(528): error: identifier "std::__1::__clz" is undefined in device code
This code compiled correctly under a previous version of the OS (10.12.x) using CUDA 8.0.
Not sure what has changed here; it looks to me like a reference to __clz is resolving to the std:: implementation rather than the CUDA version.
Any advice would be greatly appreciated!
The compiler command I am using is:
/Developer/NVIDIA/CUDA-9.2/bin/nvcc -c -DCOMPILE_GPU -O3 -m 64 -lineinfo -Xptxas -dlcm=ca -std=c++11 -gencode arch=compute_30,code=sm_30 -Xcompiler -stdlib=libc++ ...