I’d like to use CUDA warp shuffle functions such as __shfl, but nvcc fails with a compile error: “__shfl undefined”.
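To make the usage concrete, here is a minimal sketch of the kind of kernel involved. It is not the actual project code: the kernel name `warpSum`, the single-warp launch, and the `CUDART_VERSION` guard (which picks `__shfl_down_sync` on CUDA 9 or later and the older `__shfl_down` otherwise) are all assumptions made for illustration. Shuffle intrinsics need compute capability 3.0 or higher, so a file like this should only compile if the architecture flag actually reaches nvcc.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel (illustration only): one warp sums its lane indices
// using a shuffle-based tree reduction.
__global__ void warpSum(int *out)
{
    int value = threadIdx.x;  // equals the lane id because only one warp is launched

    // Halve the stride each step; after the loop, lane 0 holds the full sum.
    for (int offset = 16; offset > 0; offset /= 2) {
#if defined(CUDART_VERSION) && CUDART_VERSION >= 9000
        value += __shfl_down_sync(0xffffffff, value, offset);  // CUDA 9+ variant
#else
        value += __shfl_down(value, offset);                   // pre-CUDA 9 variant
#endif
    }

    if (threadIdx.x == 0) {
        *out = value;
    }
}

int main()
{
    int *d_out = nullptr;
    int h_out = 0;

    cudaMalloc(&d_out, sizeof(int));
    warpSum<<<1, 32>>>(d_out);  // exactly one warp of 32 threads
    cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);

    printf("warp sum = %d\n", h_out);  // sum of lanes 0..31
    return 0;
}
```

If the target architecture defaults to something below 3.0, the shuffle call in this sketch is where an "undefined" error would be expected to appear.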
I’m using a Quadro M6000 24GB, and its compute capability is 5.2. So, I’d like to set compute_52 and sm_52 in CMake.
I tried adding the following line to my CMakeLists.txt:
set(CUDA_NVCC_FLAGS -gencode arch=compute_52,code=sm_52)
However, I still get the same compile error: “__shfl undefined”.
Is there another way to set the device compute capability in CMake?
OS: Windows 10
GPU: Quadro M6000 24GB
IDE: Visual Studio 2015
Thank you in advance.