How to set compute_52 and sm_52 in CMake

Dear all,

I’d like to use a CUDA warp shuffle function such as __shfl, but nvcc fails with a compile error: “__shfl undefined”.
I’m using a Quadro M6000 24GB, whose compute capability is 5.2, so I’d like to set compute_52 and sm_52 in CMake.
I tried adding the following line to my CMakeLists.txt:

set(CUDA_NVCC_FLAGS -gencode arch=compute_52,code=sm_52)

However, I still have the same compile error: “__shfl undefined”.
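
For context, here is a minimal sketch of how I understand the flag is meant to be wired up, assuming the project uses the FindCUDA module (find_package(CUDA) and cuda_add_executable). The project name, target name, and source file (shfl_test, main.cu) are placeholders rather than my real project, and the assumption that the flags have to be set before the target is defined is my reading of FindCUDA, not something I have confirmed.

```cmake
cmake_minimum_required(VERSION 3.5)
project(shfl_test)   # placeholder project name

# FindCUDA provides CUDA_NVCC_FLAGS and cuda_add_executable().
find_package(CUDA REQUIRED)

# Append the architecture flags instead of overwriting the variable,
# so any flags that were set earlier are kept.
list(APPEND CUDA_NVCC_FLAGS -gencode arch=compute_52,code=sm_52)

# Assumption: the flags are picked up when the target is created,
# so this call comes after CUDA_NVCC_FLAGS has been set.
cuda_add_executable(shfl_test main.cu)   # placeholder target and source
```

If the layout above is right, the generated nvcc command line in the Visual Studio build log should contain -gencode arch=compute_52,code=sm_52.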

Is there another way to set the device compute capability in CMake?

OS: Windows 10
GPU: Quadro M6000 24GB
CUDA: 8.0
IDE: Visual Studio 2015

Thank you in advance.
Best,