Can anyone help me with CUDA Unbound (CUB) setup?

I am using Visual Studio 2015. I have written a few CUDA projects and programs, and everything was working fine until my professors suggested that I use CUB to improve performance on my final project. I unzipped the CUB library into the NVIDIA toolkit directory. I have written my first CUB function, but during compilation I get the following errors when I try to call it:
BlockSumKernel<Block_threads, BLOCK_REDUCE_WARP_REDUCTIONS><<<dimGrid, dimBlock>>>(data, d_out);

Error: operand types are incompatible ("void (*)(float *, float *)" and "int")

Error: expected an expression

I think it is because the compiler is not parsing the CUB calls properly, but I might be wrong. Does anyone have any idea?
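One thing that may be worth checking: the first error reads as if `BlockSumKernel < Block_threads` were being parsed as a comparison between a function pointer and an int, which is what would happen if `BlockSumKernel` is not declared as a template at the point of the call. For comparison, here is a minimal sketch of a kernel that the launch line above would match. It assumes the CUB headers are on the compiler's include path and that the file is a `.cu` file compiled by nvcc. The kernel body, the value 128 for `Block_threads`, the block count, and `main` are illustrative assumptions, not the original project code.

```cuda
#include <cstdio>
#include <cub/cub.cuh>   // assumption: CUB headers are on the include path

// Hypothetical kernel: each block sums BLOCK_THREADS floats into d_out[blockIdx.x].
// Both template parameters must be compile-time constants at the call site.
template <int BLOCK_THREADS, cub::BlockReduceAlgorithm ALGORITHM>
__global__ void BlockSumKernel(float *d_in, float *d_out)
{
    typedef cub::BlockReduce<float, BLOCK_THREADS, ALGORITHM> BlockReduceT;
    __shared__ typename BlockReduceT::TempStorage temp_storage;

    // One input element per thread.
    float value = d_in[blockIdx.x * BLOCK_THREADS + threadIdx.x];
    float block_sum = BlockReduceT(temp_storage).Sum(value);

    // The reduced value is only valid in thread 0 of each block.
    if (threadIdx.x == 0)
        d_out[blockIdx.x] = block_sum;
}

int main()
{
    const int Block_threads = 128;   // illustrative value; must be a constant expression
    const int num_blocks = 4;        // illustrative value
    const int n = Block_threads * num_blocks;

    float h_in[n];
    for (int i = 0; i < n; ++i) h_in[i] = 1.0f;

    float *data = NULL, *d_out = NULL;
    cudaMalloc(&data, n * sizeof(float));
    cudaMalloc(&d_out, num_blocks * sizeof(float));
    cudaMemcpy(data, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

    dim3 dimGrid(num_blocks), dimBlock(Block_threads);
    // The enum is qualified with cub:: here; unqualified use needs `using namespace cub;`.
    BlockSumKernel<Block_threads, cub::BLOCK_REDUCE_WARP_REDUCTIONS><<<dimGrid, dimBlock>>>(data, d_out);

    float h_out[num_blocks];
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    for (int b = 0; b < num_blocks; ++b)
        printf("block %d sum = %.1f\n", b, h_out[b]);

    cudaFree(data);
    cudaFree(d_out);
    return 0;
}
```

If a sketch like this builds but the editor still underlines `<<<` with "expected an expression", that message may be coming from IntelliSense rather than from nvcc, since IntelliSense does not understand the kernel launch syntax.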
Here are the CUB setup instructions