I have the same problem. I originally posted my question on Stack Overflow, but I deleted it because I thought this forum was a better place for it.
[url=https://devtalk.nvidia.com/default/topic/1042512/cuda-programming-and-performance/andaccumulate-kernel-wrong-output/#]https://devtalk.nvidia.com/default/topic/1042512/cuda-programming-and-performance/andaccumulate-kernel-wrong-output/#[/url]