Sample reduction kernel 6 fails for threads=1024 maxblocks=32 n=33554432

If the reduction sample is expected to fail when invoked as

./reduction threads=1024 maxblocks=32 n=33554432 kernel=6

then it would help if the sample validated its arguments and exited early, without attempting the reduction at all. Otherwise, what is the bug?
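To illustrate the kind of early check I have in mind, here is a minimal host-side sketch. It is not taken from the sample: the function name validateConfig and the constant kAssumedMaxThreads are hypothetical, and the limit of 512 threads is only my guess at what the kernel might support. I have not confirmed it against the sample source.

```cpp
#include <cstdio>

// ASSUMPTION: largest block size the reduction kernel supports.
// The value 512 is a guess for illustration, not taken from the sample.
constexpr int kAssumedMaxThreads = 512;

// Hypothetical helper: returns true if the requested configuration looks
// usable, false (with a message on stderr) if the run should be refused.
bool validateConfig(int threads, int maxBlocks, int n) {
    if (threads <= 0 || maxBlocks <= 0 || n <= 0) {
        std::fprintf(stderr, "threads, maxblocks and n must be positive\n");
        return false;
    }
    // A tree-style reduction typically assumes a power-of-two block size.
    if ((threads & (threads - 1)) != 0) {
        std::fprintf(stderr, "threads=%d is not a power of two\n", threads);
        return false;
    }
    if (threads > kAssumedMaxThreads) {
        std::fprintf(stderr, "threads=%d exceeds the assumed limit of %d\n",
                     threads, kAssumedMaxThreads);
        return false;
    }
    return true;
}
```

If the sample called something like this right after parsing the command line, the invocation above would be rejected with a message instead of printing a throughput figure and then "Test failed!".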

$ ./reduction threads=1024 maxblocks=32 n=33554432 kernel=6

./reduction Starting...
GPU Device 0: "Quadro P5000" with compute capability 6.1
Using Device 0: Quadro P5000
Reducing array of type int
33554432 elements
1024 threads (max)
32 blocks

Reduction, Throughput = 172.9497 GB/s, Time = 0.00078 s, Size = 33554432 Elements, NumDevsUsed = 1, Workgroup = 1024

GPU result = -9312
CPU result = -16317892

Test failed!