Bug in CUDA samples

I was recently going through the reduction sample and came across this function:

////////////////////////////////////////////////////////////////////////////////
// Compute the number of threads and blocks to use for the reduction
// We set threads / block to the minimum of maxThreads and n/2.
////////////////////////////////////////////////////////////////////////////////
void getNumBlocksAndThreads(int n, int maxBlocks, int maxThreads, int &blocks, int &threads)
{
    if (n == 1)
    {
        threads = 1;
        blocks = 1;
    }
    else
    {
        threads = (n < maxThreads*2) ? nextPow2(n / 2) : maxThreads;
        blocks = max(1, n / (threads * 2));
    }

    blocks = min(maxBlocks, blocks);
}

Isn’t the computed number of blocks too low? Shouldn’t it be (1 + n / threads)?

The reduction sample code has been updated recently. Please refer to the version shipped with CUDA 9.1 or later.
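
For anyone checking the arithmetic in the quoted function, here is a minimal host-side sketch. It assumes that each thread reads and adds two input elements per pass, which is my understanding of why the divisor is (threads * 2), and that any leftover elements are picked up by a loop inside the kernel rather than by extra blocks. The nextPow2 below is a stand-in written for this post, and elementsPerPass and blocksRoundedUp are hypothetical helpers, not part of the sample.

```cpp
#include <algorithm>
#include <cassert>

// Stand-in for the sample's nextPow2: smallest power of two >= x, for x >= 1.
static unsigned int nextPow2(unsigned int x) {
    --x;
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    return ++x;
}

// Same arithmetic as the function quoted in the question.
static void getNumBlocksAndThreads(int n, int maxBlocks, int maxThreads,
                                   int &blocks, int &threads) {
    assert(n >= 1 && maxBlocks >= 1 && maxThreads >= 1);
    if (n == 1) {
        threads = 1;
        blocks = 1;
    } else {
        threads = (n < maxThreads * 2) ? static_cast<int>(nextPow2(n / 2))
                                       : maxThreads;
        // Integer division truncates, so a partial last block is dropped here.
        blocks = std::max(1, n / (threads * 2));
    }
    blocks = std::min(maxBlocks, blocks);
}

// Hypothetical helper: elements read in one pass if every thread reads exactly two.
static long long elementsPerPass(int blocks, int threads) {
    return 2LL * blocks * threads;
}

// Hypothetical alternative: round up so that a single pass covers all n elements.
static int blocksRoundedUp(int n, int threads) {
    return (n + threads * 2 - 1) / (threads * 2);
}
```

With n = 1 << 20 and maxThreads = 256, the quoted formula gives 2048 blocks (before the maxBlocks cap), and one pass covers exactly n elements. With n = 1000 it gives 1 block of 256 threads, which reads only 512 elements per pass, so the remaining 488 have to be handled by a loop in the kernel. Rounding up would give 2 blocks, while (1 + n / threads) gives 4, which is more than a two-elements-per-thread kernel needs.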