Yes, the spec says exactly what you think it says:
The maximum sizes of the x-, y-, and z-dimension of a thread block are 512, 512, and 64, “respectively”
In fact, some conditions in the compute capability spec must be considered together.
Here is another example,
from the spec for compute capability 1.0:

Condition 1: The maximum number of active blocks per multiprocessor is 8

Condition 2: The maximum number of active warps per multiprocessor is 24

Condition 3: The maximum number of active threads per multiprocessor is 768
If you ask “what is the maximum number of active blocks in a multiprocessor?”, then
you must combine these three conditions. However, condition 2 is redundant, because from
condition 3 and 32 threads/warp you get 768/32 = 24 warps.
Hence we only need to consider condition 1 and condition 3.
Example 1: suppose we choose a thread block size of 16 x 16 = 256 threads. Then
under condition 3, we have 768/256 = 3 blocks in a multiprocessor;
under condition 1, 3 < 8, hence we have 3 blocks in a multiprocessor.
Example 2: suppose we choose a thread block size of 4 x 4 = 16 threads. Then
under condition 3, we have 768/16 = 48 blocks in a multiprocessor;
under condition 1, 48 > 8, hence we only have 8 blocks in a multiprocessor.
This means that we only have 8 * 16 = 128 active threads in a multiprocessor.
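
If it helps, here is a minimal Python sketch of the same arithmetic for the two block sizes above. The helper names active_blocks and active_threads are my own, not part of any CUDA API, and the sketch only models the block limit and the thread limit quoted from the compute capability 1.0 spec. Other resources, such as registers and shared memory per block, can lower the count further and are not modelled here.

```python
# Limits quoted above for compute capability 1.0 (per multiprocessor).
MAX_ACTIVE_BLOCKS = 8
MAX_ACTIVE_THREADS = 768
WARP_SIZE = 32

# The warp limit follows from the thread limit: 768 / 32 = 24.
MAX_ACTIVE_WARPS = MAX_ACTIVE_THREADS // WARP_SIZE


def active_blocks(threads_per_block):
    """Blocks resident on one multiprocessor (hypothetical helper)."""
    # Thread limit: how many whole blocks fit within 768 active threads.
    by_threads = MAX_ACTIVE_THREADS // threads_per_block
    # Block limit: never more than 8 blocks, whatever their size.
    return min(MAX_ACTIVE_BLOCKS, by_threads)


def active_threads(threads_per_block):
    """Threads resident on one multiprocessor for that block size."""
    return active_blocks(threads_per_block) * threads_per_block


for dims in [(16, 16), (4, 4)]:
    n = dims[0] * dims[1]
    print(dims, active_blocks(n), "blocks,", active_threads(n), "threads")
```

For 16 x 16 this gives 3 blocks and 768 threads; for 4 x 4 it gives 8 blocks and only 128 threads, so very small blocks leave most of the multiprocessor's thread capacity unused.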