I ran the deviceQuery program on my 32-bit RHEL 4 Linux box and got the following result:
There is 1 device supporting CUDA
Device 0: “Quadro FX 4600”
…
…
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
…
…
The above two lines seem to be contradictory. My impression was that the block dimensions should not be more than 512, since the maximum number of threads per block is 512. Am I correct?
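Why are they contradictory? 512 x 1 x 1 = 512, so the maximum number of threads along a single dimension is 512 (64 for the third dimension). You can flexibly choose any block config as long as each dimension stays within its own limit and the product of the three dimensions is at most 512.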
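To make the rule concrete, here is a minimal host-side sketch in plain C++ (no CUDA runtime needed). The four limits are copied from the deviceQuery output quoted above; the helper name `is_valid_block` and the constant names are made up for illustration and are not part of the CUDA API.

```cpp
// Limits copied from the deviceQuery output above (Quadro FX 4600).
// The constant names are hypothetical, chosen for this sketch only.
constexpr unsigned kMaxThreadsPerBlock = 512;
constexpr unsigned kMaxBlockDimX = 512;
constexpr unsigned kMaxBlockDimY = 512;
constexpr unsigned kMaxBlockDimZ = 64;

// Hypothetical helper: true if an x * y * z block respects both rules,
// i.e. every dimension is within its own limit AND the total thread
// count (the product of the dimensions) does not exceed 512.
constexpr bool is_valid_block(unsigned x, unsigned y, unsigned z) {
    return x >= 1 && y >= 1 && z >= 1 &&
           x <= kMaxBlockDimX && y <= kMaxBlockDimY && z <= kMaxBlockDimZ &&
           static_cast<unsigned long long>(x) * y * z <= kMaxThreadsPerBlock;
}

// Compile-time examples:
static_assert(is_valid_block(512, 1, 1), "512 threads in a row is fine");
static_assert(is_valid_block(16, 16, 2), "16 * 16 * 2 = 512 is fine");
static_assert(is_valid_block(8, 8, 8), "8 * 8 * 8 = 512 is fine");
static_assert(!is_valid_block(32, 32, 1), "32 * 32 = 1024 threads is too many");
static_assert(!is_valid_block(1, 1, 128), "z = 128 exceeds the z limit of 64");
static_assert(!is_valid_block(512, 512, 64), "per-dimension maxima cannot be combined");
```

So the "512 x 512 x 64" line only bounds each dimension on its own, while the "512" line bounds the whole block.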
What is a bit boring, however, is that the max config isn’t 512^3. Discrimination of the 3rd dimension … the world is (almost) a disc.