Why does the dimensionality of the thread organization affect computational efficiency?

How does the dimensionality of the thread layout affect the speedup? With the same number of threads per block, organizing the threads in one dimension runs faster for me than organizing them in three dimensions. I understand how the number of threads affects performance, but not why the dimensionality should matter. I would like to understand the reason.