The maximum number of threads per block is 512. They can be arranged in any way: one-dimensional, in which case the maximum is 512; two-dimensional, in which case Bx * By = 512 is the maximum; or three-dimensional, in which case Bx * By * Bz = 512 is the maximum. The maximum threads per dimension means that a one-dimensional block can have 512 elements, and it can lie along either the X direction or the Y direction. In a three-dimensional block, however, the third dimension must not exceed 64, and the other dimensions are then determined by the overall limit of 512 threads.
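To make the two constraints concrete, here is a minimal host-side sketch that checks a block shape against the limits quoted above (512 threads in total, at most 512 in X or Y, at most 64 in Z). The helper name isValidBlockShape and the constant names are made up for illustration and are not part of the CUDA API; on real hardware the limits would normally be read from the device properties rather than hard-coded.

```cpp
// Limits as quoted in this answer (assumed, not queried from a device).
constexpr int kMaxThreadsPerBlock = 512;
constexpr int kMaxBlockDimX = 512;
constexpr int kMaxBlockDimY = 512;
constexpr int kMaxBlockDimZ = 64;

// Hypothetical helper: true if a (bx, by, bz) block shape respects both
// the per-dimension limits and the total threads-per-block limit.
constexpr bool isValidBlockShape(int bx, int by, int bz) {
    return bx >= 1 && by >= 1 && bz >= 1 &&
           bx <= kMaxBlockDimX && by <= kMaxBlockDimY && bz <= kMaxBlockDimZ &&
           static_cast<long long>(bx) * by * bz <= kMaxThreadsPerBlock;
}
```

For example, isValidBlockShape(512, 1, 1) and isValidBlockShape(8, 1, 64) hold, while isValidBlockShape(1, 1, 512) fails on the Z limit and isValidBlockShape(32, 32, 1) fails on the total (1024 threads).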
So in principle, we can have 65535 * 65535 * 512 threads in a single launch.
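As a rough worked version of that arithmetic, the sketch below multiplies the limits out, with the one-dimensional case for comparison. The function and constant names are invented for illustration. The only point worth noting is that the two-dimensional product does not fit in 32 bits, so it has to be computed in a 64-bit type.

```cpp
#include <cstdint>

// Limits as quoted in this answer (assumed, not queried from a device).
constexpr std::uint64_t kMaxGridDimX = 65535;
constexpr std::uint64_t kMaxGridDimY = 65535;
constexpr std::uint64_t kMaxThreads  = 512;

// Upper bound for a two-dimensional grid of full blocks.
constexpr std::uint64_t maxThreadsPerLaunch() {
    return kMaxGridDimX * kMaxGridDimY * kMaxThreads;
}

// Upper bound when only one grid dimension is used.
constexpr std::uint64_t maxThreadsOneDimensional() {
    return kMaxGridDimX * kMaxThreads;
}

// The 2D bound exceeds the range of a 32-bit unsigned integer.
static_assert(maxThreadsPerLaunch() > 0xFFFFFFFFull,
              "2D bound needs a 64-bit type");
```

With these numbers the two-dimensional bound comes to 2,198,956,147,200 threads, and the one-dimensional bound to 33,553,920.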
I think you are considering only one-dimensional blocks, so N = 65535 and threadsperblock = 512 are the maxima.