Does creating new threads increase execution time?

Consider that I have 512 executions to perform. If I choose 512 blocks with 1 thread each, it executes faster than 1 block with 512 threads. Now, if I have 99999999 executions to perform, what is the best combination of blocks and threads for an optimal result?

Regards
AJ

There is no single correct answer to your question. It depends on your program, your algorithm, and so on.
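
As a rough starting point only, here is a minimal host-side sketch of the arithmetic usually used to pick a launch configuration. It assumes one thread per execution and a warp size of 32 (the case on NVIDIA GPUs). The helper names blocksFor and laneUtilization are made up for illustration and are not part of any CUDA API. The block size that is actually fastest still has to be found by measuring your own kernel.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: smallest number of blocks such that
// blocks * threadsPerBlock >= n (ceiling division).
std::uint64_t blocksFor(std::uint64_t n, std::uint32_t threadsPerBlock) {
    assert(threadsPerBlock > 0);
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// Hypothetical helper: fraction of warp lanes doing useful work in one block,
// assuming a warp size of 32. A block always occupies whole warps.
double laneUtilization(std::uint32_t threadsPerBlock) {
    assert(threadsPerBlock > 0);
    const std::uint32_t warps = (threadsPerBlock + 31) / 32;
    return static_cast<double>(threadsPerBlock) / (warps * 32.0);
}

// Inside a kernel launched as kernel<<<blocks, threadsPerBlock>>>(...),
// each thread would compute i = blockIdx.x * blockDim.x + threadIdx.x
// and return early when i >= n, because the last block may be only partly used.
```

With these assumptions, blocksFor(99999999, 256) gives 390625 blocks, and laneUtilization(1) gives 1/32, so a configuration with 1 thread per block leaves 31 of every 32 lanes idle. On the other hand, 1 block with 512 threads runs on a single multiprocessor and leaves the others unused, which would be consistent with the slowdown you measured. Multiples of 32 such as 128 or 256 threads per block are common values to try first. Also check the maximum grid size of your GPU, since older devices limit the number of blocks per grid dimension.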