How to calculate the number of warps to hide latency

My GPU is a GM107 (Maxwell).
For example, if the memory latency is 400 cycles, how do I calculate the number of warps needed to hide that latency?

64 warps * number of SMs

How do you arrive at 64 warps? Could you explain it to me?
And does 64 warps mean 2048 threads? Isn't that too many?

64 warps * 32 threads per warp = 2048 threads

Look up the CUDA spec for the maximum number of resident threads per multiprocessor.
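
To spell out the arithmetic behind the 64, here is a rough sketch in Python. It assumes the figures usually listed for compute capability 5.0 (32 threads per warp, 2048 resident threads per multiprocessor), so check them against the spec table for your own device. The function names and the SM count of 5 are only illustrative, not something taken from the CUDA API.

```python
# Assumed figures for compute capability 5.0 (Maxwell); verify them
# against the CUDA spec table for your device before relying on them.
WARP_SIZE = 32
MAX_THREADS_PER_SM = 2048


def max_resident_warps_per_sm(max_threads_per_sm=MAX_THREADS_PER_SM,
                              warp_size=WARP_SIZE):
    """Upper bound on warps that can be resident on one SM at a time."""
    return max_threads_per_sm // warp_size


def max_resident_warps_on_gpu(num_sms,
                              max_threads_per_sm=MAX_THREADS_PER_SM,
                              warp_size=WARP_SIZE):
    """Upper bound on resident warps across the whole GPU."""
    return num_sms * max_resident_warps_per_sm(max_threads_per_sm, warp_size)


if __name__ == "__main__":
    example_sm_count = 5  # hypothetical SM count, replace with your GPU's value
    print(max_resident_warps_per_sm())                  # warps per SM
    print(max_resident_warps_on_gpu(example_sm_count))  # warps on the GPU
```

Note that this is only the hardware ceiling on resident warps. How many a given kernel actually keeps resident also depends on its register and shared memory usage.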