According to the deviceQuery output for the TitanV:
(80) Multiprocessors, ( 64) CUDA Cores/MP: 5120 CUDA Cores
L2 Cache Size: 4718592 bytes
I would like to know why the L2 size is not a power of 2. Since the L2 is shared among all SMs, dividing it evenly gives 58982.4 bytes per SM, which is not even a whole number of bytes.
Or maybe the total is the sum of something else, and each SM has power-of-2 slices?
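
For reference, here is a quick arithmetic check of the numbers above. It is only a sketch: the helper name `split_power_of_two` is made up for illustration, and the result says nothing definite about how the hardware actually partitions the cache.

```python
# Values copied from the deviceQuery output above.
L2_BYTES = 4718592   # "L2 Cache Size"
NUM_SMS = 80         # "(80) Multiprocessors"


def split_power_of_two(n):
    """Return (odd, k) such that n == odd * 2**k (hypothetical helper)."""
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return n, k


odd, k = split_power_of_two(L2_BYTES)
per_sm = L2_BYTES / NUM_SMS

print(f"{L2_BYTES} bytes = {odd} * 2**{k} = {L2_BYTES / 2**20} MiB")
print(f"even split across {NUM_SMS} SMs: {per_sm} bytes each")
```

The total factors as 9 * 2**19, i.e. 4.5 MiB, so it would be consistent with a sum of equal power-of-2 pieces whose count is a multiple of 9 rather than 80. That is just arithmetic, though, and not a confirmation of the real layout.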