From what I red in the NVIDIA’s white paper, the next-gen GPU architecture is going to increase the number of cores per SM 4x (to 32), but is going to allow only 48kb of on-chip cash to be configured as shared memory. That is, shared memory per core goes down ? That is somewhat bad news, as some applications’ performance is limited by exactly this factor, limiting the number of warps dispatched to the given SM. Any thought on that?
P.S. Sorry for the typo in the title.