New Pascal GPUs Accelerate Inference in the Data Center

Originally published at: https://developer.nvidia.com/blog/new-pascal-gpus-accelerate-inference-in-the-data-center/

Artificial intelligence is already more ubiquitous than many people realize. Applications of AI abound, many of them powered by complex deep neural networks trained on massive amounts of data using GPUs. These applications understand when you talk to them; they can answer questions; and they can help you find information in ways you couldn’t before. Pinterest image…

Your table shows that GP104/GP102 has 128 KB of shared memory per SM. Can you please confirm that this is not a mistake (Maxwell GPUs had only 64-96 KB)? Does that hold for all CC 6.1 devices, including all gaming cards? Could you also say what the size of the L1/texture cache is and how it is shared between warp schedulers? Maybe there is a whitepaper describing the 6.1 architecture?

The CUDA documentation at http://docs.nvidia.com/cuda... shows that CC 6.1 has the same capabilities as 5.2 (and answers my question regarding caches), so it's just a typo here.

Typo fixed, thanks. The P4 and P40 SMs have 96 KB of shared memory.
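
For anyone who wants to verify these limits on their own hardware rather than rely on the tables, the per-SM shared memory size (along with the compute capability and cache sizes discussed above) can be queried from the CUDA runtime. The following is a minimal sketch, not from the original post, using the standard cudaGetDeviceProperties API:

```cpp
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        fprintf(stderr, "No CUDA devices found\n");
        return 1;
    }
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        // Compute capability identifies the architecture (6.1 for GP102/GP104).
        printf("Device %d: %s (CC %d.%d)\n", dev, prop.name, prop.major, prop.minor);
        // Total shared memory available to all blocks resident on one SM.
        printf("  Shared memory per SM:    %zu KB\n",
               prop.sharedMemPerMultiprocessor / 1024);
        // Maximum shared memory a single thread block may allocate.
        printf("  Shared memory per block: %zu KB\n",
               prop.sharedMemPerBlock / 1024);
        printf("  L2 cache size:           %d KB\n", prop.l2CacheSize / 1024);
    }
    return 0;
}
```

Compiled with nvcc and run on a P4 or P40, this should report 96 KB of shared memory per SM, matching the corrected table.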