Nvidia Tesla P100

Jen-Hsun announced the amazing Tesla P100 GPU at GTC 2016.

http://www.nvidia.com/object/tesla-p100.html

Will it also have improved “random access memory transactions per second” ?

So what’s the over-under that the consumer Pascal multiprocessors will have such massive register files?

I wouldn’t be surprised if the GP10x consumer variants have 32K regs per 64-core SMP.

Hopefully I’m wrong!

There is a new whitepaper: NVIDIA GP100 Pascal Architecture – Infinite Compute for Infinite Opportunities

I wonder whether the writer of that whitepaper is aware of the meaning of “infinite”, or whether Pascal is in fact the solution for all NP-hard problems out there.

I’m not too happy about the decrease in shared memory size per SM. But I guess the effective size doubles in fp16 now, so maybe it’s not a big deal. Pretty sure I’ll mostly only care about lower-precision performance from now on.
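To illustrate what I mean by the effective size doubling (just a toy sketch, nothing from the whitepaper; tile size and names are made up): the same shared memory footprint holds twice as many values once they’re packed as half2 instead of float.

```
#include <cuda_fp16.h>

#define TILE 1024

// 4 KB of shared memory -> 1024 fp32 values
__global__ void tile_fp32(const float *in, float *out)
{
    __shared__ float tile[TILE];
    int i = threadIdx.x;
    tile[i] = in[blockIdx.x * TILE + i];
    __syncthreads();
    out[blockIdx.x * TILE + i] = tile[i];
}

// Same 4 KB of shared memory -> 1024 half2, i.e. 2048 fp16 values
__global__ void tile_fp16(const __half2 *in, __half2 *out)
{
    __shared__ __half2 tile[TILE];
    int i = threadIdx.x;
    tile[i] = in[blockIdx.x * TILE + i];
    __syncthreads();
    out[blockIdx.x * TILE + i] = tile[i];
}
```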

I’d also rather have fewer SMs with more schedulers/cores. I generally have no problem filling an SM with warps, but oftentimes it’s the total number of available blocks that comes up short. But I guess I could be looking harder for independent work and leveraging streams more (rough sketch below).
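Something like this is what I mean by leaning on streams (a minimal sketch with made-up kernels, not my actual code): two independent launches that are each too small to fill the chip can have their blocks co-resident when issued into separate streams.

```
#include <cuda_runtime.h>

// Two trivially independent kernels standing in for real work.
__global__ void scale_a(float *x, int n) { int i = blockIdx.x * blockDim.x + threadIdx.x; if (i < n) x[i] *= 2.0f; }
__global__ void scale_b(float *y, int n) { int i = blockIdx.x * blockDim.x + threadIdx.x; if (i < n) y[i] += 1.0f; }

int main()
{
    const int n = 1 << 16;               // small grids on purpose
    float *a, *b;
    cudaMalloc(&a, n * sizeof(float));
    cudaMalloc(&b, n * sizeof(float));

    cudaStream_t s0, s1;
    cudaStreamCreate(&s0);
    cudaStreamCreate(&s1);

    // Each launch alone is only 256 blocks; issued in different streams the
    // hardware is free to co-schedule blocks from both grids.
    scale_a<<<n / 256, 256, 0, s0>>>(a, n);
    scale_b<<<n / 256, 256, 0, s1>>>(b, n);

    cudaDeviceSynchronize();
    cudaStreamDestroy(s0);
    cudaStreamDestroy(s1);
    cudaFree(a);
    cudaFree(b);
    return 0;
}
```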

It would be nice to know what the new L1 and instruction cache sizes are.
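For what the runtime does expose today, a quick cudaGetDeviceProperties query (nothing Pascal-specific, and as far as I know neither L1 nor instruction cache size is reported) at least gives shared memory and registers per SM plus L2 size:

```
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    cudaDeviceProp p;
    cudaGetDeviceProperties(&p, 0);

    printf("%s (sm_%d%d)\n", p.name, p.major, p.minor);
    printf("SMs:               %d\n",    p.multiProcessorCount);
    printf("Shared mem per SM: %zu KB\n", p.sharedMemPerMultiprocessor / 1024);
    printf("Registers per SM:  %d\n",    p.regsPerMultiprocessor);
    printf("L2 cache:          %d KB\n", p.l2CacheSize / 1024);
    // L1 and instruction cache sizes are not reported by cudaDeviceProp.
    return 0;
}
```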

Scott, Have you had a chance to work with Pascal yet?

BTW I enjoyed your talk at GTC.

My only complaint was that they only gave you 25 minutes to speak, when I think the audience would have been fine with an hour or more.

@njuffa, Infinite™®*!

@scottgray, for all the reasons you list, I’m hoping the consumer GP10x is closer to an sm_50’ish 128/4/64K/64KB FP32/FP64/REGS/SMEM.

Well I guess in an age where “unlimited” internet connectivity is in fact capped, “infinite” compute may have acquired a new meaning as well.

The reduced shared memory size jumped out to me as well, that seems to be asking for trouble, performance portability wise. I wonder whether it may be a consequence of the much higher SM core frequencies coupled with a desire to keep shared memory latency low?

From what I can tell, Pascal-based consumer-level products are still quite some time off, so nothing to worry about until then. I am particularly eager to see how much more compute they managed to squeeze into the lower-end GPUs without auxiliary power connectors. Will it really be 2x Maxwell?


GTC 2017 runs March 27-30, 2017; I’m guessing the GP100-based GeForce TITAN & Quadro P6000 will be launched around then, since demand for the Tesla P100 is off the charts.