NVIDIA Turing Architecture In-Depth

Originally published at: https://developer.nvidia.com/blog/nvidia-turing-architecture-in-depth/

Fueled by the ongoing growth of the gaming market and its insatiable demand for better 3D graphics, NVIDIA® has evolved the GPU into the world’s leading parallel processing engine for many computationally-intensive applications. In addition to rendering highly realistic and immersive 3D games, NVIDIA GPUs also accelerate content creation workflows, high performance computing (HPC) and…

When you count "CUDA cores" in new GPUs, it's funny that you include only the FP32 cores. I think it would be better to state the count of FP32 cores (even if previous GPUs had FP/INT cores).

Regarding DLSS: in the Turing Whitepaper DLSS is described as:

"... allows faster rendering at a lower input sample count..."

But in the text in this blog entry, DLSS is described as:

"... allows faster rendering at a lower input resolution..."

The two aren't equivalent, so which one is correct? The whitepaper or this blog entry?

Thanks for spotting the discrepancy. The text in the whitepaper is the correct one; I've updated the blog post to reflect that.

Here is what I want to know quite badly. The RT core is really interesting to me. I think that if I could access it at a lower level than the high-level RT functions, using the underlying ISA as was done to create the pipeline functions, I could do some amazing things with it, given what I think the functionality might be based on discussions and documents. So the question is: what level of the ASIC's underlying functionality will we have access to? I am seeing high-level APIs but nothing at the PTX or assembly level. Are there low-level possibilities for using that silicon outside of ray tracing, for more computational work, possibly spatial and such?

Thanks for correcting the blog post so it's consistent with the whitepaper.

There seems to be a lot of conflation between DLSS and AI-UpRes/SuperRes on social media. Many people seem to consider DLSS to be rendering at a lower resolution than the game settings and then upscaling the result to the target resolution set by the game. Is this how DLSS works, i.e. by rendering at a lower resolution?

The links to Facebook, Twitter, email, etc. form a vertical bar at the bottom left of the screen. It obscures the text under it, leaving only a small window at the top left where whole lines of the article appear. This kept me from reading it.

Hi, Don:

What's your screen resolution? Also, if you shrink the browser window (drag the right side of the window to the left), the icons disappear completely. What browser are you running?

If you're on mobile, the icons should only be at the very bottom.

Waterfox on Win 7 laptop. Screen is 1366 x 768 and the browser window is about 90% in each direction. Will try shrinking it.

Yeah, shrinking the window from the right side made the icons disappear. Thanks.

Can someone tell me what Turing's peak FP64 performance is? As usual, I can't find this info in the whitepaper (specifically, in the comparison-with-previous-gen table).

It's mentioned in a footnote under the full Turing block diagram as 1/32 the FP32 rate.
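
As a rough sketch of that arithmetic, assuming the 1/32 ratio from the footnote: the function name below is hypothetical, and the 16.0 TFLOPS input is a made-up placeholder rather than the spec of any Turing card, so substitute the peak FP32 figure for the GPU you care about.

```python
def peak_fp64_tflops(peak_fp32_tflops: float, fp64_ratio: float = 1 / 32) -> float:
    """Estimate peak FP64 throughput from peak FP32 throughput.

    fp64_ratio defaults to 1/32, the FP64:FP32 rate mentioned above.
    """
    return peak_fp32_tflops * fp64_ratio


# Placeholder input only (NOT a real Turing spec): 16.0 TFLOPS of FP32.
example_fp32_tflops = 16.0
print(peak_fp64_tflops(example_fp32_tflops))  # prints 0.5
```

So whatever the FP32 peak of a given card is, dividing it by 32 gives the FP64 peak.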

Hi, it would be great if I could get a clear answer regarding memory and resource pooling on the RTX 2080 Ti/2080 with NVLink. Is it implemented to function the same way as on their Quadro counterparts, or is it nonexistent (so that the link is essentially just an SLI bridge with several times the speed)?