Introduction to Turing Mesh Shaders

jwitsoe · September 17, 2018, 1:06pm

Originally published at: Introduction to Turing Mesh Shaders | NVIDIA Technical Blog

The Turing architecture introduces a new programmable geometric shading pipeline through the use of mesh shaders. The new shaders bring the compute programming model to the graphics pipeline as threads are used cooperatively to generate compact meshes (meshlets) directly on the chip for consumption by the rasterizer. Applications and games dealing with high geometric complexity benefit…

anon68998863 · September 17, 2018, 2:53pm

"Optimizing the vertex locations along is also beneficial,"
I assume the word "along" is meant to be "alone".

"Vertex cache optimizers that help classic rendering also help improving meshlet packing efficiency."
I assume this should be either:
"Vertex cache optimizers that help classic rendering also help, improving meshlet packing efficiency."
or
"Vertex cache optimizers that help classic rendering also help improve meshlet packing efficiency."

Overall an interesting and informative read, thanks.

anon77038207 · September 21, 2018, 10:03pm

Good read!
To make sure I understand how work can be done once and re-used, is it just that you'd have a run-once mesh shader that does work and writes the meshlet data to a buffer, then on subsequent frames, you'd get the processed data from that buffer instead of doing the work again?

anon53815836 · October 2, 2018, 2:50pm

I don't think it maintains the memory from frame to frame. It sounds like the memory is maintained for each of the task and mesh threads per dispatch. What this means is that when you dispatch, a bunch of task shader instances are spawned, and the only input to each one is its instance ID. The task shader then outputs how many mesh shader instances to spawn. Each of these mesh shader instances has a single input, an ID. My guess is that these IDs are used to figure out which "piece", "meshlet", or "submesh" of the larger mesh needs to be processed. Once a mesh shader instance is done, it writes out the indices and vertices, which are then passed to the rasterizer, which in turn launches the pixel shader. After this, I believe the memory is reused by the next dispatch call. I'm looking forward to a sample application that will hopefully explain this better.
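
To make that flow concrete, here is a rough CPU-side model of it in Python. This is only a sketch of the dispatch pattern as described in this post, not GPU code and not the actual API: the names `task_shader`, `mesh_shader`, `dispatch`, and `Meshlet`, and the limits `TRIANGLES_PER_MESHLET` and `MESHLETS_PER_TASK`, are all made up for illustration, and the rule that each meshlet covers a fixed run of triangles is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]

# Hypothetical limits, kept tiny so the example is easy to trace by hand.
TRIANGLES_PER_MESHLET = 2
MESHLETS_PER_TASK = 2


@dataclass
class Meshlet:
    vertices: List[Vec3]  # only the vertices this meshlet references
    indices: List[int]    # triangle indices, local to `vertices`


def meshlet_count(index_count: int) -> int:
    """Number of meshlets needed to cover all triangles (rounded up)."""
    triangles = index_count // 3
    return (triangles + TRIANGLES_PER_MESHLET - 1) // TRIANGLES_PER_MESHLET


def task_shader(task_id: int, total_meshlets: int) -> int:
    """Only input is the instance ID; output is how many mesh instances to spawn."""
    first = task_id * MESHLETS_PER_TASK
    return max(0, min(MESHLETS_PER_TASK, total_meshlets - first))


def mesh_shader(task_id: int, local_id: int,
                vertices: List[Vec3], indices: List[int]) -> Meshlet:
    """Use the IDs to pick one piece of the larger mesh and emit it."""
    meshlet_id = task_id * MESHLETS_PER_TASK + local_id
    start = meshlet_id * TRIANGLES_PER_MESHLET * 3
    chunk = indices[start:start + TRIANGLES_PER_MESHLET * 3]

    remap: Dict[int, int] = {}  # global vertex index -> local vertex index
    out = Meshlet(vertices=[], indices=[])
    for global_index in chunk:
        if global_index not in remap:
            remap[global_index] = len(out.vertices)
            out.vertices.append(vertices[global_index])
        out.indices.append(remap[global_index])
    return out


def dispatch(task_count: int, vertices: List[Vec3],
             indices: List[int]) -> List[Meshlet]:
    """One dispatch: the result is rebuilt from scratch on every call."""
    total = meshlet_count(len(indices))
    rasterizer_input: List[Meshlet] = []
    for task_id in range(task_count):
        for local_id in range(task_shader(task_id, total)):
            rasterizer_input.append(
                mesh_shader(task_id, local_id, vertices, indices))
    return rasterizer_input
```

Calling `dispatch(2, vertices, indices)` on a five-triangle strip would spawn two task instances, the first launching two mesh instances and the second launching one. Nothing is kept between calls to `dispatch`, which matches my reading that the outputs only live for the duration of one dispatch.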

anon44736628 · November 7, 2018, 4:39pm

Hi, thanks for this! Any chance we could get our hands on some code? Maybe you could put the code for this asteroids sample on GitHub? Many thanks!
