Introduction to Turing Mesh Shaders

Originally published at: https://developer.nvidia.com/blog/introduction-turing-mesh-shaders/

The Turing architecture introduces a new programmable geometric shading pipeline through the use of mesh shaders. The new shaders bring the compute programming model to the graphics pipeline as threads are used cooperatively to generate compact meshes (meshlets) directly on the chip for consumption by the rasterizer. Applications and games dealing with high geometric complexity benefit…

"Optimizing the vertex locations along is also beneficial,"
I assume the word "along" is meant to be "alone".

"Vertex cache optimizers that help classic rendering also help improving meshlet packing efficiency."
I assume this should be either:
"Vertex cache optimizers that help classic rendering also help, improving meshlet packing efficiency."
or
"Vertex cache optimizers that help classic rendering also help improve meshlet packing efficiency."

Overall an interesting and informative read, thanks.

Good read!
To make sure I understand how work can be done once and re-used, is it just that you'd have a run-once mesh shader that does work and writes the meshlet data to a buffer, then on subsequent frames, you'd get the processed data from that buffer instead of doing the work again?

I don't think it maintains the memory from frame to frame. It sounds like the memory is maintained for each of the task and mesh threads per dispatch. What this means is that when you dispatch, a bunch of task shader instances are spawned, and the only input to each one is its id. The task shader then outputs how many mesh shader instances to spawn. Each of these mesh shader instances has a single input: an id. My guess is that these ids are used to figure out which "piece" or "meshlet" or "submesh" of the larger mesh needs to be processed. Once a mesh shader instance is done, it writes out the indices and vertices, which are then passed to the rasterizer, which in turn launches the pixel shader. After this, I believe the memory is reused by the next dispatch call. I'm looking forward to a sample application that will hopefully explain this better.
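
To make the flow described above a bit more concrete, here is a rough CPU-side sketch in Python. It is not shader code and not NVIDIA's API; it only mimics my reading of the article. Every name in it (`Meshlet`, `task_shader`, `mesh_shader`, `draw`, `MESHLETS_PER_TASK`) is made up for illustration, and the `visible` flag is just a stand-in for whatever culling test a real task shader would run.

```python
from dataclasses import dataclass

# Hypothetical: how many meshlets one task-shader instance looks at.
MESHLETS_PER_TASK = 4


@dataclass
class Meshlet:
    vertices: list        # vertex positions local to this meshlet
    indices: list         # triangle indices into `vertices`
    visible: bool = True  # stand-in for a real culling test


def task_shader(task_id, meshlets):
    """Gets only its id; returns the ids of the meshlets worth drawing."""
    first = task_id * MESHLETS_PER_TASK
    last = min(first + MESHLETS_PER_TASK, len(meshlets))
    return [m for m in range(first, last) if meshlets[m].visible]


def mesh_shader(meshlet_id, meshlets):
    """Gets only an id; uses it to fetch one meshlet and emit its geometry."""
    m = meshlets[meshlet_id]
    return m.vertices, m.indices


def draw(meshlets):
    """One simulated dispatch. The output list is rebuilt on every call,
    mirroring the guess that nothing persists from one dispatch to the next."""
    task_count = (len(meshlets) + MESHLETS_PER_TASK - 1) // MESHLETS_PER_TASK
    rasterizer_input = []
    for task_id in range(task_count):
        for meshlet_id in task_shader(task_id, meshlets):
            rasterizer_input.append(mesh_shader(meshlet_id, meshlets))
    return rasterizer_input


if __name__ == "__main__":
    tri = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
    meshlets = [Meshlet(tri, [0, 1, 2], visible=(i % 2 == 0)) for i in range(6)]
    print(len(draw(meshlets)))  # 3 of the 6 meshlets survive the "culling"
```

If this picture is right, keeping results across frames would mean writing them to a buffer yourself, since `draw` starts from scratch each time it is called.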

Hi, thanks for this! Any chance we could get our hands on some code? Maybe you could put the code for this asteroids sample on GitHub? Many thanks!