Optimal vertex and index layout order on modern GPU's?

BrianSharpe · November 2, 2012, 5:12am

( NOTE: I initially posted this on the OpenGL forum, but later thought it would fit better here. Sorry! )

Hi there.

I’m coding for modern GPUs ( OpenGL 4.x etc. ).

The mesh data I’m sending to the graphics card is pretty much the direct output from Maya, so I’m guessing it has poor vertex-cache ordering.
I’m hoping to use a vertex-cache-optimization pre-pass on the meshes to gain some performance, as suggested here:
http://home.comcast.net/~tom_forsyth/papers/fast_vert_cache_opt.html

The meshes have 500,000+ triangles in them. ( Rendered as triangles via an index buffer. )
We also do a number of shadow passes, which puts more demand on vertex throughput.

The thing is…
I have tried TomF’s algorithm (as described in the link) and it actually made things slower! :( And I definitely performed all the steps, i.e.:

1. Index buffer re-ordering
2. Rebuilding the vertex buffers using the new index ordering to achieve near-linear access
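
For reference, the vertex buffer rebuild step is commonly done by renumbering vertices in the order the (already re-ordered) index buffer first touches them. The sketch below is only an illustration of that idea, not the code from either implementation used in this thread; the function name and the plain-list representation of the buffers are assumptions.

```python
def remap_vertices_by_first_use(indices, vertices):
    """Reorder vertices into the order the index buffer first references them.

    Hypothetical helper: `indices` is a flat list of vertex indices
    (three per triangle) and `vertices` is a list of per-vertex records.
    Returns (new_indices, new_vertices). Vertices that no index references
    are dropped.
    """
    remap = {}          # old vertex index -> new vertex index
    new_vertices = []
    new_indices = []
    for old in indices:
        new = remap.get(old)
        if new is None:
            # First time this vertex is seen: give it the next free slot.
            new = len(new_vertices)
            remap[old] = new
            new_vertices.append(vertices[old])
        new_indices.append(new)
    return new_indices, new_vertices


if __name__ == "__main__":
    idx = [2, 0, 3, 3, 0, 1]
    verts = ["a", "b", "c", "d"]
    print(remap_vertices_by_first_use(idx, verts))
```

After this pass the triangle order is unchanged, but the index values increase roughly monotonically, so vertex fetches walk the vertex buffer close to linearly.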

Any idea why this would be? I’m guessing the assumptions that Tom made back in 2006 do not hold for modern GPUs?

NOTE:
I used both of these implementations

Both gave the same slowdown ( from 31 fps to 29 fps ),
so I’m guessing both are consistent ( and therefore hopefully correct ) implementations.
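
One way to check an implementation independently of frame rate is to measure the average cache miss ratio (ACMR, vertex cache misses per triangle) of the index buffer before and after optimization. The sketch below assumes a simple FIFO cache of 32 entries; both the FIFO model and the size are assumptions for illustration, and real hardware may behave differently, which could itself be why a good ACMR does not translate into a speedup.

```python
from collections import deque


def acmr(indices, cache_size=32):
    """Average cache miss ratio for a flat triangle index list.

    Simulates a FIFO post-transform vertex cache with `cache_size` entries
    (an assumed model, not a description of any particular GPU).
    Returns misses per triangle; 0.0 for an empty index list.
    """
    cache = deque(maxlen=cache_size)  # oldest entry is evicted automatically
    misses = 0
    for idx in indices:
        if idx not in cache:
            misses += 1
            cache.append(idx)
    triangle_count = len(indices) // 3
    return misses / triangle_count if triangle_count else 0.0


if __name__ == "__main__":
    print(acmr([0, 1, 2, 2, 1, 3]))
```

If the optimized index buffer does not show a clearly lower ACMR than the original under this model, the re-ordering step is suspect; if it does and the frame rate still drops, the bottleneck is probably somewhere else.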

My Question:
How should I approach this problem given modern-day architectures?
How should I be ordering the data to achieve the best performance on the card?
Is it worth doing anything at all?

Thanks a lot! :)
Brian

BrianSharpe · November 5, 2012, 5:24am

NOTE:
We’ve now sorted this. Information here
=> Optimal vertex and index layout order on modern GPU's? - OpenGL - NVIDIA Developer Forums
