I was directed to the Developer Zone by a person at another forum, but I’m not sure this is the most appropriate board. Let me know if this would be better posted e.g. on the OpenGL board. The problem doesn’t seem to be rooted in OpenGL per se, and maybe not even in Linux per se, but is associated with both and has only started happening on a recent family of cards.
NVIDIA driver 310.44
We have some components that render simple gridded terrain data in OpenGL using VBOs. The terrain is broken up into tiles that are either 201x201 points or 601x601 points per tile, so not tiny but not exactly massive either.
We do what you might expect with this data - preprocess it into VBO-friendly form that can be loaded from disk and uploaded to OpenGL with minimal interim processing.
The current format is three floats for geometry plus three bytes for lighting normals per vertex. This results in a possibly awkward 15-byte stride. Per-vertex data are unique, and shared vertices are handled through indexes.
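To make the stride arithmetic concrete, here is a minimal sketch in Python rather than the actual loader. The signed-byte normals, the padded 16-byte variant, and all names are assumptions added for comparison only. Whether a 4-byte-aligned stride makes any difference on these cards is untested.

```python
import struct

# Layout described above: 3 float32 position + 3 byte normal per vertex.
# "<" disables native alignment so the size is exactly what is packed.
# Signed bytes ("b") are an assumption; unsigned would give the same size.
PACKED = struct.Struct("<3f3b")    # 15-byte stride
# Hypothetical alternative: one pad byte for a 4-byte-aligned stride.
PADDED = struct.Struct("<3f3bx")   # 16-byte stride

def tile_bytes(points_per_side, stride):
    """Vertex buffer size in bytes for a square tile."""
    return points_per_side * points_per_side * stride

for side in (201, 601):
    print(side, tile_bytes(side, PACKED.size), tile_bytes(side, PADDED.size))
# 201 606015 646416
# 601 5418015 5779216
```

So a 601x601 tile is roughly 5.4 MB of vertex data at the current stride, and padding to 16 bytes would add about 0.36 MB per tile.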
The tiles are then rendered as quad strips using glMultiDrawElements with different viewing parameters and shading/coloring schemes, sometimes with a shader in the loop, sometimes not.
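For anyone trying to reproduce this, the index layout can be sketched roughly as below. It is a Python illustration, not the real code: the row-major vertex numbering, the winding order, and the function name are assumptions. The `counts` and `first` lists correspond to the per-strip count and offset arrays that glMultiDrawElements expects, except that the real call takes byte offsets (`first` multiplied by the index size).

```python
def quad_strip_indices(n):
    """Return (indices, counts, first) for an n x n grid drawn as n-1 quad strips.

    Assumes row-major vertices: vertex (row, col) has index row * n + col.
    """
    indices, counts, first = [], [], []
    for row in range(n - 1):
        first.append(len(indices))               # element offset where this strip starts
        for col in range(n):
            indices.append(row * n + col)        # vertex on this row
            indices.append((row + 1) * n + col)  # vertex on the next row
        counts.append(2 * n)                     # two indices per column
    return indices, counts, first

indices, counts, first = quad_strip_indices(3)
print(indices)  # [0, 3, 1, 4, 2, 5, 3, 6, 4, 7, 5, 8]
print(counts)   # [6, 6]
print(first)    # [0, 6]
```

With this layout a 601x601 tile comes to 600 strips of 1202 indices each, or 721,200 indices. Its 361,201 vertices also exceed the range of 16-bit indices, whereas the 40,401 vertices of a 201x201 tile fit.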
We’ve never bothered trying to break any of the tile data up into smaller VBOs, and have always uploaded the full 201x201 or 601x601 set of vertex data per tile without resorting to any kind of compression scheme and/or reduced precision data types for the geometry data.
One of the terrain views is a “synthetic terrain” grid presentation, which is a wire mesh with fill. The other is a top-down map with a shader in the loop to provide some interesting color banding. As you might expect, the wire mesh currently is rendered in two passes, one in polygon outline mode then a second pass in fill mode. The top-down map is pure fill mode and is the only view that uses the color banding shader.
Both views have worked very well to date on 9800GT, 560M, and 580 cards.
We recently discovered that on 600-series cards - specifically the 640, 670, and 680 - drawing performance in outline mode for the wire mesh degrades significantly. I don’t have specific numbers in hand at the moment, but I’d estimate that the apparent on-screen update rate drops from 30 Hz to 10 Hz or even less.
While this is going on, the user experience in X also really starts to drag. Mouse movement is no longer smooth, for example, and the pointer jumps across the screen in spurts as you drag.
I’m able to pin this specifically on the wire mesh, which currently is being rendered as quad strips in polygon outline mode. If I turn off only the wire mesh, throughput cranks back up to what it has been historically. In fact, when the wire mesh is not in the loop I can upload hundreds of the 601x601 tiles for filled poly processing without so much as a hiccup, shader and all. Conversely, the wire mesh by itself, without the fill processing and without the top-down map, is enough to cause the performance degradation.
Granted, I could render the wire mesh differently, say as a bunch of line strips, and maybe that would help (or maybe it wouldn’t). But at this point I’m wondering why moving to a newer card like the 680 seems to be a major leap backwards in line drawing throughput.
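A rough Python sketch of what the line-strip alternative would look like in terms of indices, again assuming row-major vertex numbering and using made-up function names. The segment comparison assumes that polygon outline mode draws every quad's four edges independently, so interior edges are drawn twice. That is a guess about the cost model, not something measured on these cards.

```python
def line_strip_indices(n):
    """Return one index list per line strip: n row strips, then n column strips."""
    rows = [[r * n + c for c in range(n)] for r in range(n)]
    cols = [[r * n + c for r in range(n)] for c in range(n)]
    return rows + cols

def segment_counts(n):
    """Line segments per tile: unique grid edges vs. edges summed over quads."""
    unique_edges = 2 * n * (n - 1)      # what the 2n line strips would draw
    per_quad_edges = 4 * (n - 1) ** 2   # every quad outlined on its own
    return unique_edges, per_quad_edges

print(line_strip_indices(2))  # [[0, 1], [2, 3], [0, 2], [1, 3]]
print(segment_counts(601))    # (721200, 1440000)
```

Under that assumption, line strips would draw about half as many segments per 601x601 tile. That might help, but by itself it would not explain a regression relative to the older cards.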
I’ve examined this code pretty closely and not found any errors. That, plus the fact that it has worked so well historically, makes me less suspicious of the OpenGL calls. It seems more likely that something about the way the data are arranged is aggravating something on the NVIDIA side.
Or… maybe these cards just aren’t so good with rendering lines. Seems unlikely though.
Has anybody else observed this problem?
nvidia-bug-report.log.gz (63.4 KB)