GeForce GPUs in the throughput table

In Table 3 of the programming guide

https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#maximize-instruction-throughput

there are some footnotes of the form “xxx for GeForce GPUs”. What does “GeForce GPUs” mean here? Does it cover nearly all of the desktop and mobile GPUs in the list of NVIDIA GPUs?

https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units

A common-sense interpretation would be that this refers to any GPU whose name contains “GeForce”, NVIDIA’s brand for consumer-level GPUs.

As you can see from the numeric data, in the footnoted cases consumer-level GPUs of a particular architecture have been configured for lower throughput than other GPUs of the same architecture, presumably driven by a desire for market differentiation relative to the Quadro and Tesla brands.

If so, then according to the table, the fp64 throughput on cc 3.5 GeForce devices is only 8 results per clock cycle per multiprocessor. However, for TITAN, the fp64 throughput is 1/3 of the fp32 throughput, which would be 64. Is there an error in the table, or am I misunderstanding something?
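To make the discrepancy concrete, here is a small sketch of the arithmetic in Python. The fp32 figure of 192 results per clock per multiprocessor is an assumption: it is simply what the "1/3 of fp32, which should be 64" statement implies. The constant and function names are made up for illustration.

```python
from fractions import Fraction

# Assumption: fp32 rate implied by "1/3 of fp32 ... should be 64" (64 * 3).
FP32_PER_SM_PER_CLOCK = 192
# fp64 rate given by the table footnote for cc 3.5 GeForce GPUs.
FP64_FOOTNOTE_GEFORCE = 8
# fp64 rate expected for TITAN according to the question.
FP64_TITAN_EXPECTED = 64


def fp64_to_fp32_ratio(fp64_rate, fp32_rate=FP32_PER_SM_PER_CLOCK):
    """Return the fp64:fp32 throughput ratio as an exact fraction."""
    return Fraction(fp64_rate, fp32_rate)


geforce_ratio = fp64_to_fp32_ratio(FP64_FOOTNOTE_GEFORCE)
titan_ratio = fp64_to_fp32_ratio(FP64_TITAN_EXPECTED)

print(geforce_ratio)                # 1/24
print(titan_ratio)                  # 1/3
print(titan_ratio / geforce_ratio)  # 8
```

In other words, the footnoted value and the expected TITAN value differ by a factor of 8, which is far too large to be a rounding issue in the table.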

I suspect these tables are not designed to stay in sync with every last detail of NVIDIA’s market differentiation schemes. Titan cards are sometimes referred to as “prosumer” products, occupying a middle ground between the more gaming-oriented consumer cards and the professional cards.

If 100% accuracy is important to you, I would suggest measuring the metrics relevant to your use case. If you like, you can also file a bug against the CUDA documentation using the regular bug reporting mechanism (see the pinned message at the top of this forum), requesting clarification.
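If you do measure, the timing has to be converted into the table's unit (results per clock cycle per multiprocessor) before it can be compared. A minimal sketch of that conversion in Python follows. The function name and all input values are hypothetical placeholders, not measurements; the instruction count, elapsed time, clock rate, and multiprocessor count would have to come from your own kernel and device.

```python
def results_per_clock_per_sm(instructions, elapsed_s, clock_hz, sm_count):
    """Convert a timed run into results per clock cycle per multiprocessor.

    instructions: result-producing instructions executed across all threads
    elapsed_s:    measured kernel time in seconds
    clock_hz:     SM clock rate during the run, in Hz
    sm_count:     number of multiprocessors on the device
    """
    if min(instructions, elapsed_s, clock_hz, sm_count) <= 0:
        raise ValueError("all inputs must be positive")
    total_cycles = elapsed_s * clock_hz
    return instructions / (total_cycles * sm_count)


# Placeholder inputs chosen only to illustrate the conversion.
rate = results_per_clock_per_sm(
    instructions=8_960_000_000,
    elapsed_s=0.01,
    clock_hz=1.0e9,
    sm_count=14,
)
print(f"{rate:.1f} results per clock per SM")
```

The result is only meaningful if the kernel is actually limited by fp64 issue rate, for example long chains of independent double-precision operations with enough threads to hide latency, and if the clock rate used matches what the GPU really ran at during the measurement.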

Thank you for your clarification.