I am unclear about how the latency of Texture memory accesses compares with that of Constant memory accesses in CUDA. I am only interested in accesses that hit the cache.
Section §126.96.36.199 of the CUDA Programming Guide, about Texture Memory, says:
The following document about the GTX280, http://www.networkmultimedia.org/Publications/practicals/beyer2009.pdf (chart at the top of PDF page 25), reports a very small latency (~register latency) for Constant memory accesses that hit the cache, and a latency of 100 cycles for accesses that hit the Texture cache.
So is it really much faster to access the Constant memory cache (~register latency) than the Texture memory cache (which sounds like 100 cycles)? Does the same hold for G80 and Fermi boards?
Any help would be appreciated.