I have been reading about CUDA, and especially about the GT200, but it is very difficult to find documentation on specific details.
I read a paper from the University of Toronto (http://www.eecg.toronto.edu/~moshovos/CUDA08/arx/microbenchmark_report.pdf) that claims the GT200 has 2 KB, 8 KB, and 32 KB caches. My first question is whether the NVIDIA documentation supports this and, if so, where I can find it.
As far as I know, the GT200 only has an 8 KB constant cache per multiprocessor, plus a 64 KB region in device memory that the host fills with constant data.
Either way, I want to know the latency in cycles of the 64 KB constant memory and of the 8 KB constant cache (or of the 2 KB, 8 KB, and 32 KB caches, if the paper is right).
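In case it helps to make the question concrete, this is roughly how I would try to measure it myself: a minimal pointer-chasing sketch, not a verified benchmark. The names `chain` and `chase` and the values of `N`, `STRIDE`, and `ITERS` are my own placeholders, and the reported figure includes loop overhead, so it is only an upper bound on the real latency. The idea is that changing `N` (the array footprint) and `STRIDE` should show a jump in cycles per access whenever the footprint outgrows a cache level.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N      2048   // ints in the chain: 2048 * 4 bytes = 8 KB (placeholder, vary this)
#define STRIDE 16     // stride in elements: 16 * 4 bytes = 64 bytes (placeholder)
#define ITERS  4096   // number of timed dependent loads (placeholder)

// The chain lives in the 64 KB constant space (N must stay <= 16384 ints).
__constant__ int chain[N];

__global__ void chase(unsigned int *cycles, int *sink)
{
    int idx = 0;

    // Warm-up: walk the whole chain once so cached lines are resident.
    for (int i = 0; i < N / STRIDE; ++i)
        idx = chain[idx];

    // Timed walk: each load depends on the previous one, so loads cannot overlap.
    clock_t start = clock();
    for (int i = 0; i < ITERS; ++i)
        idx = chain[idx];
    clock_t stop = clock();

    *cycles = (unsigned int)(stop - start);
    *sink = idx;  // keep the compiler from removing the loads
}

int main()
{
    // Each element holds the index of the next element to visit.
    static int host[N];
    for (int i = 0; i < N; ++i)
        host[i] = (i + STRIDE) % N;
    cudaMemcpyToSymbol(chain, host, sizeof(host));

    unsigned int *d_cycles;
    int *d_sink;
    cudaMalloc(&d_cycles, sizeof(unsigned int));
    cudaMalloc(&d_sink, sizeof(int));

    // One thread only, so nothing else competes for the constant cache.
    chase<<<1, 1>>>(d_cycles, d_sink);

    unsigned int cycles = 0;
    cudaMemcpy(&cycles, d_cycles, sizeof(cycles), cudaMemcpyDeviceToHost);
    printf("approx. cycles per access (including loop overhead): %.1f\n",
           (double)cycles / ITERS);

    cudaFree(d_cycles);
    cudaFree(d_sink);
    return 0;
}
```

If this approach is sound, I would expect a plot of cycles per access against `N` to show one plateau per cache level, but I do not know whether that matches what NVIDIA documents.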
I hope you can help me.