Cost of accessing built-in variables

Hello everyone!

I am sorry if this is documented somewhere, but I did not find it:

Accessing global memory costs between 400 and 600 cycles, accessing shared memory costs 4 cycles, and registers add no overhead to the computation cost. Where do the built-in variables such as threadIdx and blockDim reside, and what does accessing them cost?

Thanks!

Volker

It looks like this is discussed in a few other threads.

http://forums.nvidia.com/index.php?showtop…mp;#entry314099

http://forums.nvidia.com/index.php?showtopic=79049

http://forums.nvidia.com/index.php?showtopic=58249

The built-in variables are read from shared memory.
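If the built-ins really are read from shared memory on your hardware, one way to keep the access cost out of an inner loop is to copy them into local variables once, so the loop body only touches registers. Here is a minimal sketch of that idea, not a measured result: the kernel name fill_global_index, the host helper count_mismatches, and the launch configuration are all made up for illustration, and the compiler may well do this hoisting on its own.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: reads each built-in variable once into a local
// variable, then reuses only the locals inside the loop.
__global__ void fill_global_index(int *out, int n)
{
    const int tid    = threadIdx.x;             // thread index within the block
    const int stride = blockDim.x * gridDim.x;  // total number of threads in the grid

    // Grid-stride loop: each iteration uses only tid-derived locals.
    for (int i = blockIdx.x * blockDim.x + tid; i < n; i += stride) {
        out[i] = i;
    }
}

// Hypothetical host helper: launches the kernel and returns the number of
// wrong elements, or -1 if a CUDA call fails.
int count_mismatches(int n)
{
    assert(n > 0);

    int *d_out = nullptr;
    if (cudaMalloc(&d_out, n * sizeof(int)) != cudaSuccess) {
        return -1;
    }

    const int threads = 256;  // assumed block size
    const int blocks  = 32;   // assumed grid size; the loop covers any n
    fill_global_index<<<blocks, threads>>>(d_out, n);

    // The copy waits for the kernel and also reports a failed launch.
    std::vector<int> h_out(n, -1);
    cudaError_t err = cudaMemcpy(h_out.data(), d_out, n * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) {
        return -1;
    }

    int bad = 0;
    for (int i = 0; i < n; ++i) {
        if (h_out[i] != i) {
            ++bad;
        }
    }
    return bad;
}
```

Calling count_mismatches(1000) from main should return 0 on a working device. To see whether the local copies make any difference on a given GPU, comparing the generated PTX or timing both variants is more reliable than going by the cycle counts quoted above.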

Thank you!