Greetings!
I’m trying to find out how many registers per thread one of my kernels uses. Different methods give different results: ptxas reports 5, but the CUDA Visual Profiler reports 13.
I’m using the latest CUDA software, Ubuntu 10.10 and a GTX 460 card.
Could anybody explain this?