Printing the number of registers used or occupancy

I am trying to print the number of registers I am using in a certain program. I also would like to have it print the occupancy rate. If anyone can help me with either of those problems, I would greatly appreciate it.

I read the CUDA guide and could not find this anywhere, so even another link would be helpful.

Check the sticky post at the top of this forum:

I was looking more for something that I could use at compile time or at runtime. I am running a test from a script and would like the program to output the number of registers it uses on each run.

Oh. I’m not aware of anything that does everything you want. You’ll have to roll your own custom scripts in the build system.

FindCUDA.cmake for the CMake build system has code that compiles the cubin and parses the number of registers from it to print to the screen. With some modification, you could have it write the values to a header file as #defines and then include that header in your code, which could implement the occupancy calculator formulas to print occupancy at runtime.
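For what it's worth, here is a rough sketch of what the runtime side of that could look like in plain host-side C++. Everything named here is an assumption for illustration: KERNEL_REGS_PER_THREAD and KERNEL_SMEM_PER_BLOCK stand in for whatever the build script would write into the generated header, and the SmLimits values are example per-multiprocessor limits that have to be replaced with the numbers for the actual device. The formula is a simplified version of the occupancy calculation that ignores register and shared memory allocation granularity, so treat the result as an estimate.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdio>

// Hypothetical macros: in a real build these would come from a header
// generated by the build script after it parses the cubin.
#ifndef KERNEL_REGS_PER_THREAD
#define KERNEL_REGS_PER_THREAD 16
#endif
#ifndef KERNEL_SMEM_PER_BLOCK
#define KERNEL_SMEM_PER_BLOCK 2048
#endif

// Per-multiprocessor hardware limits. The values are device specific
// and must be supplied by the caller.
struct SmLimits {
    int regsPerSm;        // registers available per multiprocessor
    int smemPerSm;        // shared memory per multiprocessor, in bytes
    int maxThreadsPerSm;  // maximum resident threads per multiprocessor
    int maxBlocksPerSm;   // maximum resident blocks per multiprocessor
    int warpSize;         // threads per warp
};

// Simplified occupancy estimate: resident warps divided by the maximum
// number of warps. Allocation granularity is deliberately ignored.
double estimateOccupancy(const SmLimits& sm, int regsPerThread,
                         int smemPerBlock, int threadsPerBlock) {
    assert(sm.warpSize > 0 && sm.maxThreadsPerSm >= sm.warpSize);
    if (threadsPerBlock <= 0 || threadsPerBlock > sm.maxThreadsPerSm) {
        return 0.0;  // the block cannot be resident at all
    }
    int warpsPerBlock = (threadsPerBlock + sm.warpSize - 1) / sm.warpSize;
    int maxWarpsPerSm = sm.maxThreadsPerSm / sm.warpSize;

    // How many blocks fit under each resource limit.
    int blocksByWarps = maxWarpsPerSm / warpsPerBlock;
    int regsPerBlock = regsPerThread * warpsPerBlock * sm.warpSize;
    int blocksByRegs = regsPerBlock > 0 ? sm.regsPerSm / regsPerBlock
                                        : sm.maxBlocksPerSm;
    int blocksBySmem = smemPerBlock > 0 ? sm.smemPerSm / smemPerBlock
                                        : sm.maxBlocksPerSm;

    int blocks = std::min({sm.maxBlocksPerSm, blocksByWarps,
                           blocksByRegs, blocksBySmem});
    return static_cast<double>(blocks * warpsPerBlock) / maxWarpsPerSm;
}

// Prints the register count and the estimated occupancy for one launch
// configuration, using the values from the (hypothetical) generated header.
void reportOccupancy(const SmLimits& sm, int threadsPerBlock) {
    double occ = estimateOccupancy(sm, KERNEL_REGS_PER_THREAD,
                                   KERNEL_SMEM_PER_BLOCK, threadsPerBlock);
    std::printf("registers per thread: %d\n", KERNEL_REGS_PER_THREAD);
    std::printf("estimated occupancy: %.0f%%\n", occ * 100.0);
}
```

You would call something like reportOccupancy(limits, 256) from your program right before or after the kernel launch. With the example limits {8192, 16384, 768, 8, 32} and the default macro values above, registers are the limiting resource: two blocks of 256 threads fit, giving 16 of 24 warps.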

The same is possible in any build system with enough scripting.

That said, this method still isn’t 100% foolproof, as the actual register usage on the device sometimes differs by ±1 from the value printed in the cubin.