Compiler option --ptxas-options=-v gives wrong register count?

Duff · July 14, 2010, 9:38pm

Hello, I was trying to compare the resource allocation of two kernels using the --ptxas-options=-v compiler option and got some strange results. Kernel 1 is much larger than kernel 2, yet it is reported as using only 1 more register. I was suspicious of this, so I dumped out the PTX file, and things look very different there.

--ptxas-options=-v -arch=sm_20 -O2: kernel 1 = 21 registers, 136 bytes cmem; kernel 2 = 20 registers, 128 bytes cmem

PTX file register declarations:

kernel 1:
.reg .u32 %r<49>;
.reg .u64 %rd<56>;
.reg .f32 %f<81>;
.reg .f64 %fd<94>;
.reg .pred %p<8>;

kernel 2:
.reg .u32 %r<38>;
.reg .u64 %rd<42>;
.reg .f32 %f<62>;
.reg .f64 %fd<13>;
.reg .pred %p<5>;

I am no expert on this, but looking at the PTX white paper I assume this means kernel 1 declares a total of 288 ‘virtual’ registers and kernel 2 declares 160? Do the 64-bit values actually use 2 registers each, since CUDA hardware only has 32-bit registers? Is ptxas already accounting for register spills? Any discussion on this topic would be most helpful. This is on a GTX 480 card.
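As a quick check of the totals quoted above, here is a minimal Python sketch that sums the parameterized .reg declarations of both kernels. The function name count_virtual_registers and the regular expression are illustrative assumptions, not part of any NVIDIA tool. It only counts the virtual registers declared in the PTX, which says nothing about the physical registers assigned later.

```python
import re

# Matches parameterized PTX declarations such as ".reg .u64 %rd<56>;".
# "%rd<56>" declares the 56 virtual registers %rd0 .. %rd55.
REG_DECL = re.compile(r"\.reg\s+\.(\w+)\s+%\w+<(\d+)>\s*;")


def count_virtual_registers(ptx_text):
    """Return {register type: declared count} for a PTX snippet.

    Hypothetical helper for illustration only.
    """
    counts = {}
    for reg_type, n in REG_DECL.findall(ptx_text):
        counts[reg_type] = counts.get(reg_type, 0) + int(n)
    return counts


# Declarations copied from the post above.
KERNEL_1 = """
.reg .u32 %r<49>;
.reg .u64 %rd<56>;
.reg .f32 %f<81>;
.reg .f64 %fd<94>;
.reg .pred %p<8>;
"""

KERNEL_2 = """
.reg .u32 %r<38>;
.reg .u64 %rd<42>;
.reg .f32 %f<62>;
.reg .f64 %fd<13>;
.reg .pred %p<5>;
"""

for name, ptx in (("kernel 1", KERNEL_1), ("kernel 2", KERNEL_2)):
    counts = count_virtual_registers(ptx)
    print(name, counts, "total:", sum(counts.values()))
```

The totals come out to 288 and 160, matching the figures in the question. Predicate registers are included in those sums even though they are a separate register class.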

tera · July 14, 2010, 9:45pm

Despite what the name suggests, ptxas is a full compiler that does its own register allocation and several optimizations on the PTX when compiling it into .cubin files. The register count given by ptxas is for the .cubin files, i.e. after optimization and register allocation, so it is not expected to match the virtual register declarations in the PTX.

To check those register numbers, you would have to run the .cubin file through decuda.
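For comparing several kernels, the per-kernel figures can also be pulled out of the -v log mechanically. The following is a rough Python sketch under some assumptions: the log lines follow the usual "ptxas info : Compiling entry function ..." / "Used N registers" pattern, the sample log text and the mangled kernel names in it are made up for illustration (only the register and cmem numbers come from this thread), and registers_per_kernel is a hypothetical helper name.

```python
import re

# Patterns for the two kinds of lines of interest in a "ptxas -v" log.
ENTRY = re.compile(r"Compiling entry function '([^']+)'")
USED = re.compile(r"Used (\d+) registers")


def registers_per_kernel(log_text):
    """Map each entry function name to its reported register count.

    Hypothetical helper; assumes each "Used N registers" line follows
    the "Compiling entry function" line of the kernel it describes.
    """
    result = {}
    current = None
    for line in log_text.splitlines():
        entry = ENTRY.search(line)
        if entry:
            current = entry.group(1)
            continue
        used = USED.search(line)
        if used and current is not None:
            result[current] = int(used.group(1))
    return result


# Illustrative log: the kernel names are invented, the numbers are the
# ones reported in the question.
SAMPLE_LOG = """\
ptxas info    : Compiling entry function '_Z7kernel1Pd' for 'sm_20'
ptxas info    : Used 21 registers, 136 bytes cmem[0]
ptxas info    : Compiling entry function '_Z7kernel2Pd' for 'sm_20'
ptxas info    : Used 20 registers, 128 bytes cmem[0]
"""

for kernel, regs in registers_per_kernel(SAMPLE_LOG).items():
    print(kernel, regs)
```

In practice the log text would come from the compiler's stderr output rather than a string literal. The exact log layout can differ between toolkit versions, so the patterns may need adjusting.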

Duff · July 15, 2010, 1:53pm

Thank you, that makes much more sense.
