PTX question

Hello!

I have a problem with a kernel: it uses too many registers. When I look in the PTX file, I find the following:

mov.f32  %f1, 0f00000000;      // 0
mov.f32  %f2, %f1;             //
mov.f32  %f3, 0f00000000;      // 0
mov.f32  %f4, %f3;             //
mov.f32  %f5, 0f00000000;      // 0
...
mov.f32  %f59, 0f00000000;     // 0
mov.f32  %f60, %f59;           //
mov.f32  %f61, 0f00000000;     // 0
mov.f32  %f62, %f61;           //
mov.f32  %f63, 0f00000000;     // 0
mov.f32  %f64, %f63;           //

The actual C code looks like this:

float c1[16] = {0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f};
float c2[16] = {0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f};

Instead of using 32 registers for the two arrays, the compiler uses 64. Interestingly, the even-numbered registers are never used as a source, but whenever an odd-numbered one is set, a copy of it is stored in the following even-numbered one.

Does anybody know what happened there?

Thanks!

Moritz

You’re looking at PTX. It’s not real assembly code; it will be re-optimized when it is compiled into a cubin. You might want to continue your investigation using decuda.
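Something like this, for example (the file name is a placeholder, and the exact decuda invocation depends on the version you have, since it is a Python script you may need to run through python explicitly):

nvcc -cubin kernel.cu       # run the whole toolchain, ptxas included, and keep the cubin
decuda kernel.cubin         # disassemble the machine code that actually ended up in the cubin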

Thank you for the tip.

Seems like PTX doesn't have much to do with the actual assembly.

I think PTX has got a lot to do with assembly… but it gets further optimized by the CUDA run-time system… So I don't think even the CUBIN holds the actual binary output.

It all depends on the CUDA run-time system…

I might be wrong though… just my inference from the nvcc -help description (the -code and -arch options).
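For what it's worth, this is the split those two options describe, as far as I understand it (the command line below is just an illustration, not from the original post):

nvcc -arch=compute_11 -code=compute_11,sm_11 kernel.cu
# -arch names the virtual (PTX) architecture the code is compiled against
# -code lists what gets embedded in the binary: compute_11 embeds PTX for
# the run-time system to JIT-compile later, sm_11 embeds ready-made
# machine code for that particular chip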

As far as I know, cubin is the actual machine code.
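Either way, for the original register question you don't need to disassemble anything: asking ptxas to be verbose makes it report what it actually allocated (the output below is from memory, so treat the exact wording and numbers as approximate):

nvcc --ptxas-options=-v kernel.cu
# ptxas then prints a per-kernel summary, something like:
#   ptxas info : Used 12 registers, 32 bytes smem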