nvcc -Xptxas doesn't seem to work?

Hello,

I tried pretty much everything with nvcc and -Xptxas, but it doesn’t seem to work.

I tried specifying:

-Xptxas=v
-Xptxas=-v
-Xptxas v
-Xptxas -v
“-Xptxas v”
etc…

What is verbose supposed to output? Can somebody give an example so I can see whether it’s working or not? (I think I had it working once, but now it’s not; perhaps I called ptxas directly that time.)

I also tried to change the optimization level with -Xptxas o2

But the ptx file still says O3, so apparently it’s not working?

I don’t like the way the -Xptxas works (or doesn’t work in this case).

Is there perhaps a way to “de-chain” the toolchain so that I can do steps manually ?

If so, what kind of file would I have to feed to ptxas, and how would I produce such a file?

Thanks for any help, thoughts or suggestions or experiences,
Bye,
Skybuck

-Xptxas -v works just fine here (Windows and Linux). You can see how the individual components are invoked by passing -v as part of the nvcc invocation. Note that the correct syntax for controlling PTXAS optimization is -Xptxas -O{0|1|2|3}. I would advise against use of component-specific control flags in production builds.
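To make the syntax above concrete, here is a minimal sketch. The file name xptxas_demo.cu and the kernel in it are made up for illustration, and the sketch assumes nvcc is on the PATH. It first lists the toolchain steps and then forwards both -v and an optimization level to ptxas:

```shell
# Hypothetical one-kernel source file, created only for this illustration.
cat > xptxas_demo.cu <<'EOF'
__global__ void scale(float *data, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    data[i] *= factor;
}
EOF

if command -v nvcc >/dev/null 2>&1; then
    # --dryrun lists the command for each toolchain step without running it
    # (plain -v lists the commands and runs them as well).
    nvcc --dryrun -cubin xptxas_demo.cu

    # Forward -v and an optimization level to ptxas. Note the dash in -O2.
    nvcc -cubin -Xptxas -v -Xptxas -O2 xptxas_demo.cu -o xptxas_demo.cubin
else
    echo "nvcc not found on PATH; skipping the compile steps"
fi
```

Each option meant for ptxas gets its own -Xptxas prefix here, and the option keeps its leading dash.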

When I use the command line I see this:

M:>nvcc -ptx --machine 32 "S:\Delphi\Tests\test cuda random memory access performance\version 0.08 experiment some more with it\Release\Win32\CudaMemoryTest.cu" -Xptxas -v
CudaMemoryTest.cu
CudaMemoryTest.cu
tmpxft_00001134_00000000-3_CudaMemoryTest.cudafe1.gpu
tmpxft_00001134_00000000-10_CudaMemoryTest.cudafe2.gpu

That’s it ?!? Is that normal ?!? I was expecting more output.

Maybe the environment paths are overloaded with entries ?!?

Some websites say: “study output carefully with -Xptxas -v” but what do they mean by that?

There is very little output in above example ?!?

Err, if you are compiling to PTX, the assembler (ptxas) is never called, so -Xptxas will have no effect.
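A rough way to see the difference, assuming nvcc is on the PATH (the file name ptx_vs_cubin.cu and its kernel are hypothetical): the first command stops at PTX, the second goes on to a cubin and therefore runs ptxas. With the second command you should get lines prefixed with "ptxas info" that report register and memory usage; the exact numbers depend on the kernel and the target architecture.

```shell
# Hypothetical source file for this comparison.
cat > ptx_vs_cubin.cu <<'EOF'
__global__ void square(float *data)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    data[i] = data[i] * data[i];
}
EOF

if command -v nvcc >/dev/null 2>&1; then
    # Stops after PTX generation: ptxas is never invoked, so the
    # -Xptxas -v option has nothing to act on.
    nvcc -ptx -Xptxas -v ptx_vs_cubin.cu -o ptx_vs_cubin.ptx

    # Continues through ptxas: the verbose resource report appears here.
    nvcc -cubin -Xptxas -v ptx_vs_cubin.cu -o ptx_vs_cubin.cubin
else
    echo "nvcc not found on PATH; skipping the compile steps"
fi
```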

Ok thanks I got a bit confused about that.

There are apparently multiple ways to get things done: either compile *.cu to *.cubin directly via nvcc.exe, or go indirectly via ptxas.exe, which compiles *.ptx to *.cubin and can show register usage.
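The indirect route might look roughly like the sketch below. It assumes nvcc and ptxas are both on the PATH; the file name manual_chain.cu, the kernel, and the sm_80 architecture are placeholders, so substitute the architecture of your own GPU.

```shell
# Assumed target architecture; replace with the one matching your GPU.
ARCH=sm_80

# Hypothetical source file for this illustration.
cat > manual_chain.cu <<'EOF'
__global__ void add_one(int *data)
{
    data[blockIdx.x * blockDim.x + threadIdx.x] += 1;
}
EOF

if command -v nvcc >/dev/null 2>&1 && command -v ptxas >/dev/null 2>&1; then
    # Step 1: let nvcc stop after generating PTX.
    nvcc -arch="$ARCH" -ptx manual_chain.cu -o manual_chain.ptx

    # Step 2: run ptxas by hand. -v prints the register usage,
    # -O2 sets the optimization level, -o names the output cubin.
    ptxas -arch="$ARCH" -v -O2 manual_chain.ptx -o manual_chain.cubin
else
    echo "nvcc or ptxas not found on PATH; skipping the manual steps"
fi
```

Passing the same architecture to both steps keeps the PTX target and the ptxas target consistent.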

Also, cuobjdump can be used to dump a *.cubin to some kind of internal/GPU/final assembler instructions (?).

Only thing missing for me in cuobjdump is an option to save it to a file, that would be handy ;)

ptxas also needs to be told explicitly to save to a file; otherwise it will just show the register usage if -v is specified ;)
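On saving the cuobjdump listing: it goes to standard output, so ordinary redirection should be enough. A small sketch, assuming nvcc and cuobjdump are on the PATH (the file names and the kernel are made up for illustration):

```shell
# Hypothetical source file for this illustration.
cat > sass_dump_demo.cu <<'EOF'
__global__ void fill(int *out, int value)
{
    out[blockIdx.x * blockDim.x + threadIdx.x] = value;
}
EOF

if command -v nvcc >/dev/null 2>&1 && command -v cuobjdump >/dev/null 2>&1; then
    # Build a cubin to have something to disassemble.
    nvcc -cubin sass_dump_demo.cu -o sass_dump_demo.cubin

    # -sass prints the final GPU instructions; ">" saves them to a file.
    cuobjdump -sass sass_dump_demo.cubin > sass_dump_demo.sass
else
    echo "nvcc or cuobjdump not found on PATH; skipping the dump"
fi
```

The same ">" redirection works in the Windows command prompt.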