I tried pretty much everything with nvcc and -Xptxas, but it doesn't seem to work.
I tried specifying:
-Xptxas=v
-Xptxas=-v
-Xptxas v
-Xptxas -v
"-Xptxas v"
etc.
What is the verbose option supposed to output? Can somebody give an example so I can see whether it's working or not? (I think I had it working once, but now it isn't; perhaps I called ptxas directly that time.)
I also tried to change the optimization level with -Xptxas o2,
but the ptx file still says O3?! So apparently that's not working either?
I don't like the way -Xptxas works (or doesn't work, in this case).
Is there perhaps a way to "de-chain" the toolchain so that I can do the steps manually?
So what kind of file would I have to feed to ptxas, and how would I produce such a file?
Thanks for any help, thoughts or suggestions or experiences,
Bye,
Skybuck
-Xptxas -v works just fine here (Windows and Linux). You can see how the individual components are invoked by passing -v as part of the nvcc invocation. Note that the correct syntax for controlling PTXAS optimization is -Xptxas -O{0|1|2|3}. I would advise against use of component-specific control flags in production builds.
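For reference, a minimal sketch of the invocations described above. The file name kernel.cu is a placeholder, not something from this thread, and the guard only keeps the script harmless on a machine without the CUDA toolkit. The -Xptxas option takes a comma-separated list, so the verbose flag and the optimization level can be passed together.

```shell
#!/bin/sh
# Placeholder source file (an assumption); substitute your own .cu file.
SRC=kernel.cu

# Options forwarded to ptxas: verbose resource report plus optimization level 2.
PTXAS_OPTS="-v,-O2"

if command -v nvcc >/dev/null 2>&1 && [ -f "$SRC" ]; then
    # -c builds an object file, so ptxas is actually run and can report
    # register and memory usage for each kernel.
    nvcc -c "$SRC" -o kernel.o -Xptxas "$PTXAS_OPTS"

    # -v given to nvcc itself (not via -Xptxas) prints every command
    # that nvcc runs for the individual toolchain components.
    nvcc -v -c "$SRC" -o kernel.o
else
    echo "nvcc or $SRC not found; nothing compiled"
fi
```

If the first command prints nothing from ptxas, check whether the build mode actually reaches the ptxas stage.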
M:>nvcc -ptx --machine 32 "S:\Delphi\Tests\test cuda random memory access performance\version 0.08 experiment some more with it\Release\Win32\CudaMemoryTest.cu" -Xptxas -v
CudaMemoryTest.cu
CudaMemoryTest.cu
tmpxft_00001134_00000000-3_CudaMemoryTest.cudafe1.gpu
tmpxft_00001134_00000000-10_CudaMemoryTest.cudafe2.gpu
That's it?! Is that normal? I was expecting more output.
Maybe the environment paths are overloaded with entries?
There are apparently multiple ways to get things done: either compile *.cu to *.cubin directly via nvcc.exe, or indirectly via ptxas.exe, which compiles *.ptx to *.cubin and can show register usage.
Also, cuobjdump can be used to dump a *.cubin as some kind of internal/GPU/final assembler instructions (?).
The only thing missing for me in cuobjdump is an option to save its output to a file; that would be handy.
ptxas also needs to be told explicitly to save to a file, otherwise it will just show the register usage if -v is specified.
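The manual route described above can be sketched roughly as follows. The file names and the sm_20 architecture are placeholders, so substitute whatever matches your GPU and toolkit. As far as I can tell, nvcc -ptx stops once the PTX has been written and never invokes ptxas, which would explain why -Xptxas -v printed nothing in the run above. Since cuobjdump writes to standard output, ordinary shell redirection is enough to save the listing.

```shell
#!/bin/sh
# Placeholder names (assumptions, not taken from the posts above).
SRC=kernel.cu
ARCH=sm_20            # pick the architecture of your own GPU
PTX=kernel.ptx
CUBIN=kernel.cubin
SASS=kernel.sass

if command -v nvcc >/dev/null 2>&1 && [ -f "$SRC" ]; then
    # Step 1: .cu -> .ptx. nvcc stops here; ptxas is not involved yet.
    nvcc -ptx -arch="$ARCH" "$SRC" -o "$PTX" &&
    # Step 2: .ptx -> .cubin. -v reports register usage, -O2 sets the
    # ptxas optimization level, -o saves the result to a file.
    ptxas -v -O2 -arch="$ARCH" "$PTX" -o "$CUBIN" &&
    # Step 3: dump the final GPU instructions and redirect them to a file.
    cuobjdump -sass "$CUBIN" > "$SASS"
else
    echo "nvcc or $SRC not found; nothing compiled"
fi
```

Running the steps separately like this makes it obvious which tool each option reaches, at the cost of having to keep the architecture consistent by hand.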