widgg
May 5, 2018, 1:01am
1
I wonder why the code for certain kernels is available as PTX, while for others only SASS is available.
Is there a way to go from SASS to PTX, or a different set of parameters that I can use with cuobjdump to get the PTX code instead?
Or is it not possible because of the way those kernels were compiled?
Thanks
It depends on how it was compiled.
Depending on the exact combination of arch/gencode flags passed during compilation, a binary may contain one or more instances of SASS, one or more instances of PTX, or both.
nvcc - CUDA: How to use -arch and -code and SM vs COMPUTE - Stack Overflow
NVCC :: CUDA Toolkit Documentation
There are no cuobjdump parameters you can use to get PTX from the binary if there is no PTX embedded.
There are no cuobjdump parameters you can use to get SASS from the binary if there is no SASS embedded.
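To make the flag combinations concrete, here is a minimal sketch that builds the three typical -gencode options. The helper name gencode_flag, the compute capability 60, and the file names are assumptions chosen for illustration; they do not come from the posts above. The rule it encodes is that an sm_XX target in code= embeds SASS, while a compute_XX target in code= embeds PTX.

```python
def gencode_flag(cc, sass=True, ptx=True):
    """Build one nvcc -gencode option for compute capability `cc` (e.g. 60).

    Hypothetical helper for illustration only; it just formats a string.
    """
    targets = []
    if sass:
        targets.append(f"sm_{cc}")       # real architecture: SASS gets embedded
    if ptx:
        targets.append(f"compute_{cc}")  # virtual architecture: PTX gets embedded
    if not targets:
        raise ValueError("request SASS, PTX, or both")
    # A single target is written bare; several targets go in a bracketed list.
    code = targets[0] if len(targets) == 1 else "[" + ",".join(targets) + "]"
    return f"-gencode arch=compute_{cc},code={code}"


if __name__ == "__main__":
    # kernel.cu and the output names are placeholders.
    variants = [
        ("app_sass_only", gencode_flag(60, sass=True, ptx=False)),
        ("app_ptx_only", gencode_flag(60, sass=False, ptx=True)),
        ("app_both", gencode_flag(60)),
    ]
    for name, flag in variants:
        print(f"nvcc {flag} -o {name} kernel.cu")
```

Under these assumptions, running cuobjdump -ptx on the SASS-only binary should find no PTX to dump, and cuobjdump -sass on the PTX-only binary should find no SASS, which matches the behavior described above.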
widgg
May 6, 2018, 2:16am
3
Thank you very much for your reply, and for the very informative links.
And I guess that when the PTX is not embedded, the compiled files must also be smaller, which is useful for release builds.