Why does cuobjdump show only SASS and no PTX for some kernels?

I'm wondering why it's possible to get the PTX code for certain kernels, while others are only available as SASS.

Is there a way to go from SASS to PTX, or a different set of parameters that I can pass to cuobjdump to get the PTX code instead?

Or is it not possible because of the way those kernels were compiled?

Thanks

It depends on how it was compiled.

Depending on the exact combination of arch/gencode flags passed during compilation, a binary may contain one or more instances of SASS, one or more instances of PTX, or both.

https://stackoverflow.com/questions/35656294/cuda-how-to-use-arch-and-code-and-sm-vs-compute

https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#gpu-compilation
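As a rough illustration of the rule above, here is a small Python sketch that reads an nvcc command line and reports which SASS and PTX targets it asks to embed. It only covers two flag shapes (`-gencode arch=...,code=...` and the `-arch=sm_XX` shorthand), and the function name `embedded_code` is made up for this example. The convention it relies on is the documented one: a `code=sm_XX` entry embeds SASS, a `code=compute_XX` entry embeds PTX, and `-arch=sm_XX` alone embeds both.

```python
import re

# Matches "-gencode arch=compute_80,code=sm_80" and the bracketed list form
# "-gencode arch=compute_80,code=[sm_80,compute_80]". Other spellings that
# nvcc accepts (quoted lists, --generate-code) are deliberately not handled.
GENCODE = re.compile(r"-gencode[ =]arch=\w+,code=(\[[\w,]+\]|\w+)")
# Matches the shorthand "-arch=sm_80".
ARCH = re.compile(r"-arch[ =]sm_(\d+\w?)")


def embedded_code(cmdline):
    """Return (sass_targets, ptx_targets) requested by an nvcc command line."""
    sass, ptx = set(), set()
    for match in GENCODE.finditer(cmdline):
        for target in match.group(1).strip("[]").split(","):
            if target.startswith("sm_"):
                sass.add(target)      # real architecture -> SASS is embedded
            elif target.startswith("compute_"):
                ptx.add(target)       # virtual architecture -> PTX is embedded
    for match in ARCH.finditer(cmdline):
        # -arch=sm_XX alone is shorthand for embedding both SASS and PTX.
        sass.add("sm_" + match.group(1))
        ptx.add("compute_" + match.group(1))
    return sorted(sass), sorted(ptx)
```

For example, `embedded_code("nvcc -gencode arch=compute_80,code=sm_80 app.cu")` reports SASS for `sm_80` and no PTX at all, which is the situation described in the question.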

There are no cuobjdump parameters you can use to get PTX from the binary if there is no PTX embedded.

There are no cuobjdump parameters you can use to get SASS from the binary if there is no SASS embedded.
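To check what a given binary actually contains, cuobjdump can list the embedded PTX and ELF (SASS) images before you try to dump them. The sketch below wraps the relevant options (`--list-ptx`, `--list-elf`, `--dump-ptx`, `--dump-sass`) in Python. The helper names are hypothetical, and treating each non-blank line of the listing as one embedded image is an assumption about the output format, so check it against your toolkit version.

```python
import subprocess

# cuobjdump options for listing or dumping embedded device code.
CUOBJDUMP_FLAGS = {
    "list-ptx": "--list-ptx",   # list embedded PTX images
    "list-elf": "--list-elf",   # list embedded ELF (cubin/SASS) images
    "ptx": "--dump-ptx",        # print PTX text, if any is embedded
    "sass": "--dump-sass",      # print disassembled SASS, if any is embedded
}


def cuobjdump_cmd(binary, what):
    """Build the cuobjdump argument list for one of the actions above."""
    return ["cuobjdump", CUOBJDUMP_FLAGS[what], binary]


def count_entries(listing_text):
    """Count non-blank lines in a listing (assumed: one line per image)."""
    return sum(1 for line in listing_text.splitlines() if line.strip())


def embedded_counts(binary):
    """Run cuobjdump and return (ptx_images, elf_images) for a binary."""
    counts = []
    for what in ("list-ptx", "list-elf"):
        result = subprocess.run(cuobjdump_cmd(binary, what),
                                capture_output=True, text=True, check=False)
        counts.append(count_entries(result.stdout))
    return tuple(counts)
```

If `embedded_counts("./app")` reports zero PTX images, `--dump-ptx` has nothing to print for that binary, and the only way to get PTX is to rebuild with a `code=compute_XX` target.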

Thank you very much for your reply, and for the very informative links.

I guess that when the PTX is not embedded, the compiled files must also be smaller, which is very useful for a release build.