Generating interleaved PTX and SASS

Hello everyone!
I started learning PTX and I am having a blast! I hand-coded a simple copy kernel in PTX, and it produces somewhat different SASS from the version generated by nvcc. The handcrafted one actually seems to produce more instructions, although the runtime timings are within the margin of error. I did not expect to see any major difference. It would be super useful to disassemble both kernels, and having the corresponding PTX interleaved with the SASS might help me understand better what is going on.
I found the following link:
Which is great, although I am currently working on CentOS with a CLion/CMake project. I could bring it over to Windows, but I wonder whether I can achieve the same thing using the provided utilities such as cuobjdump.
So far I am able to obtain PTX with interleaved source code, and SASS code. Any ideas?


Unfortunately, I don’t think this is possible with just cuobjdump and similar utilities. However, you should be able to inspect the SASS code manually and work out which PTX instructions each part corresponds to. This might be a bit hard because ptxas is an optimizing compiler and will therefore reorder instructions, eliminate dead code, and perform various other optimizations. Still, it should be doable. If you have any SASS-specific questions, feel free to ask me because I’ve spent a lot of time digging around SASS code.
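
For the manual comparison, here is a minimal sketch of how the two listings could be produced side by side. It only builds and prints the command lines, so you can paste them into a shell or a CMake custom command. The file names (copy.cu, copy_handwritten.ptx) and the sm_70 target are placeholders I made up, so adjust them to your project and GPU. The tools and flags are the standard ones from the CUDA toolkit: nvcc with -ptx, -lineinfo and -src-in-ptx, ptxas with -arch, and cuobjdump with -sass.

```python
import shlex


def nvcc_ptx_command(cu_file, arch="sm_70"):
    """Command that makes nvcc emit PTX annotated with the CUDA source lines."""
    ptx_file = cu_file.rsplit(".", 1)[0] + ".ptx"
    # -src-in-ptx needs line information, hence -lineinfo.
    return ["nvcc", f"-arch={arch}", "-ptx", "-lineinfo", "-src-in-ptx",
            cu_file, "-o", ptx_file]


def sass_listing_commands(ptx_file, arch="sm_70"):
    """Commands that assemble a PTX file and then dump the resulting SASS."""
    cubin = ptx_file.rsplit(".", 1)[0] + ".cubin"
    return [
        # Assemble the PTX for one concrete GPU architecture.
        ["ptxas", f"-arch={arch}", ptx_file, "-o", cubin],
        # Disassemble the cubin to SASS on stdout.
        ["cuobjdump", "-sass", cubin],
    ]


# Placeholder file names: the nvcc-generated kernel and the handwritten one.
print(shlex.join(nvcc_ptx_command("copy.cu")))
for ptx in ("copy.ptx", "copy_handwritten.ptx"):
    for cmd in sass_listing_commands(ptx):
        print(shlex.join(cmd))
```

Running both PTX files through the same ptxas invocation keeps the comparison fair, since any difference in the SASS then comes from the PTX input rather than from different compiler options. You still have to match SASS instructions to PTX instructions by eye.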