i can easily export CUDA and PTX when using nvrtc (i.e. the inputs and outputs of the nvrtc compilation), but i can’t seem to figure out a nice way to get the .cubin and/or view the disassembly for nvrtc-compiled functions after loading them with cuModuleLoadDataEx().
i can see the SASS using nvvp, but that seems a bit slow/cumbersome. i’d guess maybe the debugger could work similarly, but i didn’t try it.
i can also use ptxas on the .ptx emitted from nvrtc:
ptxas out.ptx -arch sm_52 -o out.cubin ; nvdisasm out.cubin
this seems like a decent approach; however, i worry that running ptxas offline might not consistently yield the same code as the driver's JIT does when i load the PTX with cuModuleLoadDataEx().
further, it’s perhaps somewhat against the spirit of using nvrtc; otherwise i could be os.system()'ing nvcc in the first place :/ … although admittedly i don’t really see a non-devel/debugging use-case for needing to alter/inspect the .cubin/disassembly at run-time.
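for reference, the ptxas + nvdisasm route above could be wrapped from the host side roughly like this. it's only a sketch: the function names (build_commands, disassemble_ptx) and the default file names are made up for illustration, and it assumes ptxas and nvdisasm from the CUDA toolkit are on PATH. it runs the same two commands as the one-liner above, so it has the same caveat about possibly not matching what the driver generates.

```python
import shlex
import subprocess


def build_commands(ptx_path, arch="sm_52", cubin_path="out.cubin"):
    """Return the argv lists for the ptxas step and the nvdisasm step.

    Hypothetical helper: mirrors `ptxas out.ptx -arch sm_52 -o out.cubin`
    followed by `nvdisasm out.cubin`.
    """
    ptxas_cmd = ["ptxas", ptx_path, "-arch", arch, "-o", cubin_path]
    nvdisasm_cmd = ["nvdisasm", cubin_path]
    return [ptxas_cmd, nvdisasm_cmd]


def disassemble_ptx(ptx_path, arch="sm_52", cubin_path="out.cubin"):
    """Assemble ptx_path into a cubin, then return nvdisasm's text output.

    Assumes ptxas and nvdisasm are installed and on PATH; raises
    subprocess.CalledProcessError if either tool exits with an error.
    """
    ptxas_cmd, nvdisasm_cmd = build_commands(ptx_path, arch, cubin_path)
    subprocess.run(ptxas_cmd, check=True)
    result = subprocess.run(
        nvdisasm_cmd, check=True, capture_output=True, text=True
    )
    return result.stdout


if __name__ == "__main__":
    # Only print the commands here, so this runs without the toolkit installed.
    for cmd in build_commands("out.ptx"):
        print(shlex.join(cmd))
```

calling disassemble_ptx("out.ptx") on the PTX dumped from nvrtc would then return the SASS listing as a string.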
anyone have any clues on other official/unofficial methods for getting access to .cubin after calling cuModuleLoadDataEx(), or other general flows for this sort of debugging/inspection of SASS?