Is it possible to extract the SASS instructions of an OptiX kernel? So far, I can compile the SDK examples to PTX and see the function calls into the driver code (e.g. _optix_trace_typed_32). I’m curious to know whether I can also see the hardware-level instructions of the kernels.
In addition to @lspano’s cuobjdump suggestion, be aware that another option is Nsight Compute, which shows you the SASS of OptiX kernels along with per-instruction profiling metrics, plus source-line profiling metrics if you compiled with debug info and symbols.
Yes. You would have to compile to an object file, which OptiX cannot use but which can be fed into cuobjdump to see the associated SASS. I prefer to use Nsight Compute, as David mentioned above. Just compile with -lineinfo and inspect your first raygen kernel launch. This has the added benefit of showing the correspondence between the SASS and the input CUDA code (i.e., you can click on a CUDA statement to see the corresponding SASS, and vice versa), and you get performance profiling info as well.
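For anyone who wants to script the two routes described above, here is a minimal sketch that only builds and prints the command lines; it does not run them. The paths, the report name, and the helper function names are hypothetical placeholders, sm_86 is assumed because it matches an RTX 3060, and whether nvcc accepts a given OptiX program outside the OptiX pipeline is not something this sketch checks.

```python
import shlex

# Hypothetical locations; adjust to your OptiX SDK install and build tree.
OPTIX_ROOT = "/path/to/OptiX-SDK"
SOURCE = OPTIX_ROOT + "/SDK/optixTriangle/optixTriangle.cu"
APP = "./optixTriangle"


def cubin_command(source, output, arch="sm_86", include_dirs=()):
    """nvcc invocation that emits a device-only cubin with line info."""
    cmd = ["nvcc", "-cubin", "-arch=" + arch, "-lineinfo"]
    cmd += ["-I" + d for d in include_dirs]
    cmd += [source, "-o", output]
    return cmd


def sass_dump_command(cubin):
    """cuobjdump invocation that prints the SASS contained in a cubin."""
    return ["cuobjdump", "-sass", cubin]


def ncu_command(app, report):
    """Nsight Compute CLI invocation that profiles the app's kernel launches
    and stores the available source files in the report."""
    return ["ncu", "--set", "full", "--import-source", "yes", "-o", report, app]


if __name__ == "__main__":
    includes = [OPTIX_ROOT + "/include", OPTIX_ROOT + "/SDK"]
    for cmd in (
        cubin_command(SOURCE, "optixTriangle.cubin", include_dirs=includes),
        sass_dump_command("optixTriangle.cubin"),
        ncu_command(APP, "optixTriangle_report"),
    ):
        # Print only; paste into a shell or hand the list to subprocess.run.
        print(shlex.join(cmd))
```

The first two commands correspond to the object file plus cuobjdump route, the third to the Nsight Compute route. The resulting report can be opened in the Nsight Compute GUI, where you pick the raygen launch and switch to the source view.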
Okay, I can see the SASS instructions in Nsight Compute. I was expecting to see ray tracing instructions, but I do not see any (I’m not sure they exist in the first place). The raygen shader is just a sequence of arithmetic operations. I’m running the optixTriangle sample from the SDK on a mobile RTX 3060 GPU, so it does have RT cores. Is OptiX somehow not using the RT cores? Is it “emulating” ray tracing on the CUDA cores?
OptiX does use the RT Cores on RTX-enabled GPUs. The SASS you can see is generated from the code you provide, e.g., your shader programs. The driver instructions are proprietary, and in addition the tools are designed to help you inspect and profile the parts of the system that you have control over. Note that this doesn’t mean you can’t account for the time spent outside your code.