Support multiple compute capabilities

The easiest way is to compile your PTX for the lowest supported streaming multiprocessor target, which is SM 5.0 in OptiX 6 and newer.

See these threads:
https://forums.developer.nvidia.com/t/optix-6-support-for-sm-75-rtx2060/120940/2
https://forums.developer.nvidia.com/t/assertion-failed-acp-isusedassinglesemantictype/75364/2

You could also generate multiple versions of your PTX code for the individual SM targets, check the SM version of the CUDA device at runtime, and load the matching PTX input code, but I wouldn’t expect any dramatic performance changes from that.
There have actually been cases where the OptiX PTX parser did not handle the newest SM versions.
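
If you do want to try the per-SM route, the selection logic could look roughly like the sketch below. This is only an illustration, not code from OptiX or its SDK: the function names, the major * 10 + minor encoding, and the file naming scheme are all made up for the example. In a real application the major and minor values would come from the CUDA device, for example via cudaDeviceGetAttribute with cudaDevAttrComputeCapabilityMajor and cudaDevAttrComputeCapabilityMinor. The sketch picks the highest available target that does not exceed the device's SM version, because PTX built for a newer target cannot be used on an older GPU.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Encode a compute capability as major * 10 + minor, e.g. 7.5 -> 75.
// Assumption: major/minor were queried from the CUDA device elsewhere.
int encodeSm(int major, int minor)
{
  assert(major >= 0 && minor >= 0 && minor < 10);
  return major * 10 + minor;
}

// Return the highest PTX target that is not newer than the device,
// or -1 if none of the available targets can run on it.
int selectPtxTarget(int deviceSm, const std::vector<int>& availableTargets)
{
  int best = -1;
  for (int target : availableTargets)
  {
    if (target <= deviceSm && target > best)
    {
      best = target;
    }
  }
  return best;
}

// Hypothetical naming scheme for the generated files, e.g. "raygen_sm50.ptx".
std::string ptxFilename(const std::string& baseName, int target)
{
  return baseName + "_sm" + std::to_string(target) + ".ptx";
}
```

With PTX built for SM 5.0, 6.0 and 7.0, a device reporting 7.5 would get the SM 7.0 file and a device reporting 5.2 would get the SM 5.0 file. A result of -1 means the GPU is older than anything you compiled for.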

OptiX parses your input PTX and recompiles it anyway, emitting intermediate code for the GPU’s actual SM target version.
Then the PTX assembler and SASS code generator inside the CUDA driver optimize that again to the final microcode.

The CUDA toolkit and display driver versions should have more impact. For example, CUDA 8.0 generated much better code than CUDA 7.5. And since the PTX assembler and microcode generator ship with the display driver, as do the OptiX core implementation and ray tracing drivers, you’d automatically benefit from all optimizations in newer drivers.