The JIT process is covered in the nvcc documentation:
[url]http://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#just-in-time-compilation[/url]
I acknowledge that this does not give a detailed answer to “By what mechanism does the driver compile the PTX into SASS then?”
The best I can offer is this: there is a codepath in the driver that detects at runtime that a particular GPU binary contains PTX but no SASS usable on the GPU in question, and it then follows a codepath similar to what is contained in the ptxas tool to generate suitable SASS on the fly.
If you require a more detailed description, I don’t have it and don’t know where to find it.
A few other comments not directly related to your question:
JIT compilation involves conversion of PTX into SASS, and it is always handled by the GPU driver.
It follows that different driver versions may produce different SASS for otherwise identical inputs in a JIT-compilation scenario.
Now compare JIT-produced SASS with SASS produced by nvcc (i.e. when you specify a target architecture for code generation). In the nvcc case, the SASS is generated by a separate tool called ptxas, a standalone compiler/assembler that you can find in /usr/local/cuda/bin on an ordinary linux install. In the JIT case, the SASS is generated by the driver itself, not by ptxas. ptxas is installed by the CUDA toolkit installer, and it does not need to be present on a machine to support JIT compilation; only the driver is required.
Therefore, since the SASS is generated by the driver in one case (via a ptxas-like codepath embedded in the driver) and by the installed ptxas tool in the other, we can assume that the generated SASS may differ. Again, the CUDA documentation makes no stated claims to the contrary.