According to the MPS documentation, when processes share the GPU without MPS, their scheduling resources must be swapped on and off the GPU.
Is there a way to query the CUDA driver or runtime, or perhaps to use CUPTI, to find out when a switch between CUDA contexts happens?
Thank you,
Sujan