“GPU’s with Hyper-Q have a concurrent scheduler to schedule work from work queues belonging to a single CUDA context.”
But when I profiled multiple processes with the Visual Profiler, kernels from different contexts executed concurrently even though I was not using the Multi-Process Service (MPS).
Also, the MPS documentation explains that it permits CUDA kernels to execute simultaneously to achieve higher utilization. However, when I use the Multi-Process Service (MPS), the Visual Profiler shows that kernels overlap only slightly; most of their execution time does not overlap. So I don’t understand why MPS gives us a better chance of utilizing GPU resources.
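
To put a number on what I mean by "overlap slightly", one option is to compute how much of the kernel time from two processes actually coincides. The following is only a minimal sketch, assuming the kernel start/end timestamps have been exported from the profiler as `(start, end)` pairs; the function names and the interval values are hypothetical placeholders, not my real measurements.

```python
def overlapped_time(a, b):
    """Total time during which a kernel from `a` and a kernel from `b` run at once.

    `a` and `b` are lists of (start, end) intervals, one list per process.
    Assumes intervals within the same list do not overlap each other.
    """
    return sum(
        max(0.0, min(e1, e2) - max(s1, s2))
        for s1, e1 in a
        for s2, e2 in b
    )


def overlap_fraction(a, b):
    """Overlapped time divided by the kernel time of the less busy process."""
    busy_a = sum(e - s for s, e in a)
    busy_b = sum(e - s for s, e in b)
    shorter = min(busy_a, busy_b)
    return overlapped_time(a, b) / shorter if shorter > 0 else 0.0


# Hypothetical timestamps (ms): each kernel from process B starts
# just before the preceding kernel from process A finishes.
proc_a = [(0.0, 10.0), (20.0, 30.0)]
proc_b = [(9.0, 19.0), (29.0, 39.0)]

print(overlap_fraction(proc_a, proc_b))
```

A value near 0 means the kernels are effectively serialized, and a value near 1 means they run almost fully concurrently. Comparing this number with and without MPS would make the difference between the two profiles concrete.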