Concurrency in MPS and multi-stream

zhiqi.0 · December 15, 2020, 9:04am

Hi, I’m a little confused about how MPS and multi-stream work. I keep seeing the word “concurrency” in descriptions of MPS and multi-stream, but I don’t know its precise meaning.

The question is: do MPS and multi-stream execute kernels in parallel, i.e., are compute resources partitioned among different kernels so that they run simultaneously? Or is it just kernel switching, i.e., when one kernel is waiting on a memory fetch, the next kernel is scheduled for computation?

Robert_Crovella · December 16, 2020, 8:52pm

Streams only have meaning relative to a particular CPU process. Streams are an important part of the CUDA methodology to arrange for asynchronous concurrency within a process. In order to witness concurrent kernel execution from a single process, stream usage is necessary. This topic is covered in many places and there is a CUDA concurrentKernels sample code you may wish to study.
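As a rough illustration of the single-process case, here is a minimal sketch (not the concurrentKernels sample itself) that launches one small kernel into each of several non-default streams. The kernel `spin`, the helper `launch_in_streams`, and the launch sizes are hypothetical names and values chosen for this example. Whether the kernels actually overlap depends on the GPU and on available resources; a profiler timeline is the usual way to check.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread busy-waits for roughly `cycles` clock ticks.
// A tiny launch like this leaves most of the GPU free for other kernels.
__global__ void spin(long long cycles)
{
    long long start = clock64();
    while (clock64() - start < cycles) { }
}

// Hypothetical helper: create n_streams streams, launch one kernel per stream,
// wait for all of them, and return the first CUDA error seen (or cudaSuccess).
cudaError_t launch_in_streams(int n_streams, long long cycles)
{
    std::vector<cudaStream_t> streams;
    cudaError_t err = cudaSuccess;

    for (int i = 0; i < n_streams; ++i) {
        cudaStream_t s;
        err = cudaStreamCreate(&s);
        if (err != cudaSuccess) break;
        streams.push_back(s);
    }

    if (err == cudaSuccess) {
        // Kernels in different streams have no ordering between them,
        // so the GPU is allowed (not required) to run them concurrently.
        for (cudaStream_t s : streams) {
            spin<<<1, 32, 0, s>>>(cycles);
        }
        err = cudaGetLastError();
        if (err == cudaSuccess) err = cudaDeviceSynchronize();
    }

    for (cudaStream_t s : streams) cudaStreamDestroy(s);
    return err;
}

int main()
{
    cudaError_t err = launch_in_streams(4, 1000000LL);
    printf("result: %s\n", cudaGetErrorString(err));
    return err == cudaSuccess ? 0 : 1;
}
```

If all four launches went into the same stream instead, they would be serialized by stream ordering, which is why stream usage is needed to see overlap from one process.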

MPS (Multi-Process Service) is, as its name indicates, an inter-process mechanism. It helps 2 or more processes that wish to share a GPU to do so more efficiently. One of the features of MPS is that it may, under some circumstances, allow kernels from separate processes to use the GPU simultaneously, i.e., to run concurrently. The CUDA MPS docs (a simple Google search will find them for you) cover more details.

From a usage standpoint, these two features are mostly orthogonal. You can use streams with or without MPS; streams do not depend on MPS for the features they provide. You can use MPS with or without streams (in each process); MPS does not depend on process stream usage for the features it provides.

Neither mechanism guarantees kernel concurrency. They are necessary but not sufficient conditions to witness kernel concurrency in each of their respective scenarios (single process, multi-process).
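One of the other conditions can be checked from code: the device itself has to support concurrent kernel execution. The sketch below queries that through the runtime API; the helper name `query_concurrency_support` is hypothetical. A positive answer still does not guarantee overlap, since a kernel that fills the whole GPU leaves no room for another one.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: report whether device `dev` can run kernels concurrently
// and how many SMs it has. Returns the CUDA error from the property query.
cudaError_t query_concurrency_support(int dev, int *concurrent, int *sm_count)
{
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, dev);
    if (err != cudaSuccess) return err;

    *concurrent = prop.concurrentKernels;   // nonzero if concurrent kernels are supported
    *sm_count   = prop.multiProcessorCount; // SMs that resident kernels must share
    return cudaSuccess;
}

int main()
{
    int concurrent = 0, sm_count = 0;
    cudaError_t err = query_concurrency_support(0, &concurrent, &sm_count);
    if (err != cudaSuccess) {
        printf("query failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("concurrentKernels: %d, multiProcessorCount: %d\n", concurrent, sm_count);
    return 0;
}
```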
