Is kernel execution concurrent by default, or do we have to enable MPS?

If I launch kernels from different processes on the same device (for example, an A100), are the kernels executed concurrently?
Or is MPS necessary to enable concurrent kernel execution?

Best
Max

No, kernels launched from separate processes are not automatically run concurrently. Enabling that is one of the key features of MPS. The resource requirements for concurrent kernel execution would also have to be met, of course: a kernel that by itself uses up the available execution resources on a GPU should not be expected to run concurrently with another kernel.
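To make the resource point concrete, here is a rough sketch of how one might estimate whether a single launch already fills the device. It only looks at how many blocks can be resident at once, so treat it as a first-order check and not a guarantee either way. The kernel `myKernel`, the helper `residentBlockCapacity`, and the launch configuration are all made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical placeholder kernel; substitute the kernel you actually care about.
__global__ void myKernel(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Hypothetical helper: returns how many blocks of myKernel can be resident on the
// whole device at once for the given block size, or -1 if any query fails
// (for example, when no usable GPU is present).
int residentBlockCapacity(int device, int blockSize) {
    if (cudaSetDevice(device) != cudaSuccess) return -1;

    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) return -1;

    int blocksPerSM = 0;
    if (cudaOccupancyMaxActiveBlocksPerMultiprocessor(
            &blocksPerSM, myKernel, blockSize, 0) != cudaSuccess) return -1;

    return blocksPerSM * prop.multiProcessorCount;
}

int main() {
    const int blockSize = 256;  // assumed launch configuration
    const int gridSize = 64;    // assumed launch configuration

    int capacity = residentBlockCapacity(0, blockSize);
    if (capacity < 0) {
        fprintf(stderr, "could not query device 0\n");
        return 1;
    }

    printf("resident block capacity: %d, blocks launched: %d\n", capacity, gridSize);
    if (gridSize >= capacity)
        printf("this launch can occupy the whole GPU by itself\n");
    else
        printf("this launch leaves room for blocks from another kernel\n");
    return 0;
}
```

If the grid is at or above the resident capacity, a second kernel will mostly have to wait for blocks of the first one to retire, with or without MPS.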

Thank you for the answer.
Is there any way to run kernels concurrently (if resources are sufficient) without MPS?

Best
Max

From different processes? No.

You could enable MIG mode on the A100.

What about from the same process? Is there any way to do it?

Right now I am working on MIG, and I need CMK as the baseline.

Sure, the CUDA concurrent kernels sample code could be a starting point/demonstrator. Plus there are probably dozens of questions on various forums from people trying to witness concurrent kernels.
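For the same-process case, the usual approach is to launch the kernels into different non-default streams. Below is a minimal sketch of that idea, not the sample code itself: `spinKernel` and `runKernels` are names invented here, and the cycle count is an arbitrary choice that just keeps each kernel busy long enough to see a difference. Each kernel uses a single thread, so the resource requirement for overlap is trivially met.

```cuda
#include <chrono>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Busy-wait for roughly `cycles` GPU clock ticks so the kernel has a visible duration.
__global__ void spinKernel(long long cycles) {
    long long start = clock64();
    while (clock64() - start < cycles) { }
}

// Hypothetical helper: launches numKernels copies of spinKernel, either each in its
// own stream or all in one stream, and returns the wall-clock time in milliseconds.
// Returns -1.0 if a CUDA call fails (for example, when no GPU is present).
double runKernels(int numKernels, bool separateStreams, long long cycles) {
    std::vector<cudaStream_t> streams(numKernels);
    for (auto &s : streams)
        if (cudaStreamCreate(&s) != cudaSuccess) return -1.0;

    cudaDeviceSynchronize();  // start timing from an idle device
    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < numKernels; ++i) {
        cudaStream_t s = separateStreams ? streams[i] : streams[0];
        spinKernel<<<1, 1, 0, s>>>(cycles);
    }
    cudaError_t err = cudaDeviceSynchronize();  // wait for every launched kernel
    auto t1 = std::chrono::steady_clock::now();

    for (auto &s : streams) cudaStreamDestroy(s);
    if (err != cudaSuccess) return -1.0;
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

int main() {
    const int numKernels = 4;
    const long long cycles = 100000000LL;  // arbitrary spin length

    if (runKernels(1, true, 1000) < 0.0) {  // warm-up, also checks for a usable GPU
        fprintf(stderr, "CUDA call failed; is a GPU available?\n");
        return 1;
    }

    double oneStream = runKernels(numKernels, false, cycles);
    double manyStreams = runKernels(numKernels, true, cycles);
    printf("same stream:      %.1f ms\n", oneStream);
    printf("separate streams: %.1f ms\n", manyStreams);
    return 0;
}
```

If the kernels overlap, the separate-streams time should come out close to the duration of a single kernel, while the same-stream time should be roughly numKernels times that. A profiler timeline (Nsight Systems, for example) is a more direct way to confirm the overlap.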

Thank you for the help. I will have a look at it.