MPS interference problem

namch0101 · November 12, 2024, 5:18am

Hello,
I have a question about interference between clients of MPS (Multi-Process Service). I set the MPS percentage to 50 for each process. According to the NVIDIA MPS documentation, clients should not disturb each other much, since each one is confined to a set of SMs. However, computation latency increases when more than one client process is running.
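For readers who want to reproduce this setup, here is a minimal sketch of one way to launch several MPS clients with the same percentage. It assumes the limit is applied through the documented `CUDA_MPS_ACTIVE_THREAD_PERCENTAGE` environment variable (the post does not say which mechanism was actually used) and that the MPS control daemon is already running. The helper names `mps_client_env` and `run_clients` are hypothetical.

```python
import os
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def mps_client_env(percentage, base_env=None):
    """Return a copy of the environment with the MPS thread percentage set."""
    if not 0 < percentage <= 100:
        raise ValueError("percentage must be in (0, 100]")
    env = dict(os.environ if base_env is None else base_env)
    # Read by the CUDA runtime when the client process creates its context.
    env["CUDA_MPS_ACTIVE_THREAD_PERCENTAGE"] = str(percentage)
    return env


def _timed_run(cmd, env):
    """Run one client to completion and return its wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, env=env, check=True)
    return time.perf_counter() - start


def run_clients(commands, percentage):
    """Start all client commands concurrently; return one duration per command."""
    if not commands:
        return []
    env = mps_client_env(percentage)
    # One thread per client so every process starts at roughly the same time.
    with ThreadPoolExecutor(max_workers=len(commands)) as pool:
        futures = [pool.submit(_timed_run, cmd, env) for cmd in commands]
        return [f.result() for f in futures]
```

The durations returned here include process start-up and context creation, so they are only a coarse check. The per-forward-pass latency discussed in this thread would still have to be timed inside each client.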

For example, when I run a single client process at 50% MPS percentage, the latency of one forward computation is 100 ms. When I run two client processes, each at 50% MPS percentage, the latency of one forward computation rises to 110 ms on client 1 and 140 ms on client 2.

I suspect it has something to do with bandwidth, but I would like to know the actual cause.
Also, is there any way to calculate the increase in computation latency in advance?

Robert_Crovella · November 12, 2024, 5:29am

One possible source would be contention for memory bandwidth. There might be other possibilities, such as host<->device bandwidth or perhaps other shared resources.

You can use a profiler to help discover how the applications are behaving in each case, and which subsystems are being used. With no description or measurement of your application(s), it's not possible to provide the “reason of it for sure”.

I don’t know of a method to predict computation latency with no measurements of the application. Others may have ideas.
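If profiler measurements are available, a very rough first-order estimate can be sketched. The model below is an assumption for illustration, not a method proposed in this thread: it treats every client as purely memory-bandwidth bound and assumes the device's bandwidth is shared in proportion to demand. The function name and all numbers are hypothetical.

```python
def estimate_shared_latency(solo_latency_ms, solo_bandwidth_gbs, peak_bandwidth_gbs):
    """Estimate per-client latency when all clients run at the same time.

    solo_latency_ms    -- latency of each client measured alone
    solo_bandwidth_gbs -- DRAM throughput of each client measured alone
    peak_bandwidth_gbs -- achievable device DRAM bandwidth

    Assumption: clients are purely bandwidth bound, so latency stretches
    by the factor by which total demand exceeds the available bandwidth.
    """
    if peak_bandwidth_gbs <= 0:
        raise ValueError("peak_bandwidth_gbs must be positive")
    total_demand = sum(solo_bandwidth_gbs)
    # No slowdown is predicted while total demand fits within the peak.
    scale = max(1.0, total_demand / peak_bandwidth_gbs)
    return [latency * scale for latency in solo_latency_ms]


# Hypothetical inputs: two clients, 100 ms each alone, each drawing
# 600 GB/s alone on a device that sustains 1000 GB/s.
print(estimate_shared_latency([100.0, 100.0], [600.0, 600.0], 1000.0))
```

This model predicts the same slowdown factor for every client, so it cannot account for an asymmetric result such as 110 ms versus 140 ms. That kind of gap points to effects outside the model (for example host<->device transfers or other shared resources), which is why profiling each case remains necessary.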
