Can MPS control per gpu QOS if multiple GPUs are managed by MPS?


cjluo January 25, 2019, 11:20pm 1

Hi,

I’m researching whether it is possible to control GPU QOS through MPS when I have multiple GPUs.

Say a process is going to use both GPU 0 and GPU 1, both of which are managed by MPS. Is it possible to set the utilization limit on GPU 0 to 50% and on GPU 1 to 20%?

I found that CUDA_MPS_ACTIVE_THREAD_PERCENTAGE is the environment variable that controls per-client QOS, but how does it work if the process runs across multiple GPUs?
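For reference, here is a minimal sketch of how a single client is usually given that limit, assuming an MPS control daemon is already running. The helper names `mps_client_env` and `launch_client` are made up for illustration. The variable carries one number for the whole process, with no per-device field, which is why the multi-GPU case is unclear to me.

```python
import os
import subprocess


def mps_client_env(active_thread_percentage, base_env=None):
    """Return a copy of the environment with the MPS thread limit set.

    Hypothetical helper: it only prepares the environment for a client
    process and does not talk to the MPS daemon itself.
    """
    if not 0 < active_thread_percentage <= 100:
        raise ValueError("percentage must be in (0, 100]")
    env = dict(os.environ if base_env is None else base_env)
    # One value per client process; nothing here selects a specific GPU.
    env["CUDA_MPS_ACTIVE_THREAD_PERCENTAGE"] = str(active_thread_percentage)
    return env


def launch_client(argv, active_thread_percentage):
    """Start a client process (argv is a command list) with the limit applied."""
    return subprocess.Popen(argv, env=mps_client_env(active_thread_percentage))
```

With this, something like `launch_client(["./my_cuda_app"], 50)` (where `./my_cuda_app` is a placeholder binary) would start a client capped at 50%, but I see no way to express "50% on GPU 0, 20% on GPU 1" in that single value.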

Thanks!
