Misunderstanding about MPS non-uniform partitioning

byte.xiaobin · March 28, 2025, 3:24am

How should I understand non-uniform partitioning as described in the MPS documentation? It says:

The limit constrained by the non-uniform active thread percentage is configured for every client CUDA context and can be changed throughout the client process.

From what I understand, setting CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=25 is uniform partitioning: once I start the process with this environment variable set to 25, I cannot change the percentage allocated to that process. With non-uniform partitioning, it seems you can change the active thread percentage after the process has started. Is that true, and if so, how do I change the active thread percentage while the process is running? (Same question in https://forums.developer.nvidia.com/t/mps-set-default-active-thread-percentage-not-working-as-expected/194593?u=byte.xiaobin)
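To make the start-time behaviour in the question concrete, here is a minimal Python sketch of launching MPS clients with different limits. It only illustrates that the value is placed in each client's environment before the process starts. `mps_client_env` and `launch_client` are hypothetical helper names (not part of any NVIDIA tool), `./my_cuda_app` is a placeholder, and the sketch assumes an MPS control daemon is already running.

```python
import os
import subprocess


def mps_client_env(percentage, base_env=None):
    """Return a copy of the environment with the MPS thread limit set.

    Hypothetical helper: it only prepares the environment for a client
    that has not started yet; it cannot affect a running process.
    """
    if not 0 < percentage <= 100:
        raise ValueError("percentage must be in the range (0, 100]")
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_MPS_ACTIVE_THREAD_PERCENTAGE"] = str(percentage)
    return env


def launch_client(argv, percentage):
    """Start one client process with its own limit (hypothetical helper)."""
    return subprocess.Popen(argv, env=mps_client_env(percentage))


if __name__ == "__main__":
    # Placeholder binary: two clients, each with a limit fixed at launch.
    clients = [
        launch_client(["./my_cuda_app"], 25),
        launch_client(["./my_cuda_app"], 50),
    ]
    for client in clients:
        client.wait()
```

Each client gets its own copy of the environment, so changing the variable in the launcher afterwards has no effect on clients that are already running.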

kperelygin · April 11, 2026, 4:45pm

You cannot edit the percentage assigned to a process today. We are actively looking into it.

If you control the application source code, you could handle this by creating multiple contexts and switching which context you submit work to. This is much easier to accomplish with execution contexts now (CUDA Runtime API :: CUDA Toolkit Documentation).
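The workaround above can be pictured with a small Python model. This is only a sketch of the selection logic and makes no CUDA calls: `PartitionedContext` and `ContextPool` are hypothetical names standing in for real contexts that would each be created once with a fixed limit. The policy of picking the smallest context that covers the request is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionedContext:
    """Hypothetical stand-in for a context created with a fixed limit."""
    percentage: int
    submitted: list = field(default_factory=list)

    def submit(self, work_name):
        # A real application would make this context current and launch
        # GPU work here; the model only records what was submitted.
        self.submitted.append(work_name)


class ContextPool:
    """Holds several contexts whose limits are fixed at creation time."""

    def __init__(self, percentages):
        # Sorted ascending so pick() can scan from smallest to largest.
        self._contexts = [PartitionedContext(p) for p in sorted(set(percentages))]

    def pick(self, wanted):
        # Assumed policy: smallest context whose limit covers the request,
        # falling back to the largest one available.
        for ctx in self._contexts:
            if ctx.percentage >= wanted:
                return ctx
        return self._contexts[-1]


if __name__ == "__main__":
    pool = ContextPool([25, 50, 100])
    pool.pick(30).submit("kernel_a")  # goes to the 50% context
    pool.pick(10).submit("kernel_b")  # goes to the 25% context
```

The limit of each context never changes; the application changes its effective share by choosing a different context for the next piece of work.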
