Inaccurate # SMs when using MPS

Hello,
I’ve recently been studying how the # of SMs changes when using the Multi-Process Service (MPS).
As far as I know, the # of SMs changes according to the MPS percentage.
However, it does not change the way I thought, and it seems to follow a specific rule.

For example, I’m using an RTX 5060 Ti, which has 36 SMs in total. I ran a test with the MPS percentage set to 60%.
AFAIK, the # of SMs should be 21 or 22, since 36 * 0.6 (60%) is 21.6. However, Nsight Compute tells me the # of SMs is 20, which is not what I expected.

Also, when I run the test with an MPS percentage of 20%, the # of SMs gets rounded “up” and becomes 8 (36 * 0.2 = 7.2, 7.2 → 8).
However, when I run the test with an MPS percentage of 80%, the # of SMs gets rounded “down” and becomes 28 (36 * 0.8 = 28.8, 28.8 → 28).
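The arithmetic above can be tabulated in a few lines. This is just a sketch comparing the values I observed in Nsight Compute against plain floor/ceil of `total_SMs * percentage`; the `observed` numbers are the ones reported in this post, not something the driver documents.

```python
import math

total_sms = 36  # RTX 5060 Ti
# SM counts reported by Nsight Compute at each MPS percentage (from this post)
observed = {0.20: 8, 0.60: 20, 0.80: 28}

for pct, seen in sorted(observed.items()):
    raw = total_sms * pct
    print(f"{pct:.0%}: raw={raw:.1f}  floor={math.floor(raw)}  "
          f"ceil={math.ceil(raw)}  observed={seen}")
```

Neither plain floor nor plain ceil reproduces all three observed values, which is exactly why the behavior looks like it follows some other policy.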

It seems like the # of SMs changes according to a policy that I’m not aware of.
I’d appreciate it if someone could explain it or point me to a link that might help.

Thank you!

Hi, @namch0101

Yes. The value gets rounded up or down in different cases.

When using the environment variable CUDA_MPS_ACTIVE_THREAD_PERCENTAGE, the value is rounded down.
When using cuCtxCreate_v3 (or green contexts), it is rounded up. Refer here: https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__GREEN__CONTEXTS.html
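The two rounding behaviors described above can be sketched as simple helper functions. This is only an illustration of the stated policy (floor for the env variable, ceil for cuCtxCreate_v3/green contexts), not the driver's actual implementation:

```python
import math

TOTAL_SMS = 36  # RTX 5060 Ti from the question

def sms_via_env_var(pct):
    """CUDA_MPS_ACTIVE_THREAD_PERCENTAGE: rounds down (per this reply)."""
    return math.floor(TOTAL_SMS * pct / 100)

def sms_via_green_context(pct):
    """cuCtxCreate_v3 / green contexts: rounds up (per this reply)."""
    return math.ceil(TOTAL_SMS * pct / 100)

print(sms_via_env_var(60), sms_via_green_context(60))  # 21 vs 22
```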

The reason the environment variable is rounded down is that if you launch two applications with CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=50, you expect both to fit exactly, not to oversubscribe the GPU. So if the result is something like 21.6, you do not want to use 22, as that would most likely mean 2 SMs overlap and are shared across those two clients.
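The no-oversubscription argument works out numerically like this (again just arithmetic on the numbers from this thread, not a driver API):

```python
import math

total_sms = 36

# Two clients at 50%, rounded down: 18 + 18 = 36, an exact fit.
half = math.floor(total_sms * 0.50)
print(half + half, half + half <= total_sms)

# If 60% (21.6) were rounded up to 22 instead, two such clients
# would request 44 SMs on a 36-SM GPU, forcing SMs to be shared.
up = math.ceil(total_sms * 0.60)
print(up * 2, up * 2 > total_sms)
```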

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.