Type M+C in nvidia-smi output

Hello,

Could anyone kindly explain what the type “M+C” means in the nvidia-smi output? From “man nvidia-smi”, it’s clear that C is for Compute and G is for Graphics, but what about M?

Some background: I am running 32 MPI processes across 8 GPUs on a single node, with CUDA MPS (nvidia-cuda-mps-server) enabled, so every MPI process goes through the MPS server. The “M+C” type appears only for processes on GPU 0, while the processes on all other GPUs show “C” only. Below is a sample output:

+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 189383 M+C …0/path_to_exec 3109MiB |
| 0 N/A N/A 189384 M+C …0/path_to_exec 3197MiB |
| 0 N/A N/A 189387 M+C …0/path_to_exec 2893MiB |
| 0 N/A N/A 189394 M+C …0/path_to_exec 3189MiB |
| 0 N/A N/A 208219 C nvidia-cuda-mps-server 27MiB |
| 1 N/A N/A 189405 C …0/path_to_exec 2987MiB |
| 1 N/A N/A 189417 C …0/path_to_exec 3181MiB |
| 1 N/A N/A 189435 C …0/path_to_exec 3023MiB |
| 1 N/A N/A 189456 C …0/path_to_exec 3223MiB |
| 1 N/A N/A 208219 C nvidia-cuda-mps-server 27MiB |
| 2 N/A N/A 189481 C …0/path_to_exec 3213MiB |
| 2 N/A N/A 189513 C …0/path_to_exec 3225MiB |
| 2 N/A N/A 189545 C …0/path_to_exec 3191MiB |
| 2 N/A N/A 189581 C …0/path_to_exec 3193MiB |
| 2 N/A N/A 208219 C nvidia-cuda-mps-server 27MiB |
| 3 N/A N/A 189620 C …0/path_to_exec 3233MiB |
| 3 N/A N/A 189667 C …0/path_to_exec 3231MiB |
| 3 N/A N/A 189713 C …0/path_to_exec 2921MiB |
| 3 N/A N/A 189765 C …0/path_to_exec 3167MiB |
| 3 N/A N/A 208219 C nvidia-cuda-mps-server 27MiB |
| 4 N/A N/A 189822 C …0/path_to_exec 3131MiB |
| 4 N/A N/A 189925 C …0/path_to_exec 3027MiB |
| 4 N/A N/A 190015 C …0/path_to_exec 2683MiB |
| 4 N/A N/A 190097 C …0/path_to_exec 3219MiB |
| 4 N/A N/A 208219 C nvidia-cuda-mps-server 27MiB |
| 5 N/A N/A 190187 C …0/path_to_exec 3179MiB |
| 5 N/A N/A 190264 C …0/path_to_exec 3215MiB |
| 5 N/A N/A 190335 C …0/path_to_exec 3121MiB |
| 5 N/A N/A 190408 C …0/path_to_exec 2985MiB |
| 5 N/A N/A 208219 C nvidia-cuda-mps-server 27MiB |
| 6 N/A N/A 190481 C …0/path_to_exec 3231MiB |
| 6 N/A N/A 190561 C …0/path_to_exec 3125MiB |
| 6 N/A N/A 190638 C …0/path_to_exec 3095MiB |
| 6 N/A N/A 190702 C …0/path_to_exec 3175MiB |
| 6 N/A N/A 208219 C nvidia-cuda-mps-server 27MiB |
| 7 N/A N/A 190778 C …0/path_to_exec 2783MiB |
| 7 N/A N/A 190853 C …0/path_to_exec 3077MiB |
| 7 N/A N/A 190943 C …0/path_to_exec 3203MiB |
| 7 N/A N/A 191047 C …0/path_to_exec 3239MiB |
| 7 N/A N/A 208219 C nvidia-cuda-mps-server 27MiB |
+-----------------------------------------------------------------------------+
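As a side note, the same process table can be pulled in machine-readable form with `nvidia-smi --query-compute-apps=gpu_uuid,pid,used_memory --format=csv,noheader,nounits`, which makes it easy to double-check how many processes actually landed on each GPU. A minimal sketch that parses such output (the UUIDs, PIDs, and memory figures below are a hypothetical sample, not taken from the run above):

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample of:
#   nvidia-smi --query-compute-apps=gpu_uuid,pid,used_memory \
#              --format=csv,noheader,nounits
sample = """\
GPU-0aaa, 189383, 3109
GPU-0aaa, 208219, 27
GPU-1bbb, 189405, 2987
GPU-1bbb, 208219, 27
"""

def group_by_gpu(text):
    """Group (pid, used_memory_MiB) tuples by GPU UUID."""
    by_gpu = defaultdict(list)
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        uuid, pid, mem = row
        by_gpu[uuid].append((int(pid), int(mem)))
    return dict(by_gpu)

print(group_by_gpu(sample))
```

In a real check you would feed the function the output of the `nvidia-smi` command via `subprocess.run(...)` instead of a hardcoded string.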

Thanks,
Shine

It refers to a process using an “MPS compute context” on that device. I don’t have a more detailed explanation.

Thank you for the link and the info, Robert. I have started reading the documentation for nvidia-smi and related tools to better understand “MPS compute context”. Meanwhile, do you know why only the first GPU reports “M+C”?

No, I don’t (I anticipated that question). My first guess would be that it is simply reporting the actual state of affairs: you are using MPS on the first GPU but not on the others. I imagine you don’t think this is the case, but I would always assume mistakes are possible. Yes, I can see all the MPS server entries in your output, but that doesn’t rule out every possible issue. For example, MPS servers are user-specific, and I don’t know the footprint of your jobs, or whether there are mistakes in your MPI scripts regarding process placement.
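One quick sanity check along these lines: a common launch-script pattern (not necessarily what the original scripts use) is to bind each node-local MPI rank to a GPU round-robin, e.g. `local_rank % num_gpus`. With 32 ranks and 8 GPUs that should yield exactly 4 ranks per GPU, which matches the table above; a script that binds by some other rule can end up distributing ranks differently. A minimal sketch of the check, with illustrative names:

```python
# Minimal sketch of round-robin rank-to-GPU binding for 32 node-local
# MPI ranks on a node with 8 GPUs. Names are illustrative, not taken
# from the original thread.
NUM_GPUS = 8
NUM_RANKS = 32

def gpu_for_rank(local_rank: int) -> int:
    """Map a node-local MPI rank to a GPU index, round-robin."""
    return local_rank % NUM_GPUS

# Count how many ranks land on each GPU; every entry should be 4.
counts = [0] * NUM_GPUS
for rank in range(NUM_RANKS):
    counts[gpu_for_rank(rank)] += 1
print(counts)  # [4, 4, 4, 4, 4, 4, 4, 4]
```

In an actual job script the equivalent step is usually exporting `CUDA_VISIBLE_DEVICES` per rank from the launcher's local-rank environment variable before the executable starts.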