MPS with multiple GPUs

Hi,
I am using MPS with multiple GPUs. One thing I noticed is that all client applications are assigned to GPU 0, while the other GPUs sit idle.

Here is the nvidia-smi output:

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1911      C   nvidia-cuda-mps-server                        29MiB |
|    0      2742    M+C   python                                       707MiB |
|    0      2806    M+C   python                                       707MiB |
|    0      2872    M+C   python                                       707MiB |
|    0      2946    M+C   python                                       707MiB |
|    0      3031    M+C   python                                       707MiB |
|    0      3124    M+C   python                                       707MiB |
|    1      1911      C   nvidia-cuda-mps-server                        29MiB |
|    2      1911      C   nvidia-cuda-mps-server                        29MiB |
|    3      1911      C   nvidia-cuda-mps-server                        29MiB |
+-----------------------------------------------------------------------------+

I tried the set_active_thread_percentage command to set the percentage to 5. I expected some of the workers to be assigned to the other GPUs.
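
For reference, this setting is sent to the MPS control daemon as a text command that names the MPS server PID and the percentage. Below is a minimal Python sketch of that step, not necessarily the exact invocation used here. The helper names build_mps_command and send_to_mps_control are hypothetical, and 1911 is the nvidia-cuda-mps-server PID taken from the nvidia-smi output above.

```python
import subprocess


def build_mps_command(server_pid: int, percentage: float) -> str:
    """Build the text command understood by nvidia-cuda-mps-control.

    Hypothetical helper for illustration; server_pid is the PID of the
    nvidia-cuda-mps-server process, not of a client application.
    """
    if not 0 < percentage <= 100:
        raise ValueError("percentage must be in the range (0, 100]")
    return f"set_active_thread_percentage {server_pid} {percentage}"


def send_to_mps_control(command: str) -> str:
    """Pipe one command to the MPS control daemon and return its output.

    Assumes nvidia-cuda-mps-control is on PATH and the daemon is running.
    """
    result = subprocess.run(
        ["nvidia-cuda-mps-control"],
        input=command + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Example (requires a running MPS daemon, so it is left commented out):
# send_to_mps_control(build_mps_command(1911, 5))
```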

Any idea why this is happening?