MPS raises "Command handle failed" error with container

I can use MPS with my application when it runs directly on the host (no container).
But when I try to start a Triton inference container, it doesn't work.
The MPS server log shows "Command handle failed", but I could not find any information about this error on Google.

I start MPS with this command, running it in the foreground so I can see its messages:

sudo nvidia-cuda-mps-control -f

I use this command to start the Triton container:

docker run -it --gpus 2 --rm -v /tmp/nvidia-mps:/tmp/nvidia-mps \
 --ipc=host -p 8000:8000 -p 8001:8001 -p 8002:8002 \
nvcr.io/nvidia/tritonserver:25.01-py3 bash

This is the output from MPS:

[2025-02-08 16:38:28.882 Control 1807548] NEW SERVER 1809427: Ready
[2025-02-08 16:38:28.882 Server 1809427] Active Threads Percentage set to 100.0
[2025-02-08 16:38:28.882 Server 1809427] Server Priority set to 0
[2025-02-08 16:38:28.882 Server 1809427] Server has started
[2025-02-08 16:38:28.882 Server 1809427] Received new client request
[2025-02-08 16:38:28.882 Server 1809427] Worker created
[2025-02-08 16:38:28.883 Server 1809427] Creating worker thread
[2025-02-08 16:38:28.883 Server 1809427] Command handle failed
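
Since the container depends on the bind-mounted /tmp/nvidia-mps directory, here is a minimal sketch of a sanity check for that path. The function name `check_mps_pipe_dir` is made up for illustration and is not part of MPS or Docker. It assumes the control daemon uses the default pipe directory /tmp/nvidia-mps unless `CUDA_MPS_PIPE_DIRECTORY` is set, and that a running daemon leaves at least one entry in that directory. It only rules out a wrong or empty mount; it does not explain the "Command handle failed" message itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an NVIDIA tool): reports whether the MPS pipe
# directory that is bind-mounted into the container exists and is populated.
check_mps_pipe_dir() {
    # Use the argument if given, else CUDA_MPS_PIPE_DIRECTORY, else the default.
    local dir="${1:-${CUDA_MPS_PIPE_DIRECTORY:-/tmp/nvidia-mps}}"

    # Return 2 when the path is not a directory at all.
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 2
    fi

    # Return 1 when the directory is empty (assumption: a running control
    # daemon creates its pipe files here, so empty means the wrong path).
    if [ -z "$(ls -A "$dir")" ]; then
        echo "empty: $dir"
        return 1
    fi

    echo "ok: $dir"
    return 0
}
```

Running `check_mps_pipe_dir` on the host and again inside the container should give the same result if the bind mount is working as intended.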