Docker pause leads to a monopolized GPU when Volta MPS is on

Hi, I’m implementing a Persistent Thread style CUDA program on Apache OpenWhisk.

My Persistent Thread style CUDA process runs inside a Docker container and waits for new commands on a named pipe. After each execution, the container is paused.
The problem is that no new CUDA process can start while the container is paused and Volta MPS is on.
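
To make the setup concrete, here is a minimal shell sketch of the command loop described above. It is only a stand-in: the real exec.exe is a CUDA program, and `serve_fifo`, the `quit` command, and the log file are hypothetical names used for illustration. It assumes Linux, where opening a FIFO read-write does not block.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the host side of exec.exe: block on a named pipe,
# handle one command per line, and exit when "quit" arrives.
serve_fifo() {
    local fifo="$1" log="$2" cmd
    # Opening read-write (<>) keeps the FIFO open between writers, so the
    # loop does not see EOF each time a writer closes its end.
    exec 3<>"$fifo"
    while IFS= read -r cmd <&3; do
        [ "$cmd" = "quit" ] && break
        echo "handled: $cmd" >>"$log"   # a real program would launch GPU work here
    done
    exec 3<&-
}

workdir=$(mktemp -d)
mkfifo "$workdir/fifo1"                      # create the named pipe
serve_fifo "$workdir/fifo1" "$workdir/log" & # start the waiting process
server_pid=$!
echo "run_kernel" >"$workdir/fifo1"          # send one command
echo "quit" >"$workdir/fifo1"                # ask the loop to exit
wait "$server_pid"
result=$(cat "$workdir/log")
rm -rf "$workdir"
echo "$result"
```

The point of the sketch is that the process stays alive and blocked on the pipe between commands, which is the state the container is in when it gets paused.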

What if MPS is off? No problem: a new CUDA process can be executed.
What if I send a STOP signal to the Persistent Thread style CUDA process when it runs natively rather than in a container? No problem: a new CUDA process can be executed.
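
For comparison, the native STOP-signal case can be sketched as below. This is an illustration under assumptions: `sleep 30` stands in for the CUDA process, `proc_state` is a hypothetical helper, and the state is read from the Linux /proc filesystem.

```shell
#!/usr/bin/env bash
# Print the one-letter scheduler state of a PID (Linux /proc only).
proc_state() {
    awk '/^State:/ {print $2}' "/proc/$1/status"
}

sleep 30 &                 # stand-in for the persistent CUDA process
pid=$!
kill -STOP "$pid"          # suspend it with SIGSTOP, as in the native test
sleep 1                    # give the kernel time to stop the process
state_stopped=$(proc_state "$pid")
kill -CONT "$pid"          # resume it
sleep 1
state_resumed=$(proc_state "$pid")
kill "$pid"                # clean up the stand-in process
echo "stopped: $state_stopped, resumed: $state_resumed"
```

According to the Docker documentation, docker pause on Linux uses the freezer cgroup rather than SIGSTOP, so the suspended process cannot observe the suspension. That difference may matter for how the paused client looks to MPS, but this is a guess, not a confirmed cause.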

Below is how I tested.
  0. Environment
    Ubuntu 18.04, kernel 5.4.0, x86_64
    A100 PCIe 40GB, driver version 520.61.05 + CUDA 11.8.0 + Volta MPS
    Docker version 20.10.7, build 20.10.7-0ubuntu5~18.04.3

  1. Start MPS
    $su
    #nvidia-cuda-mps-control -d

  2. Run docker
    $docker run -it --gpus=all -v /tmp/nvidia-mps --ipc=host ${MyDockerImageFrom_nvidia/cuda:11.8.0-runtime-ubuntu18.04} bash
    #mkfifo fifo1 // create named pipe
    #./exec.exe // run Persistent Thread Style CUDA process
    #echo ${command} > fifo1 // success

  3. Pause the container
    $docker pause ${the_container} // as far as I know, docker pause is based on the cgroup freezer, but I’m not familiar with cgroups

  4. Run a new CUDA process natively (running it in another container shows the same problem)
    $./run // cannot be executed; it does not appear in nvidia-smi

  5. Kill the Persistent Thread style CUDA process
    #kill $(pidof exec.exe)
    // after this, the process from step 4 can start
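
To confirm that the container really is frozen during step 3, the freezer state can be read from the cgroup filesystem. The sketch below is hedged: `freezer_state` is a hypothetical helper, and the Docker cgroup path in the comment is an assumption that holds for the cgroupfs driver on cgroup v1 and may differ on other setups.

```shell
#!/usr/bin/env bash
# Report whether a cgroup directory is frozen.
# cgroup v1 exposes freezer.state (THAWED, FREEZING or FROZEN);
# cgroup v2 exposes cgroup.freeze (0 or 1).
freezer_state() {
    local dir="$1"
    if [ -r "$dir/freezer.state" ]; then
        cat "$dir/freezer.state"
    elif [ -r "$dir/cgroup.freeze" ]; then
        if [ "$(cat "$dir/cgroup.freeze")" = "1" ]; then
            echo FROZEN
        else
            echo THAWED
        fi
    else
        echo UNKNOWN
    fi
}

# Assumed location for Docker with the cgroupfs driver on cgroup v1;
# CONTAINER_ID is a hypothetical variable holding the full container ID.
# freezer_state "/sys/fs/cgroup/freezer/docker/$CONTAINER_ID"

# Self-contained demonstration against a fake cgroup directory.
demo=$(mktemp -d)
echo FROZEN >"$demo/freezer.state"
demo_state=$(freezer_state "$demo")
rm -rf "$demo"
echo "$demo_state"
```

`docker inspect -f '{{.State.Paused}}' ${the_container}` reports the same information from Docker's side.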

How can I solve this problem?
Any suggestions would be appreciated.