Accelerate doesn't work with Triton Inference Server

When I run an inference script directly with `python` inside the `nvcr.io/nvidia/tritonserver:23.04-py3` container, Hugging Face's `accelerate` library works fine and can use the GPUs. But when the same code runs under Triton Server's Python backend (through the `python_backend_stub`), `accelerate` fails to access the GPUs and errors out with `RuntimeError: CUDA error: CUDA-capable device(s) is/are busy or unavailable`. How do I fix this?
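
For context, here is a minimal sketch of the kind of `model.py` I'm running under the Python backend. The model ID (`gpt2`), tensor names (`TEXT`, `GENERATED`), and generation settings are placeholders rather than my exact config; the failure happens at the model-loading step in `initialize`, where `device_map="auto"` hands device placement to `accelerate`:

```python
# model.py -- minimal sketch (placeholder model ID and tensor names)
import numpy as np
import triton_python_backend_utils as pb_utils
from transformers import AutoModelForCausalLM, AutoTokenizer


class TritonPythonModel:
    def initialize(self, args):
        # device_map="auto" makes transformers delegate weight placement
        # to accelerate; under the python_backend_stub this is where the
        # "CUDA-capable device(s) is/are busy or unavailable" error appears.
        self.tokenizer = AutoTokenizer.from_pretrained("gpt2")
        self.model = AutoModelForCausalLM.from_pretrained(
            "gpt2", device_map="auto"
        )

    def execute(self, requests):
        responses = []
        for request in requests:
            # Input tensor "TEXT" carries UTF-8 encoded prompts (BYTES dtype).
            text = pb_utils.get_input_tensor_by_name(request, "TEXT")
            prompts = [t.decode("utf-8") for t in text.as_numpy().reshape(-1)]

            decoded = []
            for prompt in prompts:
                inputs = self.tokenizer(prompt, return_tensors="pt")
                inputs = inputs.to(self.model.device)
                output_ids = self.model.generate(**inputs, max_new_tokens=32)
                decoded.append(
                    self.tokenizer.decode(output_ids[0], skip_special_tokens=True)
                )

            out = pb_utils.Tensor(
                "GENERATED",
                np.array([s.encode("utf-8") for s in decoded], dtype=object),
            )
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses
```

The same load-and-generate code, run as a plain script with `python` inside the container, initializes CUDA and places the model on the GPU without error, so the problem only shows up inside the stub process.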