Hi NVIDIA team,
I am trying to deploy the DeepStream Triton Docker image in Docker Swarm mode. However, the container exits right after it starts, and I cannot find the reason.
Here are the steps I used to test whether the image can be deployed by Docker Swarm:
- Initialize Docker Swarm and create a service that runs “nvidia-smi” and then sleeps forever:
docker service create --replicas 1 --name swarm-gpu-test-ds nvcr.io/nvidia/deepstream:6.1-triton bash -c "nvidia-smi && sleep infinity"
- The logs of the running container then show:
===============================
DeepStreamSDK 6.1.0
*** LICENSE AGREEMENT ***
By using this software you agree to fully comply with the terms and conditions
of the License Agreement. The License Agreement is located at
/opt/nvidia/deepstream/deepstream/LicenseAgreement.pdf. If you do not agree
to the terms and conditions of the License Agreement do not use the software.
=============================
== Triton Inference Server ==
NVIDIA Release 22.03 (build 33743047)
Triton Server Version 2.20.0
Copyright (c) 2018-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
- The logs do not show any error. However, Docker Swarm reports “Detected task failure” and starts a replacement container (inspection commands below).
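For completeness, this is how the failing task can be inspected on the manager node (a minimal sketch; swarm-gpu-test-ds is the service created above):

```
# Show the task history with full (untruncated) error messages and exit codes
docker service ps --no-trunc swarm-gpu-test-ds

# Stream the service logs across the repeated restarts
docker service logs --follow swarm-gpu-test-ds
```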
However, when I run a different image in Docker Swarm, docker service create --replicas 1 --name swarm-gpu-test-ds nvcr.io/nvidia/deepstream:6.1-samples bash -c "nvidia-smi && sleep infinity", Docker Swarm starts the container without any error: “nvidia-smi” is executed and the container is then put to sleep.
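Since the samples image works while the Triton image does not, the two images' entrypoints can also be compared (a sketch; my guess, not confirmed, is that the Triton variant runs an extra startup script before the service command):

```
# Print each image's ENTRYPOINT for comparison
docker image inspect --format '{{json .Config.Entrypoint}}' nvcr.io/nvidia/deepstream:6.1-samples
docker image inspect --format '{{json .Config.Entrypoint}}' nvcr.io/nvidia/deepstream:6.1-triton
```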
I verified that the NVIDIA container runtime is installed correctly by running: sudo docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi.
The GPU is also already enabled for Docker Swarm, which I verified by running: docker service create --replicas 1 --name swarm-gpu-test-default nvidia/cuda:11.6.2-base-ubuntu20.04 bash -c "nvidia-smi && sleep infinity". My runtime configuration is sketched below.
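For reference, GPUs are exposed to Swarm services on my node by making nvidia the default runtime (a sketch of the usual setup; Swarm services do not accept the --gpus flag, so this is the common approach):

```
$ cat /etc/docker/daemon.json
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
$ sudo systemctl restart docker
```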
Moreover, I can run the Triton server image in Docker Swarm without any problem: docker service create --replicas 1 --name swarm-gpu-test-ds nvcr.io/nvidia/tritonserver:22.02-py3 bash -c "nvidia-smi && sleep infinity".
Furthermore, I can use docker run or docker-compose to run nvcr.io/nvidia/deepstream:6.1-triton without any error, for example as shown below.
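This is the kind of non-Swarm invocation that works for me (a sketch; it mirrors the Swarm test above):

```
# Plain docker run of the same image and command that fails under Swarm
docker run --rm --runtime=nvidia --gpus all \
  nvcr.io/nvidia/deepstream:6.1-triton \
  bash -c "nvidia-smi && sleep infinity"
```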
I also tried nvcr.io/nvidia/deepstream:6.0.1-triton and nvcr.io/nvidia/deepstream:6.2-triton, but neither of them could be deployed either.
What should I do so that I can deploy the image nvcr.io/nvidia/deepstream:6.1-triton with Docker Swarm?
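In case it helps narrow things down, one workaround I am considering is bypassing the image's entrypoint so that only my command runs (a sketch, untested; the service name swarm-gpu-test-noentry is just an example):

```
# Clear the ENTRYPOINT so any startup script baked into the image is skipped
docker service create --replicas 1 --name swarm-gpu-test-noentry \
  --entrypoint "" \
  nvcr.io/nvidia/deepstream:6.1-triton \
  bash -c "nvidia-smi && sleep infinity"
```

Is that a supported way to run this image, or would it break the Triton setup inside the container?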
• Hardware Platform: GeForce GTX 1080 Ti (dGPU), Ubuntu 20.04.5 LTS
• DeepStream Version: 6.0.1, 6.1, 6.2
• NVIDIA GPU Driver Version (valid for GPU only): 530.30.02; CUDA Version: 12.1