Cannot start nvcr.io/nvidia/deepstream:6.1-triton container in Docker Swarm

tangngoctuantttt · May 30, 2023, 6:58am

Hi Nvidia team,

I am trying to deploy the DeepStream Triton Docker image in Docker Swarm mode. However, the container exits right after it starts, and I do not know why.

Here is how I tested whether the image can be deployed with Docker Swarm.

I started Docker Swarm and created a service that runs nvidia-smi and then sleeps forever:

docker service create --replicas 1 --name swarm-gpu-test-ds nvcr.io/nvidia/deepstream:6.1-triton bash -c "nvidia-smi && sleep infinity"

The logs of the running container then show:

===============================
DeepStreamSDK 6.1.0

*** LICENSE AGREEMENT ***
By using this software you agree to fully comply with the terms and conditions
of the License Agreement. The License Agreement is located at
/opt/nvidia/deepstream/deepstream/LicenseAgreement.pdf. If you do not agree
to the terms and conditions of the License Agreement do not use the software.

=============================
== Triton Inference Server ==

NVIDIA Release 22.03 (build 33743047)
Triton Server Version 2.20.0

Copyright (c) 2018-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

The logs do not show any error. However, Docker Swarm reports "Detected task failure" and starts a replacement container.
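When Swarm only reports "Detected task failure", the task history often carries more detail than the container logs. The following is a minimal sketch, assuming the service name from this post (swarm-gpu-test-ds). The helper name show_task_failure is hypothetical, and the script only contacts Docker when the node reports an active Swarm state.

```shell
#!/bin/sh
# Sketch: collect more detail about a failing Swarm task.
# SERVICE defaults to the service name used in the post.
SERVICE="${SERVICE:-swarm-gpu-test-ds}"

# Hypothetical helper: show task history and logs for one service.
show_task_failure() {
  # Task history; --no-trunc keeps the full text of the ERROR column.
  docker service ps --no-trunc "$1"
  # Output of the service's tasks as collected by Swarm.
  docker service logs "$1"
}

# Only query Docker when this node is part of an active Swarm.
if [ "$(docker info --format '{{.Swarm.LocalNodeState}}' 2>/dev/null)" = "active" ]; then
  show_task_failure "$SERVICE"
else
  echo "no active Swarm node found; skipping diagnostics"
fi
```

The ERROR column of docker service ps is the first place to look, since it may name the reason a task was rejected or exited even when the logs look clean.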

However, when I run a different image in Docker Swarm:

docker service create --replicas 1 --name swarm-gpu-test-ds nvcr.io/nvidia/deepstream:6.1-samples bash -c "nvidia-smi && sleep infinity"

Docker Swarm starts the container without any error, nvidia-smi is executed, and the container then sleeps.

I verified that the NVIDIA container runtime is installed correctly by running:

sudo docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi

The GPU is also already enabled for Docker Swarm. I verified this by running:

docker service create --replicas 1 --name swarm-gpu-test-default nvidia/cuda:11.6.2-base-ubuntu20.04 bash -c "nvidia-smi && sleep infinity"
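The service commands in this thread pass no GPU flags, so the GPU presumably reaches Swarm tasks through the Docker daemon's default runtime. That is an assumption, not something stated in the post. A minimal sketch for checking it, with a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: check whether "nvidia" is the daemon's default runtime.
# Assumption: GPU access in Swarm relies on the default runtime.

# Hypothetical helper: succeed only if the given runtime name is "nvidia".
is_nvidia_runtime() {
  [ "$1" = "nvidia" ]
}

if docker info >/dev/null 2>&1; then
  default_runtime="$(docker info --format '{{.DefaultRuntime}}')"
  if is_nvidia_runtime "$default_runtime"; then
    echo "default runtime is nvidia"
  else
    echo "default runtime is $default_runtime; Swarm tasks may not see the GPU"
  fi
else
  echo "docker daemon not reachable; skipping check"
fi
```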

Moreover, I can run the Triton image in Docker Swarm without any problem:

docker service create --replicas 1 --name swarm-gpu-test-ds nvcr.io/nvidia/tritonserver:22.02-py3 bash -c "nvidia-smi && sleep infinity"

Furthermore, I can run nvcr.io/nvidia/deepstream:6.1-triton with docker run or docker-compose without any error.

I also tried with nvcr.io/nvidia/deepstream:6.0.1-triton and nvcr.io/nvidia/deepstream:6.2-triton, but none of them can be deployed.

What should I do so that I can deploy with the image “nvcr.io/nvidia/deepstream:6.1-triton”?

• Hardware Platform: GPU 1080ti, Ubuntu 20.04.5 LTS
• DeepStream Version: 6.0.1, 6.1, 6.2
• NVIDIA GPU Driver Version (valid for GPU only): Driver Version: 530.30.02; CUDA Version: 12.1

Fiona.Chen · May 31, 2023, 1:24am

Can you make sure the driver version matches the DeepStream compatibility requirements? See the Quickstart Guide in the DeepStream 6.2 Release documentation. For example, if you want to run the DeepStream 6.1 docker image, the driver version on the host should be 515.65.01.
Driver 530.30.02 is just a beta version.

tangngoctuantttt · May 31, 2023, 2:44am

Thank you for your reply.

I also tried driver version 515.65.01 and rebooted the computer. However, the exact same problem occurred.

In summary:

Commands that work:

docker service create --replicas 1 --name swarm-gpu-test-ds nvcr.io/nvidia/deepstream:6.1-samples bash -c "nvidia-smi && sleep infinity"
sudo docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
docker service create --replicas 1 --name swarm-gpu-test-default nvidia/cuda:11.6.2-base-ubuntu20.04 bash -c "nvidia-smi && sleep infinity"
docker service create --replicas 1 --name swarm-gpu-test-ds nvcr.io/nvidia/tritonserver:22.02-py3 bash -c "nvidia-smi && sleep infinity"
docker-compose with nvcr.io/nvidia/deepstream:6.1-triton (runs without any error)

The only command that does not work is the one that combines DeepStream and Triton:

docker service create --replicas 1 --name swarm-gpu-test-ds nvcr.io/nvidia/deepstream:6.1-triton bash -c "nvidia-smi && sleep infinity"

Can you guess what the problem is here?

Fiona.Chen · June 1, 2023, 1:58am

We are investigating the issue and will get back to you when there is any progress.

tangngoctuantttt · June 1, 2023, 2:54am

Hi @Fiona.Chen ,

I figured out the problem.
I noticed that the container started without any problem but did not accept the command.
So I passed the command explicitly as the entrypoint, and it worked perfectly.

Therefore, the solution should be:

Instead of using:

docker service create --replicas 1 --name swarm-gpu-test-ds nvcr.io/nvidia/deepstream:6.1-triton bash -c "nvidia-smi && sleep infinity"

Please use:

docker service create --replicas 1 --name swarm-gpu-test-ds --entrypoint "bash -c 'nvidia-smi && sleep infinity'" nvcr.io/nvidia/deepstream:6.1-triton

Thus, there should not be any further errors.
Thank you so much for your help.
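The workaround above can be sketched as a small script. It assumes the image and service names from this thread; the helper entrypoint_for is hypothetical. The thread does not establish why the positional command is rejected, so the script also prints the image's own ENTRYPOINT and CMD for comparison (this requires the image to be present locally). As far as I know, the Docker CLI splits the --entrypoint string for services with shell-like quoting rules, which is why the single-quoted inner command is passed on as one argument.

```shell
#!/bin/sh
# Sketch of the workaround from the post: pass the command via --entrypoint.
IMAGE="nvcr.io/nvidia/deepstream:6.1-triton"   # image from the post
SERVICE="swarm-gpu-test-ds"                    # service name from the post

# Hypothetical helper: wrap a shell command line into one --entrypoint string.
# The command itself must not contain single quotes.
entrypoint_for() {
  printf "bash -c '%s'" "$1"
}

ENTRYPOINT="$(entrypoint_for 'nvidia-smi && sleep infinity')"
echo "entrypoint: $ENTRYPOINT"

# Only touch Docker when this node is part of an active Swarm.
if [ "$(docker info --format '{{.Swarm.LocalNodeState}}' 2>/dev/null)" = "active" ]; then
  # Show what the image itself defines before overriding it.
  docker image inspect --format '{{json .Config.Entrypoint}} {{json .Config.Cmd}}' "$IMAGE"
  # Create the service with the command supplied as the entrypoint.
  docker service create --replicas 1 --name "$SERVICE" \
    --entrypoint "$ENTRYPOINT" "$IMAGE"
fi
```

Overriding the entrypoint also skips whatever the image's own entrypoint script would normally do, so it is worth checking the inspect output before relying on this in a real deployment.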

system · June 15, 2023, 2:55am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
