Is it possible to run Triton Server on a GPU device and GStreamer with nvinferserver on a CPU-only device?

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) GPU
• DeepStream Version 6.4
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type (questions, new requirements, bugs) Question
• How to reproduce the issue? (This is for bugs. Include which sample app is being used, the configuration file contents, the command line used, and other details for reproducing.)
• Requirement details (This is for new requirements. Include the module name, i.e. for which plugin or which sample application, and the function description.)

Hi, I am currently running DeepStream with nvinfer successfully for our application, but we are hitting GPU quota limits in Azure that restrict the number of T4 VMs we can use. We would like to move to multiple models, and we want to check whether the following setup is viable:

  • GPU devices: 4-vCPU servers (with a T4 GPU) running only Triton Inference Server, with the models attached to an MLOps pipeline.

  • CPU-only devices: 4-vCPU servers (without a GPU) running a CPU-only GStreamer pipeline with the “nvinferserver” plugin performing inference via gRPC calls to Triton. All other parts of the pipeline use CPU-only alternatives, including tracking.

We would have multiple CPU-only devices sharing the same Triton Server to perform inference on multiple models.
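
For reference, the CPU-only devices would only need network access to the Triton gRPC endpoint. A minimal connectivity check with the tritonclient Python package might look like the sketch below; the URL and model name are placeholders for our deployment:

```python
# Connectivity check from a CPU-only device to the shared Triton server over gRPC.
# Requires: pip install tritonclient[grpc]  (no CUDA needed on the client side)
import tritonclient.grpc as grpcclient

TRITON_URL = "triton-gpu-vm.internal:8001"  # placeholder: our Triton gRPC endpoint
MODEL_NAME = "primary_detector"             # placeholder: one of the deployed models

client = grpcclient.InferenceServerClient(url=TRITON_URL)

# Health and readiness checks go over gRPC only; no local GPU is involved.
print("server live: ", client.is_server_live())
print("server ready:", client.is_server_ready())
print("model ready: ", client.is_model_ready(MODEL_NAME))

# Inspect the input/output tensors that Triton exposes for the model.
print(client.get_model_metadata(MODEL_NAME))
```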

Is that setup viable? Can we run the “nvinferserver” plugin on a CPU-only device and connect to Triton Inference Server over gRPC? If so, how can we compile only the nvinferserver plugin without CUDA access?

Thanks!

The “nvinferserver” plugin needs a GPU even in gRPC mode, so this setup is not possible.

Thanks for the answer. Is there any other GStreamer plugin (CPU-only) that can communicate with Triton using gRPC?

I don’t think there is any other GStreamer plugin for such a case.
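
For anyone with the same problem: one possible workaround, outside of DeepStream, is to pull decoded frames out of a plain CPU GStreamer pipeline (for example via appsink) and call Triton directly with the tritonclient gRPC API. This is only a rough sketch under assumed tensor names, shapes, and preprocessing:

```python
# CPU-only inference against a remote Triton server via gRPC, without nvinferserver.
# Requires: pip install numpy tritonclient[grpc]
import numpy as np
import tritonclient.grpc as grpcclient

TRITON_URL = "triton-gpu-vm.internal:8001"  # placeholder gRPC endpoint
MODEL_NAME = "primary_detector"             # placeholder model name

client = grpcclient.InferenceServerClient(url=TRITON_URL)

def infer_frame(frame: np.ndarray) -> np.ndarray:
    """Send one preprocessed frame (1x3xHxW float32) to Triton, return the raw output tensor."""
    inp = grpcclient.InferInput("input", frame.shape, "FP32")   # "input" is a placeholder tensor name
    inp.set_data_from_numpy(frame)
    out = grpcclient.InferRequestedOutput("output")             # "output" is a placeholder tensor name
    result = client.infer(model_name=MODEL_NAME, inputs=[inp], outputs=[out])
    return result.as_numpy("output")

# In the real pipeline the frame would come from a plain GStreamer appsink
# (videoconvert/videoscale on the CPU); here a dummy frame stands in for it.
dummy_frame = np.zeros((1, 3, 544, 960), dtype=np.float32)
print(infer_frame(dummy_frame).shape)
```

The trade-off is that everything nvinferserver normally handles, such as batching, preprocessing, postprocessing, and attaching metadata for downstream tracking, has to be reimplemented by hand.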
