Is it possible to run Triton Server on a GPU device and GStreamer with nvinferserver on a CPU-only device?

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) GPU
• DeepStream Version 6.4
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type (questions, new requirements, bugs) Question
• How to reproduce the issue? (This is for bugs. Include which sample app is being used, the configuration file contents, the command line used, and other details for reproducing.)
• Requirement details (This is for new requirements. Include the module name, i.e. for which plugin or which sample application, and the function description.)

Hi, I am currently running DeepStream with nvinfer successfully for our application, but we are hitting GPU quota limits in Azure that restrict the number of T4 VMs we can use. We would like to move to multiple models, and we want to check whether the following setup is viable:

  • GPU devices: 4-vCPU servers (with a T4 GPU) running only Triton Inference Server, with the models attached to an MLOps pipeline.

  • CPU-only devices: 4-vCPU servers (without a GPU) running a CPU-only GStreamer pipeline with the “nvinferserver” plugin performing inference via gRPC calls to Triton. All other parts of the pipeline use CPU-only alternatives, including tracking.

We would have multiple CPU-only devices sharing the same Triton Server to perform inference on multiple models.
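
For reference, the CPU-only devices would only need network access to the Triton gRPC endpoint. A minimal connectivity check with the tritonclient Python package might look like the sketch below; the URL and model name are placeholders for our deployment:

```python
# Connectivity check from a CPU-only device to the shared Triton server over gRPC.
# Requires: pip install tritonclient[grpc]  (no CUDA needed on the client side)
import tritonclient.grpc as grpcclient

TRITON_URL = "triton-gpu-vm.internal:8001"  # placeholder: our Triton gRPC endpoint
MODEL_NAME = "primary_detector"             # placeholder: one of the deployed models

client = grpcclient.InferenceServerClient(url=TRITON_URL)

# Health and readiness checks go over gRPC only; no local GPU is involved.
print("server live: ", client.is_server_live())
print("server ready:", client.is_server_ready())
print("model ready: ", client.is_model_ready(MODEL_NAME))

# Inspect the input/output tensors that Triton exposes for the model.
print(client.get_model_metadata(MODEL_NAME))
```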

Is that setup viable? Can we run the “nvinferserver” plugin on a CPU-only device and connect to Triton Inference Server over gRPC? If so, how can we compile only the nvinferserver plugin without CUDA access?

Thanks!

The “nvinferserver” plugin needs a GPU even in gRPC mode, so this setup is not possible.

Thanks for the answer. Is there any other GStreamer plugin (CPU-only) that can communicate with Triton using gRPC?

I don’t think there is any other GStreamer plugin for such a case.
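
For anyone with the same problem: one possible workaround, outside of DeepStream, is to pull decoded frames out of a plain CPU GStreamer pipeline (for example via appsink) and call Triton directly with the tritonclient gRPC API. This is only a rough sketch under assumed tensor names, shapes, and preprocessing:

```python
# CPU-only inference against a remote Triton server via gRPC, without nvinferserver.
# Requires: pip install numpy tritonclient[grpc]
import numpy as np
import tritonclient.grpc as grpcclient

TRITON_URL = "triton-gpu-vm.internal:8001"  # placeholder gRPC endpoint
MODEL_NAME = "primary_detector"             # placeholder model name

client = grpcclient.InferenceServerClient(url=TRITON_URL)

def infer_frame(frame: np.ndarray) -> np.ndarray:
    """Send one preprocessed frame (1x3xHxW float32) to Triton, return the raw output tensor."""
    inp = grpcclient.InferInput("input", frame.shape, "FP32")   # "input" is a placeholder tensor name
    inp.set_data_from_numpy(frame)
    out = grpcclient.InferRequestedOutput("output")             # "output" is a placeholder tensor name
    result = client.infer(model_name=MODEL_NAME, inputs=[inp], outputs=[out])
    return result.as_numpy("output")

# In the real pipeline the frame would come from a plain GStreamer appsink
# (videoconvert/videoscale on the CPU); here a dummy frame stands in for it.
dummy_frame = np.zeros((1, 3, 544, 960), dtype=np.float32)
print(infer_frame(dummy_frame).shape)
```

The trade-off is that everything nvinferserver normally handles, such as batching, preprocessing, postprocessing, and attaching metadata for downstream tracking, has to be reimplemented by hand.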
