Multiple (>2) nvinferserver ensembles for Triton server with Deepstream client

• Hardware Platform (GPU)
• DeepStream Version: 6.1
• TensorRT Version: cuda11.3-trt8.0.1.6-ga
• NVIDIA GPU Driver Version: 470.103.01
• Issue Type : questions


I am trying to run an application with three nvinferserver elements (PGIE, SGIE1, SGIE2), each backed by a Triton server ensemble.

My pipeline is as follows: the DeepStream client sends input to the first Triton ensemble (PGIE) and, after inference, parses the output metadata back into the DeepStream client using a C++ parser. The input frame and the first inference's output metadata are then sent to the second Triton ensemble (SGIE-1), with the data clipped via nvinferserver's config. I parse SGIE-1's output metadata back into the DeepStream client by adding it to a custom user-metadata structure; the same structure also holds the first inference's data.
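For context, the clipping step described above is typically driven by the SGIE's nvinferserver config. A minimal sketch of such a config fragment (field names follow the nvinferserver protobuf schema; the model name, URL, and id values are illustrative, and whether you use a gRPC-remote Triton or the in-process CAPI backend depends on your deployment):

```
# Illustrative SGIE-1 nvinferserver config fragment (values are examples)
infer_config {
  unique_id: 2                 # this GIE's own id
  gpu_ids: [0]
  max_batch_size: 16
  backend {
    triton {
      model_name: "sgie1_ensemble"   # hypothetical ensemble name
      version: -1
      grpc { url: "localhost:8001" } # remote Triton; CAPI uses model_repo instead
    }
  }
}
input_control {
  process_mode: PROCESS_MODE_CLIP_OBJECTS   # clip objects from upstream detections
  operate_on_gie_id: 1                      # consume output of the GIE with unique_id 1 (PGIE)
  interval: 0
}
```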

For the third nvinferserver inference (SGIE-2), we use extraInputProcess() from the C++ custom-parser sample to extract the user metadata into the different inputs expected by the Triton ensemble config. However, we are not able to pass the metadata through to the third nvinferserver ensemble. Can you please confirm that executing three nvinferserver elements serially is supported?

If yes, can you briefly elaborate the end-to-end pipeline for my application using NVIDIA DeepStream and Triton's capabilities? How can we leverage DeepStream and Triton to serially execute three nvinferserver elements while managing the output metadata between these different Triton ensembles?

I would really appreciate it if a sample app is available that runs three nvinferserver elements against Triton from a DeepStream client, for reference.

Let me know in case of any queries.


One more thing to add: we have conditional inference. If a detection box from the first inference (PGIE) is smaller than a specific size, that box's clipped frame, with an added margin, is given to the second inference; not all detections from the first inference are fed to the second nvinferserver (SGIE1). We manage this with custom user metadata, attaching the output metadata of both nvinferserver stages to a specific frame meta and feeding it to the third inference (SGIE2). Hope this helps.
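The size gate and margin expansion described above can be sketched as a small self-contained helper (the threshold and margin values are illustrative, and `Rect`, `needsSecondaryInference`, and `clipWithMargin` are hypothetical names; in the real app this logic would run in the PGIE postprocess/probe before the clipped region is handed to SGIE-1):

```cpp
#include <algorithm>

struct Rect { int left, top, width, height; };

// Returns true if the detection is small enough to require secondary inference.
// min_size stands in for the application-specific size threshold.
bool needsSecondaryInference(const Rect &box, int min_size) {
  return box.width < min_size || box.height < min_size;
}

// Expand the detection box by `margin` pixels on each side and clamp it to the
// frame, producing the clip region that would be fed to SGIE-1.
Rect clipWithMargin(const Rect &box, int margin, int frame_w, int frame_h) {
  int left   = std::max(0, box.left - margin);
  int top    = std::max(0, box.top - margin);
  int right  = std::min(frame_w, box.left + box.width + margin);
  int bottom = std::min(frame_h, box.top + box.height + margin);
  return Rect{left, top, right - left, bottom - top};
}
```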


  1. Please refer to the DeepStream samples deepstream-infer-tensor-meta-test and deepstream-app-triton/source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt; they both use three NvInferServer instances (PGIE, SGIE1, SGIE2).
  2. Please refer to TritonOnnxYolo for nvinferserver's extraInputProcess.
  3. About "That detection box's clipped frame with added margin will be given to second inference": you can customize the postprocess parsing function; please refer to custom_parse_bbox_func: "NvDsInferParseCustomTfSSD" in /opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app-triton/config_infer_primary_detector_ssd_mobilenet_v1_coco_2018_01_28.txt

Hi, the given sample is such that the PGIE provides input to both SGIE1 and SGIE2, i.e., the clipped object/frame is given to both SGIEs.

We are looking for a serially cascading pipeline: PGIE to SGIE1 to SGIE2, with PGIE's and SGIE1's output metadata merged and given to SGIE2.
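For what it's worth, the serial cascade itself is just three nvinferserver elements linked back to back; a gst-launch sketch of the topology (the config file names, source URI, and resolutions are placeholders):

```
gst-launch-1.0 \
  uridecodebin uri=file:///path/to/input.mp4 ! m.sink_0 \
  nvstreammux name=m batch-size=1 width=1920 height=1080 ! \
  nvinferserver config-file-path=pgie_config.txt ! \
  nvinferserver config-file-path=sgie1_config.txt ! \
  nvinferserver config-file-path=sgie2_config.txt ! \
  nvvideoconvert ! nvdsosd ! nveglglessink
```

The metadata merging between stages is not expressed in the pipeline string; it lives in each stage's custom parser and user-meta handling.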

Hope this helps.

Please refer to GitHub - NVIDIA-AI-IOT/deepstream_lpr_app (sample app code for LPR deployment on DeepStream): the pgie detects vehicles, sgie1 detects the car license plate based on the pgie's output, and sgie2 detects the license plate text based on sgie1's output.


We have found the issue: it was related to the unique id added to the object alongside the class id in SGIE1's custom parser. We were setting the same unique id in both the PGIE's and SGIE1's custom parsers; setting it according to the config of the respective target GIE made the pipeline sequential. We are now able to feed the data from SGIE1 to SGIE2.
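For anyone hitting the same problem: the fix amounts to giving each GIE its own unique_id in its nvinferserver config and pointing each SGIE's operate_on_gie_id at the upstream GIE, roughly like this (a sketch with illustrative id values; the "..." elide the rest of each config):

```
# PGIE config
infer_config { unique_id: 1  ... }

# SGIE-1 config
infer_config { unique_id: 2  ... }
input_control { operate_on_gie_id: 1 }   # consume output of PGIE (id 1)

# SGIE-2 config
infer_config { unique_id: 3  ... }
input_control { operate_on_gie_id: 2 }   # consume output of SGIE-1 (id 2)
```

The id attached to objects/meta in each custom parser has to match the unique_id of the GIE that produced them, otherwise the next stage cannot find the upstream metadata.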

Thanks for the help.
