Utilizing Triton Inference Server for multi-batch processing with DeepStream

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) x86 RTX-3060
• DeepStream Version 6.1
• JetPack Version (valid for Jetson only) N/A
• TensorRT Version 8.2.5.1
• Triton Version 2.24.0
• NVIDIA GPU Driver Version (valid for GPU only) 8.2.5.1
• Issue Type( questions, new requirements, bugs) questions
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)
Hi,
I am trying to utilize the multi-batch inference using nvds-inferserver plugin to handle multiple input sources with the deepstream app , so ideally batch size should be equal to number of input sources to process, so as per the Gst-nvinferserver documentation the triton inference server need to be installed to handle multi-batch scenario, so i installed the triton-inference server from source(without docker) and other other components inclusing deppstream tensorrt also installed without docker. As deepstream app works fine with nvinfer plugin type but when tried for nvinferserver plugin with below app config settings,

[primary-gie]
enable=1
#(0): nvinfer; (1): nvinferserver
plugin-type=1
gpu-id=0
batch-size=1
#Required by the app for OSD, not a plugin property
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
gie-unique-id=1
nvbuf-memory-type=0
config-file=config/config_yoloV8.txt
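
For the multi-source batching goal, batch-size is normally set to the number of input sources in both the [streammux] group and the GIE group, and the model must allow that batch size. A minimal sketch, assuming 4 sources; the values are illustrative, not taken from this setup:

[streammux]
gpu-id=0
#set to the number of input sources so full batches are formed
batch-size=4
batched-push-timeout=40000
width=1920
height=1080

[primary-gie]
enable=1
plugin-type=1
#should match the streammux batch-size and not exceed the model's max_batch_size
batch-size=4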

After running,

gst-inspect-1.0 nvinferserver

It shows the output below:

(gst-plugin-scanner:195446): GStreamer-WARNING **: 21:58:39.579: Failed to load plugin '/usr/lib/x86_64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_udp.so': librivermax.so.0: cannot open shared object file: No such file or directory

(gst-plugin-scanner:195446): GStreamer-WARNING **: 21:58:39.708: Failed to load plugin '/usr/lib/x86_64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_ucx.so': libucs.so.0: cannot open shared object file: No such file or directory

Factory Details:
  Rank                     primary (256)
  Long-name                NvInferServer plugin
  Klass                    NvInferServer Plugin
  Description              Nvidia DeepStreamSDK TensorRT plugin
  Author                   NVIDIA Corporation. Deepstream for Tesla forum: https://devtalk.nvidia.com/default/board/209

Plugin Details:
  Name                     nvdsgst_inferserver
  Description              NVIDIA DeepStreamSDK TensorRT Inference Server plugin
  Filename                 /usr/lib/x86_64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_inferserver.so
  Version                  6.1.0
  License                  Proprietary
  Source module            nvinferserver
  Binary package           NVIDIA DeepStreamSDK TensorRT Inference Server plugin
  Origin URL               http://nvidia.com/

GObject
 +----GInitiallyUnowned
       +----GstObject
             +----GstElement
                   +----GstBaseTransform
                         +----GstNvInferServer

Pad Templates:
  SRC template: 'src'
    Availability: Always
    Capabilities:
      video/x-raw(memory:NVMM)
                 format: { (string)NV12, (string)RGBA }
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
  
  SINK template: 'sink'
    Availability: Always
    Capabilities:
      video/x-raw(memory:NVMM)
                 format: { (string)NV12, (string)RGBA }
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]

Element has no clocking capabilities.
Element has no URI handling capabilities.

Pads:
  SINK: 'sink'
    Pad Template: 'sink'
  SRC: 'src'
    Pad Template: 'src'

Element Properties:
  batch-size          : Maximum batch size for inference
                        flags: readable, writable, changeable only in NULL or READY state
                        Unsigned Integer. Range: 0 - 1024 Default: 0 
  config-file-path    : Path to the configuration file for this instance of nvinferserver
                        flags: readable, writable, changeable in NULL, READY, PAUSED or PLAYING state
                        String. Default: ""
  infer-on-class-ids  : Operate on objects with specified class ids
                        Use string with values of class ids in ClassID (int) to set the property.
                         e.g. 0:2:3
                        flags: readable, writable, changeable only in NULL or READY state
                        String. Default: ""
  infer-on-gie-id     : Infer on metadata generated by GIE with this unique ID.
                        Set to -1 to infer on all metadata.
                        flags: readable, writable, changeable only in NULL or READY state
                        Integer. Range: -1 - 2147483647 Default: -1 
  interval            : Specifies number of consecutive batches to be skipped for inference
                        flags: readable, writable, changeable only in NULL or READY state
                        Unsigned Integer. Range: 0 - 2147483647 Default: 0 
  name                : The name of the object
                        flags: readable, writable
                        String. Default: "nvinferserver0"
  parent              : The parent of the object
                        flags: readable, writable
                        Object of type "GstObject"
  process-mode        : Inferserver processing mode, (0):None, (1)FullFrame, (2)ClipObject
                        flags: readable, writable, changeable only in NULL or READY state
                        Unsigned Integer. Range: 0 - 2 Default: 0 
  qos                 : Handle Quality-of-Service events
                        flags: readable, writable
                        Boolean. Default: false
  raw-output-generated-callback: Pointer to the raw output generated callback funtion
                        (type: gst_nvinfer_server_raw_output_generated_callback in 'gstnvdsinfer.h')
                        flags: readable, writable, changeable only in NULL or READY state
                        Pointer.
  raw-output-generated-userdata: Pointer to the userdata to be supplied with raw output generated callback
                        flags: readable, writable, changeable only in NULL or READY state
                        Pointer.
  unique-id           : Unique ID for the element. Can be used to identify output of the element
                        flags: readable, writable, changeable only in NULL or READY state
                        Unsigned Integer. Range: 0 - 4294967295 Default: 0 

Referring to some relevant issues like this one, they emphasize using the Docker image from containers/deepstream, but I am not comfortable with Docker, so I installed everything manually without it. After running the deepstream-app, it gives the error below:

(ds-app:198149): GLib-GObject-WARNING **: 22:04:55.013: g_object_set_is_valid_property: object class 'GstNvInferServer' has no property named 'input-tensor-meta'
** INFO: <create_primary_gie_bin:145>: gpu-id: 0 in primary-gie group is ignored, only accept in nvinferserver's config
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
gstnvtracker: Batch processing is ON
gstnvtracker: Past frame output is OFF
[NvMultiObjectTracker] Initialized
[libprotobuf ERROR /home/amkale/jitendrak/Triton-Github/client/build/_deps/repo-third-party-build/grpc-repo/src/grpc/third_party/protobuf/src/google/protobuf/text_format.cc:317] Error parsing text-format nvdsinferserver.config.PluginControl: 61:1: Extension "property" is not defined or is not an extension of "nvdsinferserver.config.PluginControl".
0:00:00.122810548 198149 0x7f408c002330 WARN           nvinferserver gstnvinferserver_impl.cpp:441:start:<primary_gie> error: Configuration file parsing failed
0:00:00.122818756 198149 0x7f408c002330 WARN           nvinferserver gstnvinferserver_impl.cpp:441:start:<primary_gie> error: Config file path: /home/swap/ultraanalytics/Video-Analytics/config/config_yoloV8.txt
0:00:00.122831738 198149 0x7f408c002330 WARN           nvinferserver gstnvinferserver.cpp:459:gst_nvinfer_server_start:<primary_gie> error: gstnvinferserver_impl start failed
0:00:00.122836948 198149 0x7f408c002330 WARN                GST_PADS gstpad.c:1142:gst_pad_set_active:<primary_gie:sink> Failed to activate pad
[NvMultiObjectTracker] De-initialized
** ERROR: <main:716>: Failed to set pipeline to PAUSED
Quitting
ERROR from primary_gie: Configuration file parsing failed
Debug info: gstnvinferserver_impl.cpp(441): start (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInferServer:primary_gie:
Config file path: /home/swap/ultraanalytics/Video-Analytics/config/config_yoloV8.txt
ERROR from primary_gie: gstnvinferserver_impl start failed
Debug info: gstnvinferserver.cpp(459): gst_nvinfer_server_start (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInferServer:primary_gie
App run failed

Any help on this would be appreciated,
Thank You!

As the log shows, parsing nvinferserver's configuration file failed because the "property" extension is not defined: the file appears to be in the gst-nvinfer key/value format (with a [property] group), while nvinferserver expects a protobuf text-format configuration.

  1. Please share config_yoloV8.txt.
  2. Please refer to /opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app-triton/config_infer_plan_engine_primary.txt, which is an nvinferserver configuration file.

Hi,
After going through the documentation and other relevant issues, I figured out that the config file for primary-gie when using nvinferserver must be in protobuf text format (like Triton's .pbtxt files), which is different from the key/value format used by the gst-nvinfer element.
Below are the contents of config_yoloV8_triton.txt:

infer_config {
  unique_id: 1
  gpu_ids: [0]
  max_batch_size: 1
  backend {
    triton {
      model_name: "yolov8"
      version: -1
       model_repo {
        root: "../tensorrt/"
        log_level: 1
        tf_gpu_memory_fraction: 0.2
        tf_disable_soft_placement: 0
      }
    }
  }

  preprocess {
    network_format: IMAGE_FORMAT_RGB
    tensor_order: TENSOR_ORDER_LINEAR
    tensor_name: "x"
    maintain_aspect_ratio: 0
    frame_scaling_hw: FRAME_SCALING_HW_GPU
    frame_scaling_filter: 1
    normalize {
      scale_factor: 0.0039215697906911373
      channel_offsets: [0, 0, 0]
    }
  }

  postprocess {
    labelfile_path: "../labels.txt"
    detection {
      num_detected_classes: 80
      custom_parse_bbox_func: "NvDsInferParseCustomYoloV8"
      per_class_params {
        key: 0
        value { pre_threshold: 0.4 }
      }
      nms {
        confidence_threshold:0.2
        topk:20
        iou_threshold:0.5
      }
    }
  }

  custom_lib {
    path: "../plugins/libs/libnvdsinfer_custom_bbox_yoloV8.so"
  }

  extra {
    copy_input_to_host_buffers: false
    output_buffer_pool_size: 2
  }
}
input_control {
  process_mode: PROCESS_MODE_FULL_FRAME
  operate_on_gie_id: -1
  interval: 0
}
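
For the original multi-batch goal, the max_batch_size under infer_config would eventually need to be raised to the number of sources, and it must stay within the max_batch_size declared in the Triton model's config.pbtxt. A hedged sketch assuming 4 sources; only the relevant fields are shown, the rest stays as above:

infer_config {
  unique_id: 1
  gpu_ids: [0]
  # assumption: 4 input sources; keep this <= max_batch_size in config.pbtxt
  max_batch_size: 4
}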

But providing this file alone didn't seem to work, so I rearranged the directory structure where the TensorRT engine file is located, as shown below:

tensorrt/
└── yolov8
    └── 1
        ├── config.pbtxt
        └── model.plan

The contents of config.pbtxt are shown below:

name: "yolov8"
platform: "tensorrt_plan"
max_batch_size: 10
input {
  name: "images"
  data_type: TYPE_FP32
  dims: 3
  dims: 640
  dims: 640
}
output {
  name: "num_dets"
  data_type: TYPE_INT32
  dims: 1
}
output {
  name: "bboxes"
  data_type: TYPE_FP32
  dims: 100
  dims: 4
}
output {
  name: "scores"
  data_type: TYPE_FP32
  dims: 100
}
output {
  name: "labels"
  data_type: TYPE_INT32
  dims: 100
}
instance_group [
  {
  count: 1
  kind: KIND_GPU
  gpus: [0]
  }
]

default_model_filename: "/home/swap/Video-Analytics/weights/tensorrt/yolov8/1/model.plan"
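
As a side note for batching across requests, Triton's optional dynamic_batching block can also be added to config.pbtxt; the values below are assumptions for illustration, not taken from this setup:

dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
  max_queue_delay_microseconds: 100
}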

But after this it gave the error below:

E1018 06:51:48.501950 65439 model_lifecycle.cc:621] failed to load 'yolov8' version 1: Invalid argument: unable to find 'libtriton_tensorrt.so' for model 'yolov8', searched: /home/swap/Video-Analytics/weights/tensorrt/yolov8/1, /home/swap/Video-Analytics/weights/tensorrt/yolov8, /opt/tritonserver/backends/tensorrt
ERROR: infer_trtis_server.cpp:1051 Triton: failed to load model yolov8, triton_err_str:Invalid argument, err_msg:load failed for model 'yolov8': version 1 is at UNAVAILABLE state: Invalid argument: unable to find 'libtriton_tensorrt.so' for model 'yolov8', searched: /home/swap/Video-Analytics/weights/tensorrt/yolov8/1, /home/swap/Video-Analytics/weights/tensorrt/yolov8, /opt/tritonserver/backends/tensorrt;

ERROR: infer_trtis_backend.cpp:44 failed to load model: yolov8, nvinfer error:NVDSINFER_TRITON_ERROR
ERROR: infer_trtis_backend.cpp:181 failed to initialize backend while ensuring model:yolov8 ready, nvinfer error:NVDSINFER_TRITON_ERROR
0:00:00.346019430 65439 0x5636f7bcec90 ERROR          nvinferserver gstnvinferserver.cpp:361:gst_nvinfer_server_logger:<primary_gie> nvinferserver[UID 1]: Error in createNNBackend() <infer_trtis_context.cpp:247> [UID = 1]: failed to initialize triton backend for model:yolov8, nvinfer error:NVDSINFER_TRITON_ERROR
0:00:00.362714364 65439 0x5636f7bcec90 ERROR          nvinferserver gstnvinferserver.cpp:361:gst_nvinfer_server_logger:<primary_gie> nvinferserver[UID 1]: Error in initialize() <infer_base_context.cpp:79> [UID = 1]: create nn-backend failed, check config file settings, nvinfer error:NVDSINFER_TRITON_ERROR

So, from the above error, DeepStream (when using the nvinferserver plugin for Triton inference) is not able to find the libtriton_tensorrt.so library. I therefore pulled the triton-inference-server/tensorrt_backend repo, built it, and installed it to /opt/tritonserver/backends/tensorrt, but now it gives the error below:

E1018 06:53:17.071849 66350 model_lifecycle.cc:621] failed to load 'yolov8' version 1: Unsupported: triton backend API version does not support this backend
ERROR: infer_trtis_server.cpp:1051 Triton: failed to load model yolov8, triton_err_str:Invalid argument, err_msg:load failed for model 'yolov8': version 1 is at UNAVAILABLE state: Unsupported: triton backend API version does not support this backend;

ERROR: infer_trtis_backend.cpp:44 failed to load model: yolov8, nvinfer error:NVDSINFER_TRITON_ERROR
ERROR: infer_trtis_backend.cpp:181 failed to initialize backend while ensuring model:yolov8 ready, nvinfer error:NVDSINFER_TRITON_ERROR
0:00:00.201816099 66350 0x56388cfc9090 ERROR          nvinferserver gstnvinferserver.cpp:361:gst_nvinfer_server_logger:<primary_gie> nvinferserver[UID 1]: Error in createNNBackend() <infer_trtis_context.cpp:247> [UID = 1]: failed to initialize triton backend for model:yolov8, nvinfer error:NVDSINFER_TRITON_ERROR
0:00:00.284064063 66350 0x56388cfc9090 ERROR          nvinferserver gstnvinferserver.cpp:361:gst_nvinfer_server_logger:<primary_gie> nvinferserver[UID 1]: Error in initialize() <infer_base_context.cpp:79> [UID = 1]: create nn-backend failed, check config file settings, nvinfer error:NVDSINFER_TRITON_ERROR

Can you find libtriton_tensorrt.so on the device?
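For example (path assumptions only, one way to check):

find /opt/tritonserver -name "libtriton_tensorrt.so" 2>/dev/null
ls -l /opt/tritonserver/backends/tensorrt/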

As I pointed out in the previous response, I pulled and installed the triton-inference-server/tensorrt_backend to /opt/tritonserver/backends/tensorrt, and libtriton_tensorrt.so is in that location.

Please try: export LD_LIBRARY_PATH=/opt/tritonserver/backends/tensorrt:$LD_LIBRARY_PATH

That doesn't seem to work either. As I mentioned earlier, libtriton_tensorrt.so is found correctly, but the main error is the one below:

E1018 08:22:28.686172 103075 model_lifecycle.cc:621] failed to load 'yolov8' version 1: Unsupported: triton backend API version does not support this backend
ERROR: infer_trtis_server.cpp:1051 Triton: failed to load model yolov8, triton_err_str:Invalid argument, err_msg:load failed for model 'yolov8': version 1 is at UNAVAILABLE state: Unsupported: triton backend API version does not support this backend;

The directory structure is wrong: config.pbtxt belongs outside the version directory "1", directly under yolov8/. Please refer to the sample and script.
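That is, the expected Triton model repository layout would be (same files as above, with config.pbtxt moved up one level):

tensorrt/
└── yolov8
    ├── config.pbtxt
    └── 1
        └── model.plan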

That's not working either. It looks like a problem with the Triton build process, so I am trying to rebuild a Triton version that is compatible with DeepStream 6.1!
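
For reference, a sketch of how the tensorrt_backend is typically built against release tags that match the running Triton server; the r22.07 tag below is an assumption based on Triton 2.24.0 and may need to be changed to whatever version DeepStream 6.1 was built against:

git clone -b r22.07 https://github.com/triton-inference-server/tensorrt_backend.git
cd tensorrt_backend && mkdir build && cd build
cmake -DCMAKE_INSTALL_PREFIX:PATH=/opt/tritonserver \
      -DTRITON_BACKEND_REPO_TAG=r22.07 \
      -DTRITON_CORE_REPO_TAG=r22.07 \
      -DTRITON_COMMON_REPO_TAG=r22.07 ..
make install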

I tried rebuilding the triton-inference-server/tensorrt_backend repo with the appropriate tags for its git submodules, and now it gives a segmentation fault. I set the Triton log_level: 2 in the config_yoloV8_triton.txt config file:


(ds-app:32793): GLib-GObject-WARNING **: 01:14:09.574: g_object_set_is_valid_property: object class 'GstNvInferServer' has no property named 'input-tensor-meta'
** INFO: <create_primary_gie_bin:145>: gpu-id: 0 in primary-gie group is ignored, only accept in nvinferserver's config
0:00:00.108796130 32793 0x55dc8bca84c0 WARN           nvinferserver gstnvinferserver_impl.cpp:287:validatePluginConfig:<primary_gie> warning: Configuration file batch-size reset to: 1
I1018 19:44:09.686065 32793 pinned_memory_manager.cc:241] Pinned memory pool is created at '0x7f0518000000' with size 268435456
I1018 19:44:09.686194 32793 cuda_memory_manager.cc:107] CUDA memory pool is created on device 0 with size 67108864
I1018 19:44:09.686649 32793 server.cc:604] 
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I1018 19:44:09.686660 32793 server.cc:631] 
+---------+------+--------+
| Backend | Path | Config |
+---------+------+--------+
+---------+------+--------+

I1018 19:44:09.686666 32793 server.cc:674] 
+-------+---------+--------+
| Model | Version | Status |
+-------+---------+--------+
+-------+---------+--------+

I1018 19:44:09.698295 32793 metrics.cc:810] Collecting metrics for GPU 0: NVIDIA GeForce RTX 3050 Laptop GPU
I1018 19:44:09.699899 32793 metrics.cc:703] Collecting CPU metrics
I1018 19:44:09.699982 32793 tritonserver.cc:2435] 
+----------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                            |
+----------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                           |
| server_version                   | 0.0.0                                                                                                                                            |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_share |
|                                  | d_memory binary_tensor_data parameters statistics logging                                                                                        |
| model_repository_path[0]         | /home/swap/ultraanalytics/Video-Analytics/weights/tensorrt                                                                                       |
| model_control_mode               | MODE_EXPLICIT                                                                                                                                    |
| strict_model_config              | 0                                                                                                                                                |
| rate_limit                       | OFF                                                                                                                                              |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                        |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                                                         |
| min_supported_compute_capability | 6.0                                                                                                                                              |
| strict_readiness                 | 1                                                                                                                                                |
| exit_timeout                     | 30                                                                                                                                               |
| cache_enabled                    | 0                                                                                                                                                |
+----------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+

I1018 19:44:09.700556 32793 model_lifecycle.cc:461] loading: yolov8:1
I1018 19:44:09.704425 32793 tensorrt.cc:5427] TRITONBACKEND_Initialize: tensorrt
I1018 19:44:09.704439 32793 tensorrt.cc:5437] Triton TRITONBACKEND API version: 1.15
I1018 19:44:09.704442 32793 tensorrt.cc:5443] 'tensorrt' TRITONBACKEND API version: 1.10
I1018 19:44:09.704492 32793 tensorrt.cc:5486] backend configuration:
{"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","default-max-batch-size":"4"}}
I1018 19:44:09.704623 32793 tensorrt.cc:5538] TRITONBACKEND_ModelInitialize: yolov8 (version 1)
I1018 19:44:09.979876 32793 logging.cc:49] [MemUsageChange] Init CUDA: CPU +466, GPU +0, now: CPU 871, GPU 765 (MiB)
Segmentation fault (core dumped)

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks

  1. There is no default_model_filename in the sample model config.pbtxt; please refer to the sample above.
  2. Please install Triton according to this doc; we usually use the DeepStream Triton docker.
  3. Could you use gdb to get the crash stack (see the sketch below)? If it crashed inside Triton, you could try asking on the Triton GitHub.
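
A sketch of getting the stack, assuming the application binary is ds-app and it takes the same app config file as above:

gdb --args ds-app -c <app config file>
(gdb) run
# after the segmentation fault:
(gdb) bt
(gdb) thread apply all bt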

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.