Triton deployment and inference


I’m trying to run inference on Triton Inference Server with a FasterRCNN model trained with TLT and converted to TensorRT.

In docs I can see:

Note that the models can also be deployed outside of DeepStream using TensorRT but users will need to do image pre-processing and post-process the output Tensor after inference.

What kind of pre-processing and post-processing should I use for FasterRCNN and the other available object detection models?
Currently my pipeline is: OpenCV Mat (8UC3) → 32FC3 → channel-wise array (3, width, height) (RRRGGGBBB) → pack the data into a bytestring and send the request.
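The steps above can be sketched in NumPy. This is a minimal sketch, not the confirmed TLT FasterRCNN pipeline: the input size (960×544 here), channel order, and any mean subtraction or scaling are assumptions that must be checked against your training spec and the engine's input binding.

```python
import numpy as np

def preprocess(image_u8c3):
    """Turn an HxWx3 uint8 (8UC3) image into the planar float32
    byte payload described above.

    Assumptions: the image is already resized to the network input
    resolution, and no mean subtraction or scaling is applied here.
    """
    img = image_u8c3.astype(np.float32)      # 8UC3 -> 32FC3
    chw = np.transpose(img, (2, 0, 1))       # HWC -> CHW (RRR...GGG...BBB)
    batched = chw[np.newaxis, ...]           # add batch dim: (1, 3, H, W)
    return batched.tobytes()                 # raw bytes for the request

# Usage with a synthetic 544x960 image (hypothetical input size)
dummy = np.random.randint(0, 256, (544, 960, 3), dtype=np.uint8)
payload = preprocess(dummy)
```

If the model was trained with mean subtraction or a scale factor (common for TLT detectors), those operations belong between the float conversion and the transpose.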

Then I parse the output, but NMS_1 is always zero.
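For reference, a sketch of parsing the two outputs, assuming the standard TensorRT NMS plugin layout: `NMS` is `(1, 1, keepTopK, 7)` with each row `[image_id, class_id, confidence, xmin, ymin, xmax, ymax]`, and `NMS_1` (the keep count) is a single int32 per image. The output names, shapes, and threshold here are assumptions; verify them against your engine's bindings.

```python
import numpy as np

def parse_nms(nms, nms_1, conf_threshold=0.5):
    # NMS_1 holds how many of the keepTopK rows are valid detections
    keep_count = int(nms_1.reshape(-1)[0])
    detections = []
    for det in nms.reshape(-1, 7)[:keep_count]:
        image_id, class_id, conf, x1, y1, x2, y2 = det
        if conf >= conf_threshold:
            detections.append((int(class_id), float(conf),
                               (float(x1), float(y1), float(x2), float(y2))))
    return detections

# Usage with fabricated output tensors (keep_count = 2, one above threshold)
nms = np.zeros((1, 1, 100, 7), dtype=np.float32)
nms[0, 0, 0] = [0, 1, 0.9, 0.1, 0.1, 0.5, 0.5]
nms[0, 0, 1] = [0, 2, 0.3, 0.2, 0.2, 0.4, 0.4]
nms_1 = np.array([[2]], dtype=np.int32)
dets = parse_nms(nms, nms_1)
```

With `NMS_1` stuck at zero, this loop body never runs, which is consistent with getting no detections even when the `NMS` tensor itself looks plausible.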

Triton generated this config:

In TLT 3.0, for post-processing, please refer to deepstream_tao_apps/post_processor at master · NVIDIA-AI-IOT/deepstream_tao_apps · GitHub. For pre-processing, please refer to

In TLT 2.0, for post-processing, please refer to

Reference topics:
For classification network,

For detectnet_v2 network,

Hi to all,
I was able to launch the INCEPTION SSD network plan file via tritonserver and write a python client to query it. However while I can get the correct output for NMS for the bboxes, NMS_1 (Keepcount) always gives me zero value. How is it possible ? The weird thing is that if I run the example script of tensorrt which doesn’t use the triton server I get the correct output of 100. thanks in advance

Hi rob91,

Please open a new topic if this is still an issue. Thanks