Running DeepStream with a RetinaFace model gives wrong output for non-square input image shapes

Hi support team,

I’m using the deepstream-test3 Python example to run the RetinaFace ResNet50 model.
I got the model from this repo: GitHub - biubug6/Pytorch_Retinaface: Retinaface get 80.99% in widerface hard val using mobilenet0.25.
Download link: Retinaface_model_v2 - Google Drive (file Resnet50_Final.pth)

Based on the convert_to_onnx.py file in the GitHub repo, I converted the .pth file to an ONNX file without error.

To run DeepStream, I converted the ONNX file to a .engine file with the --explicitBatch param:

/usr/src/tensorrt/bin/trtexec --explicitBatch --onnx=FaceDetector_3x720x1280.onnx --minShapes=input:1x3x720x1280 --optShapes=input:32x3x720x1280 --maxShapes=input:32x3x720x1280 --saveEngine=model.batch1-32-720x1280.engine
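
To double-check the engine, I can deserialize it and print the binding shapes and optimization profile (a quick sanity-check sketch of my own with the TensorRT Python API, not part of the sample; the engine file name is the one from the command above):

# My own quick sanity check: deserialize the engine and print binding shapes
# and the optimization profile, to confirm the dynamic input profile was
# actually baked in.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
with open("model.batch1-32-720x1280.engine", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
    for i in range(engine.num_bindings):
        # -1 in a dimension means that axis is dynamic (the batch axis here)
        print(engine.get_binding_name(i), tuple(engine.get_binding_shape(i)))
        if engine.binding_is_input(i):
            print("  profile (min/opt/max):", engine.get_profile_shape(0, i))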

I wrote custom code in a pad probe function after the PGIE to do some post-processing, converting the TensorRT model outputs into bounding-box coordinates. But the final TensorRT bbox output does not seem to match the ONNX model output. Below are the first few rows of the output coordinates, in the format [4 bbox coordinates] [1 confidence]:


ONNX input image: 1280x720

-> bboxes, conf: 
(180.20432, 376.52902, 248.42604, 457.669, 0.99970335)
(652.8903, 399.71368, 718.46405, 473.9985, 0.99964297)
(814.8349, 117.745026, 880.6983, 196.94049, 0.99956053)
(124.28804, 516.43823, 187.45502, 590.3951, 0.99950266)
(872.94556, 454.53992, 932.13556, 534.32825, 0.99934095)
(953.01324, 281.43137, 1015.52014, 350.7596, 0.9993309)

TensorRT input image: 1280x720
-> bboxes, conf:
(888.7898, 382.60144, 1010.4221, 428.3359, 0.999685)
(1160.0034, 413.8329, 1277.151, 455.6716, 0.99967617)
(604.57526, 391.67694, 724.69055, 436.8149, 0.99958414)
(168.39978, 138.1584, 285.6008, 182.83948, 0.9995832)
(789.9761, 524.51733, 902.11475, 565.9892, 0.9995227)
(129.55733, 302.13498, 241.45831, 341.36035, 0.99936026)
(414.65784, 480.48514, 519.47003, 525.595, 0.9993399)
...

But when we use an input image whose width and height are equal (e.g. 800x800), the results of the ONNX and DeepStream TensorRT models are quite similar.

ONNX input image: 800x800
-> bboxes, conf: 
(175.73035, 15.69736, 218.86465, 83.27333, 0.99593616)
(406.31818, 446.80933, 450.3378, 519.4465, 0.99531084)
(591.0689, 318.55777, 636.0353, 385.14783, 0.9951623)
(85.07462, 326.9187, 130.29005, 398.35226, 0.9948461)
(573.6182, 3.2033205, 617.547, 66.30011, 0.9943182)
(507.7378, 134.5662, 553.2838, 216.61005, 0.9939587)
...

TensorRT input image: 800x800
-> bboxes, conf:
(176.44965, 15.834936, 219.82768, 83.319435, 0.99557555)
(406.9334, 448.04874, 450.62445, 519.9374, 0.994955)
(85.84751, 327.6398, 131.38907, 398.96964, 0.9947095)
(591.5082, 319.41702, 636.5945, 386.33548, 0.9946694)
(574.3133, 3.5823674, 618.08594, 66.684364, 0.99455625)
(508.14825, 134.8997, 553.753, 216.69969, 0.99305314)
...

So what is the problem here, and how can I fix it?
Thank you so much!

Link to source code: File on MEGA

Environment

• Hardware Platform: Tesla T4
• DeepStream Version: 5.0
• TensorRT Version: 7.2.1
• PyTorch 1.6
• ONNX v6
• NVIDIA GPU: Driver Version 455.32, CUDA Version 11.1
• OS: Ubuntu 18.04

Hi @SonTV ,
could you clarify what these are?
tensorrt bboxes output?
onnx model output?
ONNX input image?
TensorRT input image?

Hi @mchi, sorry for this late response.

Input image of tensorrt and onnx model is the same image: https://github.com/biubug6/Pytorch_Retinaface/blob/master/curve/test.jpg

For the ONNX model output, I use the detect.py file (https://github.com/biubug6/Pytorch_Retinaface/blob/master/detect.py) to load the ONNX model, draw the bounding boxes, and save the result as an image.
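
(As a side check, independent of detect.py, a minimal onnxruntime sketch like the one below can confirm the raw ONNX outputs; the input name is read from the model rather than assumed, and the input is a random placeholder, not a real frame:)

# Quick raw-output check of the exported ONNX model (my sketch; detect.py
# does the real preprocessing and post-processing).
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("FaceDetector_3x720x1280.onnx")
inp = sess.get_inputs()[0]
dummy = np.random.rand(1, 3, 720, 1280).astype(np.float32)  # placeholder, not a real image
for out in sess.run(None, {inp.name: dummy}):
    print(out.shape)  # loc / conf / landms heads of RetinaFace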

For the TensorRT model, I do a few steps. I put the model into the DeepStream pipeline, change the width/height config (1280x720 or 800x800), add output-tensor-meta=1 to the config file, and get the raw tensor data from the output layer in a probe function (after the PGIE). In the probe function, I run the post-processing (lines 104-143 in https://github.com/biubug6/Pytorch_Retinaface/blob/master/detect.py) to keep the final bboxes. When the input image is rectangular (e.g. 1280x720), there are many final bboxes in the output and their coordinates are incorrect. Below is an example output of the DeepStream pipeline:
https://drive.google.com/file/d/1_WUDfQjjkx-Nm6rTz8YRHz5aDZr_LCxC/view?usp=drivesdk
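
In outline, the probe logic is roughly the sketch below (simplified, not my exact source: the tensor-meta access follows the deepstream-ssd-parser Python sample, decode() is adapted from utils/box_utils.py in the RetinaFace repo, and priors is the repo's PriorBox output, which has to be built with image_size=(im_height, im_width) — swapping those two is one way to get exactly this kind of shifted-box symptom on non-square inputs):

import ctypes
import numpy as np
import pyds
import torch

def decode(loc, priors, variances):
    # Undo the RetinaFace box encoding; priors are (cx, cy, w, h) in [0, 1].
    boxes = torch.cat((
        priors[:, :2] + loc[:, :2] * variances[0] * priors[:, 2:],
        priors[:, 2:] * torch.exp(loc[:, 2:] * variances[1])), dim=1)
    boxes[:, :2] -= boxes[:, 2:] / 2   # (cx, cy) -> (x1, y1)
    boxes[:, 2:] += boxes[:, :2]       # (w, h)  -> (x2, y2)
    return boxes

def layer_to_tensor(layer, shape):
    # Wrap a raw NvDsInferLayerInfo float buffer as a torch tensor.
    ptr = ctypes.cast(pyds.get_ptr(layer.buffer), ctypes.POINTER(ctypes.c_float))
    return torch.from_numpy(np.ctypeslib.as_array(ptr, shape=shape).copy())

def frame_boxes(frame_meta, priors, im_width, im_height, variances=(0.1, 0.2)):
    # Walk the frame user meta for the raw tensors (needs output-tensor-meta=1).
    l_user = frame_meta.frame_user_meta_list
    while l_user is not None:
        user_meta = pyds.NvDsUserMeta.cast(l_user.data)
        if user_meta.base_meta.meta_type == pyds.NvDsMetaType.NVDSINFER_TENSOR_OUTPUT_META:
            tensor_meta = pyds.NvDsInferTensorMeta.cast(user_meta.user_meta_data)
            # Layer index 0 is assumed to be the loc head here; the real
            # order/names depend on how the ONNX was exported.
            loc = layer_to_tensor(pyds.get_nvds_LayerInfo(tensor_meta, 0),
                                  (priors.shape[0], 4))
            boxes = decode(loc, priors, variances)
            # Scale normalized boxes back to pixels: x by width, y by height.
            scale = torch.tensor([im_width, im_height, im_width, im_height],
                                 dtype=torch.float32)
            return boxes * scale
        l_user = l_user.next
    return None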

Hi @mchi,

I also followed the instructions in this repo: tensorrtx/retinaface at master · wang-xinyu/tensorrtx · GitHub and successfully built a retina_r50.engine file from the C++ code.

Everything is OK when I run the test image with the command ./retina_r50 -d (as mentioned in the README file of the repo).

But when I use this engine in the deepstream-test3 Python example, I run into trouble. The detailed error is described here: https://drive.google.com/file/d/1_XwK7PkyXSH4xJyK_p4vmx0gEHs03JUp/view?usp=sharing

I found the same error in the issues section (INVALID_ARGUMENT: getPluginCreator could not find plugin Decode_TRT version 1 · Issue #37 · wang-xinyu/tensorrtx · GitHub), and the author of the repo says: “this repo is not integrated into deepstream, only calling tensorrt api”. Is that true? And if so, how do I fix it?

Thanks.

Hi @SonTV ,
Could you try setting “maintain-aspect-ratio=1” in nvinfer?
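
That would go in the [property] group of your PGIE config, e.g. (the other keys are whatever you already have):

[property]
# ... your existing model/engine keys ...
output-tensor-meta=1
maintain-aspect-ratio=1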

I downloaded the source code (the 263.59 MB file on MEGA); could you share detailed instructions on how to run the code in the DS docker and reproduce the issue?

Thanks!

Hi @mchi ,

I set the param “maintain-aspect-ratio=1” in my config file, but it doesn’t solve the problem.

I run the test outside the Docker environment with this command:

python3 deepstream_test_3.py file:///opt/nvidia/deepstream/deepstream-5.0/sources/deepstream_python_apps/apps/deepstream-test3/video_face_retina_torch.mp4

Link to video that I use to test: https://drive.google.com/file/d/1a0HMxGIEBuwNBFHMQ3z6Mtt27xpCua7V/view?usp=sharing

Is there anything else I need to install to run your sample?

# python3 deepstream_test_3.py file:////opt/nvidia/deepstream/deepstream-5.1/sources/deepstream_python_apps/apps/deepstream-cus/video_face_retina_torch.mp4
Traceback (most recent call last):
  File "deepstream_test_3.py", line 28, in <module>
    import gi
ModuleNotFoundError: No module named 'gi'

Hi @mchi ,

I followed the steps in the README file here to install the libraries: deepstream_python_apps/apps at master · NVIDIA-AI-IOT/deepstream_python_apps · GitHub
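
If I remember correctly, the step in that README that provides the missing gi module is the GStreamer Python bindings install, something like:

sudo apt-get install python3-gi python3-dev python3-gst-1.0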

Hi @SonTV ,
did you install ‘torch’?
are you running the sample in DS docker?

# python3 deepstream_test_3.py file:////opt/nvidia/deepstream/deepstream-5.1/sources/deepstream_python_apps/apps/deepstream-cus/video_face_retina_torch.mp4
Traceback (most recent call last):
  File "deepstream_test_3.py", line 47, in <module>
    from custom_parser import nvds_infer_parse_custom_code
  File "/opt/nvidia/deepstream/deepstream-5.1/sources/deepstream_python_apps/apps/deepstream-cus/custom_parser.py", line 3, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'

Hi @mchi,

I use PyTorch version 1.6, and I don’t run the sample in DS docker.
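
If torch is missing in your docker environment, installing it with pip should match my setup, e.g.:

pip3 install torch==1.6.0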

Hi @SonTV ,
Here are my findings and suggestions:

  1. You are using TRT 7.2.1 with DS 5.0, but DS 5.0 is not compatible with TRT 7.2, since DS 5.0 was only developed and officially verified with TRT 7.0.
    Frankly, I’m not sure whether this issue is caused by the mismatch of DS and TRT, but users should not use this combination.

  2. I tried to convert your ONNX model to a TRT engine; it failed with TRT 7.0 but succeeded on TRT 7.2. To deploy your ONNX model, you need to use DS 5.1 + TRT 7.2.2 (https://developer.nvidia.com/deepstream-getting-started).

  3. I tried to run your code on the DS 5.1 docker (nvcr.io/nvidia/deepstream:5.0.1-20.09-triton), but ran into some failures related to torch.

So, I think the first step is to use the right combination of DS and TRT, and I would suggest using DS 5.1.

Hi @mchi,

Currently, I cannot update from DS 5.0 to the latest version DS 5.1, because some of our running apps require TRT 7.2.1. I will try it later.

How about my second question (the tensorrtx/retinaface project)? Do you have any clue?

Sorry! Could you recap how to run tensorrtx/retinaface?

Thanks!

Here you are: Running DeepStream with a RetinaFace model gives wrong output for non-square input image shapes - #4 by SonTV

Hello @SonTV, I installed all the requirements and ran your code, but it is stuck at “Starting pipeline”. How can I reproduce your error?

Hi @tomriddle, what error occurred when you ran it?

@SonTV I am not getting any error; it’s just stuck at “Starting pipeline”.

Hi @SonTV ,
I think you have got this issue solved, right?

Hi @mchi,

Yes, I used the RetinaFace model generated by following the instructions in this repo: tensorrtx/retinaface at master · wang-xinyu/tensorrtx · GitHub.
Now you can close this topic.
Thanks