How to debug gstnvinfer with a custom model?

Please provide complete information as applicable to your setup.
Hi,
I am trying to build a simple pipeline (appsrc -> gst-nvinfer (detector) -> fakesink) using a custom model (SSH). I generated the TensorRT engine file, and it runs inference correctly through the TensorRT inference API. But when I added this model to the pipeline, I found that the results differ hugely from those of the TensorRT inference API. I checked the preprocessing in gst-nvinfer (I fetched the result of NvBufSurfTransform in gstnvinfer.cpp) and it looks normal. I also checked the preprocessing-related properties and did not find any errors.

My question is: is there a way to debug this, or a good direction for locating the error? What causes the difference in results between DeepStream and TensorRT?
• Hardware Platform (Jetson / GPU)
• DeepStream Version
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type( questions, new requirements, bugs)
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

How could your pipeline (appsrc -> gst-nvinfer (detector) -> fakesink) run without nvstreammux? Can you elaborate on your pipeline and application?

Hi Fiona,

Sorry for my typo, it's appsrc -> streammux -> gst-nvinfer (detector) -> fakesink.

My application does image inference: we use a face detector to detect faces and draw the bounding boxes on the image. appsrc reads and decodes the image, then sends the decoded image to streammux.
We also use custom postprocessing to get the bounding boxes, but the key point is this: the result in outputlayerinfo is far from the result of our TensorRT inference.


As I mentioned before, I examined the result after NvBufSurfTransform (in gstnvinfer.cpp) and it looks normal. I also checked the parameters in my model_config_file, but did not find anything unusual or different from our TensorRT inference code.

So for now I don't have a good way to debug this. Do you have any advice or experience we could refer to?

You may refer to the DeepStream SDK FAQ (Intelligent Video Analytics / DeepStream SDK - NVIDIA Developer Forums). Item 2, "[DS5.0GA_Jetson_GPU_Plugin] Dump the Inference Input", can dump the TensorRT input data for you to compare.
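Once both inputs are on disk, the comparison itself can be sketched roughly like this. It is only a minimal sketch with NumPy: `compare_tensors` is a hypothetical helper, and it assumes both tensors are float32 in CHW order with the same shape (check the dump format of the patch you actually applied). The reversed-channel check is there because an RGB/BGR mix-up is a common reason for two pipelines to disagree.

```python
import numpy as np


def compare_tensors(reference, candidate, atol=1e-3):
    """Summarize how two CHW float tensors differ (hypothetical helper)."""
    reference = np.asarray(reference, dtype=np.float32)
    candidate = np.asarray(candidate, dtype=np.float32)
    if reference.shape != candidate.shape:
        # Different shapes usually mean different resize or layout settings.
        return {"same_shape": False,
                "shapes": (reference.shape, candidate.shape)}
    diff = np.abs(reference - candidate)
    return {
        "same_shape": True,
        "max_abs_diff": float(diff.max()),
        "mean_abs_diff": float(diff.mean()),
        "match": bool(np.allclose(reference, candidate, atol=atol)),
        # A match after reversing axis 0 hints at swapped channel order.
        "match_if_channels_reversed": bool(
            np.allclose(reference, candidate[::-1], atol=atol)),
    }


if __name__ == "__main__":
    # Synthetic data only. A real dump could be loaded with something like
    # np.fromfile(path, dtype=np.float32).reshape(3, 540, 960), assuming the
    # file holds raw float32 values in CHW order.
    rng = np.random.default_rng(0)
    ref = rng.random((3, 4, 5), dtype=np.float32)
    print(compare_tensors(ref, ref))
    print(compare_tensors(ref, ref[::-1].copy()))
```

If `match` is false but `match_if_channels_reversed` is true, the color format setting is the first thing to look at. A large `max_abs_diff` with a small `mean_abs_diff` points more toward padding or a different scaling filter.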

OK, let me check.

Hi Fiona,
Thanks for your patch!

This is the original image:

This is the result from our TensorRT inference code:

This is the result of NvBufSurfTransform in gstnvinfer:

This is the image before input:

It looks abnormal, but it is not the root cause, because when I send this image to our TensorRT inference code as input, it gets normal results:

So, do you have any ideas?

What are the features of the model? Please describe the input layers and output layers, the pre-processing needed, and the nvinfer config file you used. It would be better to provide the model too.

Hi Fiona,
The data you need can be downloaded here: link: https://pan.baidu.com/s/1IJRAjl_3fip4OVvr3rBLuQ
key: uxaw

Here is some description:
1. Our model is similar to: SSH/scripts at master · mahyarnajibi/SSH · GitHub
It is a detector used to detect faces. We removed the m3-detect head to improve the FPS.
ssh_vgg/vgg16_ssh.caffemodel and ssh_vgg/vgg16_ssh.prototxt are the original Caffe model. ssh_vgg.engine was converted by us, and the GPU we used is a GTX 1070. ssh_pgie_config.txt is the DeepStream pgie config file. The input layer is named data (3,540,960). The output tensors are ssh_cls_prob (2000,1) for the confidence score and ssh_boxes (2000,5) for the bounding boxes; ssh_boxes[:,1:5] is the location of the box: [xmin,ymin,xmax,ymax] (absolute coordinates).

2. Because this model has a layer that TensorRT does not support (ssh_proposal), we wrote a plugin using the TensorRT IPluginV2Ext API. The lib is in ssh_vgg\modelTRT\modelTRT_custom_plugin\lib
    source code: ssh_vgg\modelTRT\modelTRT_custom_plugin\plugins

3. The code of the conversion tool is in ssh_vgg\modelTRT\modelTRT_custom_plugin

4. We use the TensorRT Python API to test the engine; the code is here: ssh_vgg\modelTRT
    You can run it for a TensorRT test: python3 inter_main.py -c [config json file] -a [author name] -m "test" -s [the plugin lib path] (you need to modify infer_sample_ssh_vgg.json)

5. The preprocessing is in ssh_vgg\modelTRT\preprocess.py, function resize_normalization_preprocess (aka resize_normal_pre). It is very simple: image resize -> subtract the mean values -> dimension transpose (from HWC, the cv::Mat dimension format, to CHW).
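For readers without the archive, the normalization and transpose steps of item 5 can be sketched as follows. This is not the code from preprocess.py: `normalize_to_chw` is a hypothetical name, the resize step is left out (the input is assumed to be already resized to 540x960), and the mean values are assumed to be in BGR order, the same numbers that appear in the offsets line of the nvinfer config in this thread. The `scale` parameter is included because the Gst-nvinfer documentation gives its normalization as y = net-scale-factor * (x - mean), so the two paths can be put side by side.

```python
import numpy as np

# Assumption: per-channel means in BGR order (as OpenCV loads images).
MEAN_BGR = np.array([102.9801, 115.9465, 122.7717], dtype=np.float32)


def normalize_to_chw(image_hwc, mean=MEAN_BGR, scale=1.0):
    """Subtract the per-channel mean, scale, and convert HWC to CHW.

    image_hwc is expected to be already resized to the network input size.
    """
    image = np.asarray(image_hwc, dtype=np.float32)
    mean = np.asarray(mean, dtype=np.float32)
    if image.ndim != 3 or image.shape[2] != mean.shape[0]:
        raise ValueError("expected an HWC image whose channel count "
                         "matches the number of mean values")
    # Broadcasting applies each mean value to its own channel (last axis).
    normalized = scale * (image - mean)
    # HWC -> CHW, stored contiguously like a network input buffer.
    return np.ascontiguousarray(normalized.transpose(2, 0, 1))


if __name__ == "__main__":
    # Synthetic gray image standing in for a resized 540x960 BGR frame.
    frame = np.full((540, 960, 3), 128, dtype=np.uint8)
    tensor = normalize_to_chw(frame)
    print(tensor.shape, tensor.dtype)
```

The output of a function like this is the reference that the tensor fed to the engine inside DeepStream should be checked against.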

Your nvinfer config file and DeepStream pipeline details?

6. ssh_pgie_config.txt is the nvinfer config file.
Its content is here:

[property]
gpu-id=0
net-scale-factor=1
offsets=102.9801;115.9465;122.7717
model-engine-file=…/…/…/models/ssh/ssh_vgg.engine
labelfile-path=…/…/…/models/ssh/labels.txt
force-implicit-batch-dim=1
batch-size=1
network-mode=0
num-detected-classes=2
interval=0
gie-unique-id=10
output-blob-names=ssh_boxes;ssh_cls_prob
parse-bbox-func-name=parseSSHBox
custom-lib-path=./libprocesslib.so
model-color-format=1
process-mode=1
network-type=0
maintain-aspect-ratio=1
infer-dims=3;540;960
cluster-mode=4
scaling-filter=1

[class-attrs-0]
pre-cluster-threshold=0.6
[class-attrs-1]
pre-cluster-threshold=0.3

7. The DeepStream pipeline is simple too, just appsrc -> streammux -> pgie -> fakesink.
appsrc is for sending the image. The streammux params are shown in this image:
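One way to line the config above up against a standalone preprocessing script is to pull out only the keys that shape the input tensor. The sketch below uses Python's standard configparser on a copy of those keys. `preprocessing_settings` and `letterbox_geometry` are hypothetical helper names, the fallback values are my understanding of the documented defaults, and the rounding in the geometry helper is an approximation rather than nvinfer's exact arithmetic. With maintain-aspect-ratio=1 the frame is scaled to fit and the rest of the 960x540 input is padding, which a plain resize in a reference script would not reproduce for sources that are not 16:9.

```python
import configparser

# Preprocessing-related keys copied from the [property] section above.
CONFIG_TEXT = """
[property]
net-scale-factor=1
offsets=102.9801;115.9465;122.7717
model-color-format=1
maintain-aspect-ratio=1
infer-dims=3;540;960
"""


def preprocessing_settings(text):
    """Extract the keys that shape the tensor fed to the engine."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    prop = parser["property"]

    def split(raw):
        return [v for v in raw.split(";") if v]

    return {
        # Fallbacks are assumed defaults, not read from the plugin.
        "scale": prop.getfloat("net-scale-factor", fallback=1.0),
        "offsets": [float(v) for v in split(prop.get("offsets", fallback=""))],
        "color_format": prop.getint("model-color-format", fallback=0),
        "keep_aspect": prop.getboolean("maintain-aspect-ratio", fallback=False),
        # infer-dims is channel;height;width.
        "infer_dims": [int(v) for v in split(prop.get("infer-dims", fallback=""))],
    }


def letterbox_geometry(src_w, src_h, dst_w, dst_h):
    """Scaled size and leftover padding for an aspect-preserving fit."""
    if src_w * dst_h >= src_h * dst_w:
        # Source is at least as wide as the target: width is the limit.
        new_w, new_h = dst_w, round(src_h * dst_w / src_w)
    else:
        new_w, new_h = round(src_w * dst_h / src_h), dst_h
    return new_w, new_h, dst_w - new_w, dst_h - new_h


if __name__ == "__main__":
    settings = preprocessing_settings(CONFIG_TEXT)
    print(settings)
    _, net_h, net_w = settings["infer_dims"]
    for size in [(1920, 1080), (1000, 1000)]:
        print(size, "->", letterbox_geometry(*size, net_w, net_h))
```

Any nonzero padding reported here means the boxes coming out of the network are relative to a scaled and padded frame, and the input differs from what a plain resize produces.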

Any updates?

Can you send us your model?

The model is in this zip file.