Hello,
This is a follow-on post from a different thread, but I thought it made more sense to start a new one.
I’ve been using the repo at https://github.com/NVIDIA/retinanet-examples to try to develop a RetinaNet-based network to run in DeepStream on a Jetson Xavier AGX.
In short, after going through the pth -> onnx -> plan model process, my result is that the network runs in deepstream-app, but VERY slowly (<1 fps). So I don’t know whether the problem is my model, my DeepStream config files, or whether the RetinaNet model is simply too heavy for the Jetson Xavier.
If you’re interested, here’s a link to download a zip file containing the .pth file from training, the .onnx file converted from that .pth, and the TensorRT (plan) file from the onnx -> TensorRT conversion.
https://drive.google.com/file/d/1x_sE7eb564NCqIujcmiao1-2IQUFDdZw/view?usp=sharing
Here is the process I followed:
1 - (on Linux host, inside the docker container from retinanet-examples) train the network using the code and process from the NVIDIA/retinanet-examples GitHub repo:
retinanet train face.pth --fine-tune retinanet_rn50fpn.pth --backbone ResNet50FPN --classes 1 --iters 10000 --val-iters 1000 --lr 0.0005 --images /workspace --annotations train.json --val-annotations test.json
2 - (on Linux host, inside the docker container from retinanet-examples) convert the resulting .pth file to ONNX:
retinanet export face.pth face.onnx
3 - (on Jetson) convert the ONNX model to a TensorRT engine - THIS TAKES OVER 26 minutes! (See the trtexec note right after these steps.)
./export face4.onnx face4.plan
4 - (on Jetson) following the instructions in the NVIDIA/retinanet-examples README, edit the DeepStream config files, build the output-processing plugin, and run deepstream-app.
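As an aside, to separate the engine itself from the rest of the DeepStream pipeline, I’m planning to benchmark the plan file directly with trtexec. The path and flags below are my assumption of the usual invocation on JetPack, and if the engine uses the repo’s custom decode/NMS plugins, trtexec may also need that plugin library loaded:
# Time the existing engine by itself, outside of DeepStream
/usr/src/tensorrt/bin/trtexec --loadEngine=face4.plan
# Or build and time an engine straight from the ONNX file in FP16
/usr/src/tensorrt/bin/trtexec --onnx=face4.onnx --fp16 --saveEngine=face4_trtexec.plan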
Here is the deepstream config file (ds_config_1vid.txt):
# Copyright (c) 2018 NVIDIA Corporation. All rights reserved.
#
# NVIDIA Corporation and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA Corporation is strictly prohibited.
[application]
enable-perf-measurement=1
perf-measurement-interval-sec=1
[tiled-display]
enable=0
rows=1
columns=1
width=1280
height=720
gpu-id=0
[source0]
enable=1
type=2
num-sources=1
uri=file:/xavier_ssd/sample_1080p_h264.mp4
gpu-id=0
[streammux]
gpu-id=0
batch-size=1
#batched-push-timeout=-1
## Set muxer output width and height
#width=1280
#height=720
width=640
height=480
#cuda-memory-type=1
enable-padding=1
[sink0]
enable=1
type=3
#1=mp4 2=mkv
container=1
#1=h264 2=h265 3=mpeg4
## only SW mpeg4 is supported right now.
codec=1
sync=0
bitrate=80000000
output-file=/xavier_ssd/output.mp4
source-id=0
[sink1]
enable=0
#Type - 1=FakeSink 2=EglSink 3=File
type=2
sync=1
source-id=0
gpu-id=0
#cuda-memory-type=1
[osd]
enable=1
gpu-id=0
border-width=2
text-size=12
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Arial
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
[primary-gie]
enable=1
gpu-id=0
batch-size=1
gie-unique-id=1
interval=0
labelfile-path=labels_coco.txt
#model-engine-file=/xavier_ssd/face.plan
config-file=infer_config_batch1.txt
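One experiment I still want to try, to rule out the H.264 encoder and file writer as the bottleneck, is replacing [sink0] with a FakeSink (type=1, per the comment in the config), roughly like this:
[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=1
sync=0
source-id=0
gpu-id=0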
Here’s the inference engine config file (infer_config_batch1.txt):
# Copyright (c) 2018 NVIDIA Corporation. All rights reserved.
# NVIDIA Corporation and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA Corporation is strictly prohibited.
# Following properties are mandatory when engine files are not specified:
# int8-calib-file(Only in INT8)
# Caffemodel mandatory properties: model-file, proto-file, output-blob-names
# UFF: uff-file, input-dims, uff-input-blob-name, output-blob-names
# ONNX: onnx-file
#
# Mandatory properties for detectors:
# parse-func, num-detected-classes,
# custom-lib-path (when parse-func=0 i.e. custom),
# parse-bbox-func-name (when parse-func=0)
#
# Optional properties for detectors:
# enable-dbscan(Default=false), interval(Primary mode only, Default=0)
#
# Mandatory properties for classifiers:
# classifier-threshold, is-classifier
#
# Optional properties for classifiers:
# classifier-async-mode(Secondary mode only, Default=false)
#
# Optional properties in secondary mode:
# operate-on-gie-id(Default=0), operate-on-class-ids(Defaults to all classes),
# input-object-min-width, input-object-min-height, input-object-max-width,
# input-object-max-height
#
# Following properties are always recommended:
# batch-size(Default=1)
#
# Other optional properties:
# net-scale-factor(Default=1), network-mode(Default=0 i.e FP32),
# model-color-format(Default=0 i.e. RGB) model-engine-file, labelfile-path,
# mean-file, gie-unique-id(Default=0), offsets, gie-mode (Default=1 i.e. primary),
# custom-lib-path, network-mode(Default=0 i.e FP32)
#
# The values in the config file are overridden by values set through GObject
# properties.
[property]
gpu-id=0
net-scale-factor=0.017352074
offsets=123.675;116.28;103.53
model-engine-file=/xavier_ssd/face.plan
labelfile-path=labels_coco.txt
batch-size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=1
interval=0
gie-unique-id=1
parse-func=0
is-classifier=0
output-blob-names=boxes;scores;classes
parse-bbox-func-name=NvDsInferParseRetinaNet
custom-lib-path=build/libnvdsparsebbox_retinanet.so
#enable-dbscan=1
[class-attrs-all]
threshold=0.5
group-threshold=0
## Set eps=0.7 and minBoxes for enable-dbscan=1
#eps=0.2
##minBoxes=3
#roi-top-offset=0
#roi-bottom-offset=0
detected-min-w=4
detected-min-h=4
#detected-max-w=0
#detected-max-h=0
## Per class configuration
#[class-attrs-2]
#threshold=0.6
#eps=0.5
#group-threshold=3
#roi-top-offset=20
#roi-bottom-offset=10
#detected-min-w=40
#detected-min-h=40
#detected-max-w=400
#detected-max-h=800
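(For reference, my understanding of the normalization values in [property], in case they matter: offsets=123.675;116.28;103.53 are the ImageNet channel means scaled to pixel units, and net-scale-factor is roughly 1 over the average ImageNet channel std in pixel units, i.e.
0.485 * 255 = 123.675, 0.456 * 255 = 116.28, 0.406 * 255 = 103.53
(58.395 + 57.12 + 57.375) / 3 = 57.63, and 1 / 57.63 = 0.017352...
so the network input ends up being approximately (pixel - mean) / std.)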
Here’s the command to run deepstream:
LD_PRELOAD=libnvdsparsebbox_retinanet.so deepstream-app -c ds_config_1vid.txt
Here’s the output (at least the first part of it):
$ LD_PRELOAD=build/libnvdsparsebbox_retinanet.so deepstream-app -c ds_config_1vid.txt
Unknown key 'parse-func' for group [property]
Opening in BLOCKING MODE
Creating LL OSD context new
Runtime commands:
h: Print this help
q: Quit
p: Pause
r: Resume
**PERF: FPS 0 (Avg)
**PERF: 0.00 (0.00)
** INFO: <bus_callback:189>: Pipeline ready
Opening in BLOCKING MODE
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261
** INFO: <bus_callback:175>: Pipeline running
Creating LL OSD context new
NvMMLiteOpen : Block : BlockType = 4
===== NVMEDIA: NVENC =====
NvMMLiteBlockCreate : Block : BlockType = 4
H264: Profile = 66, Level = 0
**PERF: 0.00 (0.00)
**PERF: 5.15 (5.15)
**PERF: 5.23 (5.18)
**PERF: 5.21 (5.19)
**PERF: 5.18 (5.19)
**PERF: 5.21 (5.19)
**PERF: 5.22 (5.20)
**PERF: 5.18 (5.20)
**PERF: 5.23 (5.20)
**PERF: 5.19 (5.20)
**PERF: 5.18 (5.20)
**PERF: 5.22 (5.20)
**PERF: 5.20 (5.20)
**PERF: 5.20 (5.20)
**PERF: 5.23 (5.20)
**PERF: 5.23 (5.20)
**PERF: 5.09 (5.20)
**PERF: 4.59 (5.17)
**PERF: 5.20 (5.17)
**PERF: FPS 0 (Avg)
**PERF: 5.19 (5.17)
**PERF: 5.24 (5.17)
**PERF: 5.21 (5.17)
**PERF: 5.11 (5.17)
**PERF: 4.91 (5.16)
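One thing I still need to rule out on my end is the Xavier’s power mode and clock settings, since I know those can make a big difference. Something along these lines (treat this as a sketch; the exact mode numbers depend on the JetPack version):
# Check the current power mode (I believe MAXN is mode 0 on the AGX Xavier)
sudo nvpmodel -q
sudo nvpmodel -m 0
# Lock the clocks at maximum
sudo jetson_clocks
# Watch GPU utilization while deepstream-app is running
sudo tegrastats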
Bottom Line Question: Why so slow?
Surely this RetinaNet network will run faster than 1 fps on the Xavier!
Thanks for any help you can provide.