Platform: Jetson Xavier NX - JetPack 4.6
I’m developing an application which depends on Human body pose estimation deep learning models. I’m looking for an accurate and lightweight model that I can deploy on an edge computing device such as the Jetson Xavier.
Having tried to run the model in Python/TensorRT, it was advised that I try to deploy the model using Deepstream to improve performance. For more information please refer to: Post.
Model
The model in question is Movenet
which can be downloaded from the TensorFlowHub site.
To build the TensorRT engine required by the NvInfer
plugin in Deepstream, I converted the TF model into TensorRT taking the following steps:
1. tf → onnx (x86)
With the onnx/tensorflow-onnx
conversion tools,
python3 -m tf2onnx.convert --saved-model ./movenet_singlepose_lightning_4/ --output mnli_nchw.onnx --inputs-as-nchw input
The order of the layers must be NCHW for Deepstream therefore I set the argument input-as-nchw
for the input of the network.
2. onnx → TensorRT (Xavier)
/usr/src/tensorrt/bin/trtexec --onnx=mnli_nchw.onnx --saveEngine=mnli_nchw.engine --verbose
Deepstream - Python Bindings
I decided to reference apps/deepstream-test1
from deepstream_python_apps repo. In the modified script input is taken as video in the .H264 format and inference is performeed in each of the frames. A sink probe is used to access the meta data and tensor information given by the Gst-nvinfer
and used to draw the output on the screen using the Gst-nvdsosd
plugin. Please refer to the file:
ds_movenet_pipeline.py (10.9 KB)
The configuration for the inference engine is given by:
ds_pgie_config.txt (2.6 KB)
The output tesor has the dimension [1x17x3] with the first 2 channels of the last dimension being the yx coordinates of the body landmarks (Nose, Left eye, Right Eye… )and the third dimension of the last channel representing the prediction confidence scores. Ref
Note that In the script I split the output of the tensor into 2 arrays, a [17,2] shaped array with the coordinates (xy) and a [17,1] array with the score information.
However the output of the network , as shown below, is wrong.
I was wondering whether someone could help me debug the application so that I can run this model on the Jetson Xavier.
Thanks a lot!