I am following this example, and the results obtained are as follows. The model is resnet18_baseline_att_224x224_A_epoch_249.onnx, converted from resnet18_baseline_att_224x224_A_epoch_249.pth. The white circles are the pose keypoints.

The config is:

The pose is not good. What should I do?

• Hardware Platform (Jetson / GPU): Xavier
• DeepStream Version: 5.0

I had the same problem using a Jetson TX2.
I also tested with a Tesla V100 and couldn’t generate the engine.

I am also experiencing the exact same issue on a Xavier. Does this code work at all?
It would be great to get some feedback on where I am going wrong with this.


Not fixed yet


I am the author of the code and would love to help you out. I am trying to reproduce this problem but have not been able to. Can I ask which model (DenseNet/ResNet) and which sample video you’re using?

It is working for me. I was able to use the Isaac export script for the model.

I tried reproducing the problem by git cloning the repository from scratch onto a Jetson Xavier NX but was not able to reproduce the problem.

Is this how you are running the app?
sudo ./deepstream-pose-estimation-app ../../../../samples/streams/sample_720p.h264 /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream_pose_estimation_master/

  1. Are you not getting a human pose at all? In some of the sample streams the faces are blurred, so the model might be misclassifying some of the points and failing to link detected body parts (for example, between the face and the shoulders), and thus not drawing a human pose at all. But you should definitely have a pose drawn onto the video for most frames. Try a different H.264-encoded stream to see if the problem persists.

  2. Maybe your stream isn’t being decoded properly if you’re using an MP4 file. Try converting a different video file to an H.264 stream to see if the problem persists. One easy way to do this is to use ffmpeg.
    sudo apt install ffmpeg
    ffmpeg -i input.mp4 -vcodec copy -an -bsf:v h264_mp4toannexb output.h264

  3. It is also possible that your model can’t be parsed properly. Did you use the Isaac export script to generate the ONNX file?
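For anyone scripting the conversion in step 2, here is a minimal sketch that builds the same ffmpeg command as an argument list. The function name `build_remux_command` is made up for illustration. Note that `-vcodec copy` only remuxes, so this assumes the MP4 already contains an H.264 video track.

```python
import shlex


def build_remux_command(src, dst):
    """Return the ffmpeg argv that copies the video track of `src` into a
    raw Annex B elementary stream at `dst`, dropping audio.

    Hypothetical helper: the flags are the ones quoted in the post above.
    """
    return [
        "ffmpeg",
        "-i", src,
        "-vcodec", "copy",            # no re-encode, the input must already be H.264
        "-an",                        # drop the audio track
        "-bsf:v", "h264_mp4toannexb", # convert MP4 framing to Annex B start codes
        dst,
    ]


# Print the command line; pass the list to subprocess.run(..., check=True) to execute it.
print(shlex.join(build_remux_command("input.mp4", "output.h264")))
```

If ffmpeg reports an error with these flags, the source video is probably not H.264 and would need a real re-encode instead of a copy.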


Upon further inspection, it seems I can only reproduce your result when the model is not being converted to ONNX successfully. Does the TRT engine show this information upon inferencing when you run the app?

INFO: [Implicit Engine Info]: layers num: 3
0 INPUT kFLOAT input.1 3x224x224
1 OUTPUT kFLOAT 262 18x56x56
2 OUTPUT kFLOAT 264 42x56x56
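To compare your own engine log against the one above without eyeballing it, something like the following could work. It is only a sketch: the regular expression is my guess at the log layout based on the lines quoted in this thread, and `parse_layers` / `matches_reference` are hypothetical helper names, not part of DeepStream.

```python
import re

# Matches lines such as "1 OUTPUT kFLOAT 262 18x56x56".
# Any extra columns after the dims (min/opt/Max) are ignored.
LAYER_RE = re.compile(r"^\s*(\d+)\s+(INPUT|OUTPUT)\s+(\w+)\s+(\S+)\s+(\d+(?:x\d+)*)")

# Output dims from the working engine log quoted above.
EXPECTED_OUTPUTS = [(18, 56, 56), (42, 56, 56)]


def parse_layers(log_text):
    """Extract (kind, name, dims) records from engine-info log text."""
    layers = []
    for line in log_text.splitlines():
        m = LAYER_RE.match(line)
        if m:
            _index, kind, _dtype, name, dims = m.groups()
            layers.append({
                "kind": kind,
                "name": name,
                "dims": tuple(int(d) for d in dims.split("x")),
            })
    return layers


def matches_reference(log_text):
    """True if the OUTPUT layers have exactly the reference dims."""
    outputs = [l["dims"] for l in parse_layers(log_text) if l["kind"] == "OUTPUT"]
    return sorted(outputs) == sorted(EXPECTED_OUTPUTS)


REFERENCE_LOG = """\
INFO: [Implicit Engine Info]: layers num: 3
0 INPUT kFLOAT input.1 3x224x224
1 OUTPUT kFLOAT 262 18x56x56
2 OUTPUT kFLOAT 264 42x56x56
"""

print(matches_reference(REFERENCE_LOG))
```

A log whose outputs read 56x56x18 or that lists an extra output layer would return False here, which is a hint that the ONNX export differs from the model shipped with the app.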

I got this working. Works like a charm. Perfect. Great work.
For those experiencing problems with inference, please use this modified ONNX file:


Glad you were able to get it working @robwuvvv! Please let me know if you have any more issues!


Thank you. The problem must have been with my ONNX model; it works now.

When I convert the trt_pose model resnet18_baseline_att_224x224_A_epoch_249.pth to ONNX, I get the same wrong result. @anujsaharan, can you provide your trained resnet18_baseline_att_224x224_A_epoch_249.pth and the conversion script? I cannot get a correct ONNX model out of trt_pose. After DeepStream reads my model, the outputs are channel-last (56x56x42 and 56x56x18). Even after I change the channel order, the ONNX model’s result is still not right.

Can I use this with an RTSP stream?
If it is possible, how can I do that?
Please give me some advice.

Hi @anujsaharan! Great work, thanks. I was able to run the code with the model you provided without any issue. But converting from the trt_pose models produces a result similar to RayZhang’s output frame. The conversion runs without any issues, but deepstream_pose_estimation reports the outputs differently:
$ python ./ --input_checkpoint resnet18_baseline_att_224x224_A_epoch_249.pth

    0   INPUT  kFLOAT input           3x224x224       min: 1x3x224x224     opt: 1x3x224x224     Max: 1x3x224x224     
    1   OUTPUT kFLOAT part_affinity_fields 56x56x42        min: 0               opt: 0               Max: 0               
    2   OUTPUT kFLOAT heatmap         56x56x18        min: 0               opt: 0               Max: 0               
    3   OUTPUT kFLOAT maxpool_heatmap 56x56x18        min: 0               opt: 0               Max: 0 
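To illustrate why the 56x56x18 layout matters, here is a small sketch of how the same (channel, row, column) element lands at different flat-buffer offsets in channel-first versus channel-last order. This assumes the app's post-processing indexes the buffers as channel-first, as the 18x56x56 dims of the working model suggest. That is my assumption, not something confirmed in this thread, and an earlier reply reports that reordering the channels alone did not fix their result. The helper names are made up.

```python
def chw_offset(c, y, x, C, H, W):
    """Flat offset of element (c, y, x) in a channel-first (CxHxW) buffer."""
    return (c * H + y) * W + x


def hwc_offset(c, y, x, C, H, W):
    """Flat offset of element (c, y, x) in a channel-last (HxWxC) buffer."""
    return (y * W + x) * C + c


def hwc_to_chw(flat, C, H, W):
    """Reorder a flat channel-last buffer into channel-first order."""
    out = [0.0] * (C * H * W)
    for y in range(H):
        for x in range(W):
            for c in range(C):
                out[chw_offset(c, y, x, C, H, W)] = flat[hwc_offset(c, y, x, C, H, W)]
    return out


# Heatmap dims from the logs in this thread: 18 channels on a 56x56 grid.
C, H, W = 18, 56, 56

# The top-left cell of channel 1 sits at very different offsets in the two layouts.
print(chw_offset(1, 0, 0, C, H, W))  # 3136
print(hwc_offset(1, 0, 0, C, H, W))  # 1
```

So code that reads offset 3136 expecting the start of channel 1 would instead get an unrelated value from a channel-last buffer, which could explain keypoints scattered across the frame.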

I’m running on x86_64 and compiling and running the app inside
Any idea what could be the problem? Thanks!

Hi @anujsaharan !
I have the same problem as RayZhang.
The problem occurs when I convert and use models obtained from “” in trt_pose.
However, your model “pose_estimation.onnx” in “deepstream_pose_estimation” works fine.
Could you tell me how you created your default “pose_estimation.onnx”? Thanks!