Running Pose Estimation with DeepStream

I am interested in running this tutorial with my current DeepStream pipeline:
https://docs.nvidia.com/isaac/isaac/packages/skeleton_pose_estimation/doc/2Dskeleton_pose_estimation.html

DeepStream requires a .engine file, and even with that you still need a custom plugin to parse the model output and display the necessary inference data. Are there any resources for this integration?

Hello there,
What does your pipeline look like? Do you use GStreamer?
For example, this is my pipeline; it is very easy to use and does not need a custom .engine file:

"pipeline": {
        "pipeline": "gst-launch-1.0 -v v4l2src device=/dev/video0  ! avdec_mjpeg ! videoconvert ! video/x-raw,format=RGB,height=480,wight=1280,framerate=60/1 ! appsink name = acquired_image "
       },

Hello, I am currently using DeepStream with a .engine file. Would I need to convert the model to a .engine file first?

Well, you probably could use a .engine file, but I think it would be more straightforward to use GStreamer like I did. In that case there is no need for a .engine file, and the pipeline can be a single line in the config section of the JSON file.

I see, thanks for the response. In what format are you storing your model? How are you having GStreamer communicate with the model? Do you have tutorials that cover this process from start to end? I currently have a Jetson Nano.

In your GStreamer pipeline, what would be your sink? Would the sink be the input to your model?

I use a V4L2 camera. My camera outputs MJPEG, so I convert MJPEG to RGB with GStreamer in the pipeline. My camera is stereo, so I crop the image with a CUDA crop to keep only one view. I also implemented an RGB-to-gray function that you don't need. Then I just send the image to the skeleton algorithm, and after that I use a C++ codelet to get the output bones.

This is my JSON file:

{
  "name": "openpose_inference",
  "modules": [
    "//packages/skeleton_pose_estimation/apps/openpose:PoseRX",
    "//packages/skeleton_pose_estimation/apps/openpose:rgb2grey",
    "//packages/ml:ml",
    "//packages/ml:tensorrt",
    "//packages/skeleton_pose_estimation",
    "//packages/perception",
    "sensors:v4l2_camera",
    "deepstream",
    "message_generators",
    "viewers",
    "deepstream",
    "sight"
  ],
	
  "graph": {
    "nodes": [
	{
        "name": "poseRX",
        "components": [
          {
            "name": "message_ledger",
            "type": "isaac::alice::MessageLedger"
          },
          {
            "name": "poseRX",
            "type": "isaac::PoseRX"
          }
        ]
      },
      {
        "name": "feeder",
        "components": [
          {
            "name": "message_ledger",
            "type": "isaac::alice::MessageLedger"
          },
          {
            "name": "image_feeder",
            "type": "isaac::message_generators::ImageLoader"
          },
	  {
            "name": "pipeline",
            "type": "isaac::deepstream::Pipeline"
          }
	]
       },
        {
        "name": "compute",
        "components": [
          {
            "name": "message_ledger",
            "type": "isaac::alice::MessageLedger"
          },
          {
            "name": "crop1",
            "type": "isaac::perception::CropAndDownsampleCuda"
          },
	  {
            "name": "crop2",
            "type": "isaac::perception::CropAndDownsampleCuda"
          },
	  {
            "name": "RGB2GRAY",
            "type": "isaac::opencv::RGB2GREY"
          }
        ]
      },
      {
        "name": "tensor_encoder",
        "components": [
          {
            "name": "message_ledger",
            "type": "isaac::alice::MessageLedger"
          },
          {
            "name": "isaac.ml.ColorCameraEncoderCuda",
            "type": "isaac::ml::ColorCameraEncoderCuda"
          }
        ]
      },
      {
        "name": "tensor_r_t_inference",
        "components": [
          {
            "name": "message_ledger",
            "type": "isaac::alice::MessageLedger"
          },
          {
            "name": "isaac.ml.TensorRTInference",
            "type": "isaac::ml::TensorRTInference"
          }
        ]
      },
      {
        "name": "open_pose_decoder",
        "components": [
          {
            "name": "message_ledger",
            "type": "isaac::alice::MessageLedger"
          },
          {
            "name": "isaac.skeleton_pose_estimation.OpenPoseDecoder",
            "type": "isaac::skeleton_pose_estimation::OpenPoseDecoder"
          }
        ]
      },
      {
        "name": "skeleton_viewer",
        "components": [
          {
            "name": "isaac.alice.MessageLedger",
            "type": "isaac::alice::MessageLedger"
          },
          {
            "name": "isaac.viewers.SkeletonViewer",
            "type": "isaac::viewers::SkeletonViewer"
          },
          {
            "name": "isaac.viewers.ColorCameraViewer",
            "type": "isaac::viewers::ColorCameraViewer"
          }
        ]
      },
       {
        "name": "publishing",
        "components": [
          {
            "name": "message_ledger",
            "type": "isaac::alice::MessageLedger"
          },
          {
            "name": "viewer1",
            "type": "isaac::viewers::ColorCameraViewer"
          },
          {
            "name": "viewer2",
            "type": "isaac::viewers::ColorCameraViewer"
          },
	 {
            "name": "viewer3",
            "type": "isaac::viewers::ColorCameraViewer"
          },
	  {
            "name": "viewer4",
            "type": "isaac::viewers::ColorCameraViewer"
          },
          {
            "name": "viewer1_widget",
            "type": "isaac::sight::SightWidget"
          },
          {
            "name": "viewer2_widget",
            "type": "isaac::sight::SightWidget"
          },
          {
            "name": "viewer3_widget",
            "type": "isaac::sight::SightWidget"
          },
	  {
            "name": "viewer4_widget",
            "type": "isaac::sight::SightWidget"
          }	
        ]
      }
    ],
    "edges": [
       {
        "source": "open_pose_decoder/isaac.skeleton_pose_estimation.OpenPoseDecoder/skeletons",
        "target": "poseRX/poseRX/skeletons"
      },
       {
        "source": "feeder/pipeline/acquired_image",
        "target": "compute/crop1/input_image"
      },
      {
        "source": "feeder/pipeline/acquired_image",
        "target": "compute/crop2/input_image"
      },
      {
        "source": "feeder/pipeline/acquired_image",
        "target": "compute/RGB2GRAY/input_image"
      },
     {
	"source": "compute/crop1/output_image",
        "target": "tensor_encoder/isaac.ml.ColorCameraEncoderCuda/rgb_image"
      },
      {
        "source": "tensor_encoder/isaac.ml.ColorCameraEncoderCuda/tensor",
        "target": "tensor_r_t_inference/isaac.ml.TensorRTInference/input"
      },
      {
        "source": "tensor_r_t_inference/isaac.ml.TensorRTInference/part_affinity_fields",
        "target": "open_pose_decoder/isaac.skeleton_pose_estimation.OpenPoseDecoder/part_affinity_fields"
      },
      {
        "source": "tensor_r_t_inference/isaac.ml.TensorRTInference/gaussian_heatmap",
        "target": "open_pose_decoder/isaac.skeleton_pose_estimation.OpenPoseDecoder/gaussian_heatmap"
      },
      {
        "source": "tensor_r_t_inference/isaac.ml.TensorRTInference/maxpool_heatmap",
        "target": "open_pose_decoder/isaac.skeleton_pose_estimation.OpenPoseDecoder/maxpool_heatmap"
      },
      {
        "source": "open_pose_decoder/isaac.skeleton_pose_estimation.OpenPoseDecoder/skeletons",
        "target": "skeleton_viewer/isaac.viewers.SkeletonViewer/skeletons"
      },
      {
        "source": "compute/crop1/output_image",
        "target": "skeleton_viewer/isaac.viewers.ColorCameraViewer/color_listener"
      },
      {
        "source": "compute/crop1/output_image",
        "target": "publishing/viewer1/color_listener"
      },
      {
       "source": "compute/crop2/output_image",
        "target": "publishing/viewer2/color_listener"
      },
      {
        "source": "feeder/pipeline/acquired_image",
        "target": "publishing/viewer3/color_listener"
      },
      {
        "source": "compute/RGB2GRAY/output_image",
        "target": "publishing/viewer4/color_listener"
      }
    ]
  },
  "config": {
      "poseRX" : {
      "poseRX" : {
        "LogLvl" : "1"
      }
    },
      "compute": {
      "crop1": {
        "crop_start": [0, 0],
        "crop_size": [480, 640]
       },
      "crop2": {
        "crop_start": [0, 640],
        "crop_size": [480, 640]
       }
      },
      "publishing": {
      "viewer1_widget": {
        "title": "Viewer: Video Left",
        "type": "2d",
        "channels": [
          {
            "name": "publishing/viewer1/Color"
          }
        ]
      },
      "viewer2_widget": {
        "title": "Viewer: Video Right",
        "type": "2d",
        "channels": [
          {
            "name": "publishing/viewer2/Color"
          }
        ]
      },
       "viewer3_widget": {
        "title": "Viewer: Acquis ",
        "type": "2d",
        "channels": [
          {
            "name": "publishing/viewer3/Color"
          }
        ]
      },
      "viewer4_widget": {
        "title": "Viewer: GRAY",
        "type": "2d",
        "channels": [
          {
            "name": "publishing/viewer4/Color"
          }
        ]
      }
    },
    "tensor_encoder": {
      "isaac.ml.ColorCameraEncoderCuda": {
        "rows": 320,
        "cols": 320,
        "pixel_normalization_mode": "HalfAndHalf",
        "tensor_index_order": "201",
        "keep_aspect_ratio": false
      }
    },
    "feeder": {
       "pipeline": {
        "pipeline": "gst-launch-1.0 -v v4l2src device=/dev/video0  ! avdec_mjpeg ! videoconvert ! video/x-raw,format=RGB,height=480,wight=1280,framerate=60/1 ! appsink name = acquired_image "
       },
      "image_feeder": {
        "color_filename": "packages/skeleton_pose_estimation/apps/openpose/validation_dataset/images/01.png",
        "tick_period": "1Hz",
        "focal_length": [
          100,
          100
        ],
        "optical_center": [
          500,
          500
        ],
        "distortion_coefficients": [
          0.01,
          0.01,
          0.01,
          0.01,
          0.01
        ]
      }
    },
    "tensor_r_t_inference": {
      "isaac.ml.TensorRTInference": {
        "model_file_path": "external/openpose_model/ix-networks-openpose.uff",
        "engine_file_path": "external/openpose_model/ix-networks-openpose.plan",
        "input_tensor_info": [
          {
            "channel": "input",
            "operation_name": "input_1",
            "dims": [3, 320, 320],
            "uff_input_order": "channels_first"
          }
        ],
        "output_tensor_info": [
          {
            "channel": "part_affinity_fields",
            "operation_name": "lambda_2/conv2d_transpose",
            "dims": [160, 160, 38]
          },
          {
            "channel": "gaussian_heatmap",
            "operation_name": "lambda_3/tensBlur_depthwise_conv2d",
            "dims": [160, 160, 19]
          },
          {
            "channel": "maxpool_heatmap",
            "operation_name": "tensBlur/MaxPool",
            "dims": [160, 160, 19]
          },
          {
            "channel": "heatmap",
            "operation_name": "lambda_1/conv2d_transpose",
            "dims": [160, 160, 19]
          }
        ]
      }
    },
    "open_pose_decoder": {
      "isaac.skeleton_pose_estimation.OpenPoseDecoder": {
        "label": "Human",
        "labels": ["Nose", "Neck", "Rsho", "Relb", "Rwri", "Lsho", "Lelb", "Lwri", "Rhip", "Rkne",
                   "Rank", "Lhip", "Lkne", "Lank", "Leye", "Reye", "Lear", "Rear"],

        "edges": [[1, 2], [1, 5], [2, 3], [3, 4], [5, 6], [6, 7], [1, 8], [8, 9], [9, 10], [1, 11],
                [11, 12], [12, 13], [1, 0], [0, 14], [14, 16], [0, 15], [15, 17], [2, 16], [5, 17]],

        "edges_paf": [[12, 13], [20, 21], [14, 15], [16, 17], [22, 23], [24, 25], [0, 1], [2, 3],
                      [4, 5], [6, 7], [8, 9], [10, 11], [28, 29], [30, 31], [34, 35], [32, 33],
                      [36, 37], [18, 19], [26, 27]],

        "threshold_heatmap" : 0.05,
        "threshold_edge_size" : 0.1,
        "threshold_edge_score" : 0.05,
        "threshold_edge_sampling_counter" : 8,
        "threshold_part_counter" : 4,
        "threshold_object_score" : 0.4,
        "threshold_split_score" : 2,
        "edge_sampling_steps" : 10,
        "refine_parts_coordinates" : true,

        "output_scale" : [480, 640]
      }
    },
    "skeleton_viewer": {
      "isaac.viewers.SkeletonViewer": {
        "labels": ["Nose", "Neck", "Rsho", "Relb", "Rwri", "Lsho", "Lelb", "Lwri", "Rhip", "Rkne",
                  "Rank", "Lhip", "Lkne", "Lank", "Leye", "Reye", "Lear", "Rear"],

        "edges_render": [[1, 2], [1, 5], [2, 3], [3, 4], [5, 6], [6, 7], [1, 8], [8, 9], [9, 10],
                         [1, 11], [11, 12], [12, 13], [1, 0], [0, 14], [14, 16], [0, 15], [15, 17]]
      },
      "isaac.viewers.ColorCameraViewer": {
        "camera_name": "color_camera_viewer",
        "target_fps": 30,
        "reduce_scale": 1
      }
    }
  }
}

This is the codelet to get the skeleton output:

#include "poseRX.hpp"

#include <cstdio>
#include <string>

namespace isaac {

void PoseRX::start() {
  // By using tickOnMessage instead of tickPeriodically we instruct the codelet to only tick when
  // a new message is received on the incoming data channel `trigger`.
  tickOnMessage(rx_skeletons());
}

void PoseRX::tick() {
  // This function will now only be executed whenever we receive a new message. This is guaranteed
  // by the Isaac Robot Engine.

  // Parse the message we received:
  // a Cap'n Proto list of Skeleton2Proto
  auto skeleton_list = rx_skeletons().getProto().getSkeletons();

  for (size_t i = 0; i < skeleton_list.size(); i++) {
    // For each Skeleton2Proto, get the Cap'n Proto list of Joint2Proto
    auto joint_list = skeleton_list[i].getJoints();

    for (size_t ii = 0; ii < joint_list.size(); ii++) {
      // Retrieve the joint prediction confidence; this is a float
      float joint_conf = joint_list[ii].getLabel().getConfidence();
      std::printf("conf :  %f    |    ", joint_conf);
      // Retrieve the X position of the joint in the image (be warned, this is a float too)
      float joint_x = joint_list[ii].getPosition().getX();
      std::printf("x :  %f    |    ", joint_x);
      // Retrieve the Y position of the joint in the image (be warned, this is a float too)
      float joint_y = joint_list[ii].getPosition().getY();
      std::printf("y :  %f   \n", joint_y);
    }
  }
  // Print to the console
  //std::printf("%s:", message.c_str());
   //std::printf(" DOPE! \n");
}

}  // namespace isaac
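
In case it helps, here is a minimal sketch of what the matching poseRX.hpp could look like, assuming the standard Isaac SDK codelet macros and that the skeletons channel carries a Skeleton2ListProto; the proto type name and the include paths are assumptions to check against your SDK version:

#pragma once

#include "engine/alice/alice_codelet.hpp"
#include "messages/skeleton.capnp.h"

namespace isaac {

// Receives the skeleton list published by the OpenPoseDecoder and prints each joint.
class PoseRX : public alice::Codelet {
 public:
  void start() override;
  void tick() override;

  // Incoming channel; generates the rx_skeletons() accessor used in the .cpp above.
  ISAAC_PROTO_RX(Skeleton2ListProto, skeletons);
};

}  // namespace isaac

ISAAC_ALICE_REGISTER_CODELET(isaac::PoseRX);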

Hope that can guide you, tell me if you need more details

Thank you very much. I am going to use your feedback and try this out on the Jetson Nano.


Hello @ishan, did it work?
Kind regards,
Planktos

Hello Planktos,

I am still working on integrating directly with DeepStream. DeepStream requires you to write a custom plugin to parse the TRT Pose output.

Currently, that process is a bit involved, so let's see how it goes.
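
For reference, one common route to get at the raw pose tensors is to set output-tensor-meta=1 in the nvinfer config and read the tensor meta in a pad probe, instead of writing a full detection parser. Below is a rough sketch, assuming DeepStream 5.x headers and nvinfer running in primary (full-frame) mode; the layer handling is an assumption and the OpenPose-style decoding is left out, so this is not a drop-in plugin:

// Sketch only: read the raw inference tensors (heatmaps / PAFs) that nvinfer
// attaches as frame user meta when output-tensor-meta=1 is set in its config.
#include <gst/gst.h>
#include "gstnvdsmeta.h"
#include "gstnvdsinfer.h"

static GstPadProbeReturn pose_tensor_probe(GstPad* pad, GstPadProbeInfo* info, gpointer user_data) {
  GstBuffer* buf = GST_PAD_PROBE_INFO_BUFFER(info);
  NvDsBatchMeta* batch_meta = gst_buffer_get_nvds_batch_meta(buf);
  if (!batch_meta) return GST_PAD_PROBE_OK;

  for (NvDsMetaList* l_frame = batch_meta->frame_meta_list; l_frame; l_frame = l_frame->next) {
    NvDsFrameMeta* frame_meta = static_cast<NvDsFrameMeta*>(l_frame->data);
    for (NvDsMetaList* l_user = frame_meta->frame_user_meta_list; l_user; l_user = l_user->next) {
      NvDsUserMeta* user_meta = static_cast<NvDsUserMeta*>(l_user->data);
      if (user_meta->base_meta.meta_type != NVDSINFER_TENSOR_OUTPUT_META) continue;

      auto* tensor_meta = static_cast<NvDsInferTensorMeta*>(user_meta->user_meta_data);
      for (unsigned int i = 0; i < tensor_meta->num_output_layers; i++) {
        NvDsInferLayerInfo& layer = tensor_meta->output_layers_info[i];
        // Host copy of one raw output tensor, e.g. the part-affinity fields or a heatmap.
        float* data = static_cast<float*>(tensor_meta->out_buf_ptrs_host[i]);
        g_print("layer %s, first value %f\n", layer.layerName, data ? data[0] : 0.0f);
        // TODO: run the peak finding / PAF matching here and attach display meta.
      }
    }
  }
  return GST_PAD_PROBE_OK;
}

You would register this with gst_pad_add_probe() as a GST_PAD_PROBE_TYPE_BUFFER probe on the nvinfer source pad; the skeleton decoding itself still has to be implemented on top of the raw tensors.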

Hey everyone!

I was able to get DeepStream working with TRTPose. See if this helps you out!
