Question about input layer to extract bodypose25d

Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc) RTX GPU
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc) BodyPose2d
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)
• Training spec file (If you have one, please share it here)
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)

Hi. I would like to ask a question about the code below.
My goal is to parse only the 2.5D output layer, back-project it from 2D to 3D using the intrinsic parameters of my RTSP camera (obtained in advance through camera calibration), and then run the pose classification task.
So the output I actually use is pose25d. On the input side, besides the image, do the four auxiliary input layers (scale_normalized_mean_limb_lengths, mean_limb_lengths, k_inv, t_form_inv) need to be accurate? My understanding is that pose25d is derived from the heatmap information anyway.
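
For reference, here is a minimal sketch of the back-projection I have in mind, assuming a standard pinhole camera model: fx/fy/cx/cy come from my calibration, and I am assuming the per-keypoint z from pose25d is a depth-like value. The names are purely illustrative, not part of the TAO/DeepStream API.

#include <array>

// Intrinsics from camera calibration (illustrative struct, not a DeepStream type).
struct Intrinsics { float fx, fy, cx, cy; };

// Back-project a keypoint at pixel (u, v) with depth-like value z into
// 3D camera coordinates: X = (u - cx) * z / fx, Y = (v - cy) * z / fy, Z = z.
std::array<float, 3> backProject(const Intrinsics &K, float u, float v, float z)
{
    return { (u - K.cx) * z / K.fx,
             (v - K.cy) * z / K.fy,
             z };
}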

Moving to DS forum for tracking.

Thank you. Any help would be appreciated.

#include "nvdsinfer_custom_impl.h"
#include <cstring>
/* Initializes the constant (non-image) input layers. Everything below is
 * intentionally commented out to test whether the pose25d output depends
 * on these buffers. */
bool NvDsInferInitializeInputLayers (std::vector<NvDsInferLayerInfo> const &inputLayersInfo,
        NvDsInferNetworkInfo const &networkInfo,
        unsigned int maxBatchSize)
{
//   float scale_normalized_mean_limb_lengths[] = {
//      0.5000, 0.5000, 1.0000, 0.8175, 0.9889, 0.2610, 0.7942, 0.5724, 0.5078,
//      0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.3433, 0.8171,
//      0.9912, 0.2610, 0.8259, 0.5724, 0.5078, 0.0000, 0.0000, 0.0000, 0.0000,
//      0.0000, 0.0000, 0.0000, 0.3422, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000};
//  float mean_limb_lengths[] =  {
//      246.3427, 246.3427, 492.6854, 402.4380, 487.0321, 128.6856, 391.6295,
//      281.9928, 249.9478,   0.0000,   0.0000,   0.0000,   0.0000,   0.0000,
//      0.0000,   0.0000, 169.1832, 402.2611, 488.1824, 128.6848, 407.5836,
//      281.9897, 249.9489,   0.0000,   0.0000,   0.0000,   0.0000,   0.0000,
//      0.0000,   0.0000, 168.6137,   0.0000,   0.0000,   0.0000,   0.0000,
//      0.0000};

//   //k_inv would change for camera parameters
//   float k_inv[] = {0.00124876620338,   0,                -0.119881555525,
//                    0,                  0.00124876620338, -0.159842074033,
//                    0,                  0,                 1};

//   float t_form_inv[] = {1.0, 0.0, 0.0,
//                         0.0, 1.0, 0.0,
//                         0.0, 0.0, 1.0};
//   for (auto v : inputLayersInfo){
//     if (!strcmp(v.layerName, "scale_normalized_mean_limb_lengths")){
//       memcpy(v.buffer,scale_normalized_mean_limb_lengths,sizeof(float)*36);
//     }
//     if (!strcmp(v.layerName, "mean_limb_lengths")){
//       memcpy(v.buffer,mean_limb_lengths,sizeof(float)*36);
//     }
//     if (!strcmp(v.layerName, "k_inv")){
//       memcpy(v.buffer,k_inv,sizeof(float)*9);
//     }
//     if (!strcmp(v.layerName, "t_form_inv")){
//       memcpy(v.buffer,t_form_inv,sizeof(float)*9);
//     }
//   }

  return true;
}

Even though I commented out the initialization code as above and ran it, the pose keypoints still came out fine, and it also worked fine when I built a new model.

So these seem to be largely unrelated parameters, but can you tell me when they are actually used? Also, does the NvDsInferInitializeInputLayers function run at runtime, or only when the model (engine) is created?

They are only used for the 3D output layer. We use the 2.5D output layer in our demo, so you don’t have to worry about the configuration of these parameters.
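
For anyone who does need the 3D path: the commented-out k_inv above matches the form of the inverse of a pinhole intrinsic matrix K = [fx 0 cx; 0 fy cy; 0 0 1]. Below is a minimal sketch of filling that buffer from calibrated intrinsics, under that assumption (the layout is inferred from the values in the snippet, not confirmed by documentation).

// Fill the row-major 3x3 k_inv buffer from pinhole intrinsics.
// inv(K) = [1/fx  0    -cx/fx;
//           0     1/fy -cy/fy;
//           0     0     1    ]
void fillKInv(float fx, float fy, float cx, float cy, float k_inv[9])
{
    k_inv[0] = 1.0f / fx; k_inv[1] = 0.0f;      k_inv[2] = -cx / fx;
    k_inv[3] = 0.0f;      k_inv[4] = 1.0f / fy; k_inv[5] = -cy / fy;
    k_inv[6] = 0.0f;      k_inv[7] = 0.0f;      k_inv[8] = 1.0f;
}

Plugging the snippet’s values back into this form gives fx = fy ≈ 800.8, cx = 96, cy = 128.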
