• Hardware Platform (Jetson / GPU): NVIDIA GeForce RTX 4070
• DeepStream Version: 7.0
• JetPack Version (valid for Jetson only): N/A
• TensorRT Version: 8.6.1
• NVIDIA GPU Driver Version (valid for GPU only): 535.161.08
• Issue Type (questions, new requirements, bugs): BlazePose model integration problem
I integrated the BlazePose pose-estimation model into DeepStream and ran it. For this I wrote the following nvinfer config:
[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-color-format=0
onnx-file=…/models/pose_landmark_full.onnx
model-engine-file=…/models/pose_landmark_full.onnx_b1_gpu0_fp32.engine
#int8-calib-file=calib.table
labelfile-path=…/models/labels.txt
batch-size=1
network-mode=0
num-detected-classes=1
interval=0
gie-unique-id=1
process-mode=1
network-type=3
cluster-mode=4
maintain-aspect-ratio=1
symmetric-padding=1
#workspace-size=2000
parse-bbox-instance-mask-func-name=NvDsInferParseCustomBlazePose
custom-lib-path=…/nvdsinfer_custom_impl_Blaze_pose/libnvdsinfer_custom_impl_Blaze_pose.so
output-instance-mask=1
input-tensor-meta=1
infer-dims=3;256;256
debug-level=3
output-tensor-meta=1
output-blob-names=Identity
layer-name=Identity
output-order=1
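For reference, the preprocessing keys above reduce to a simple formula. Per the Gst-nvinfer documentation, the input pixel transform is y = net-scale-factor * (x - mean); with no mean file or offsets set here, that is just a scale. A quick sanity check (a sketch, nothing DeepStream-specific):

# net-scale-factor is the float32 representation of 1/255, and no offsets are set,
# so nvinfer scales every pixel into [0, 1] before inference.
NET_SCALE_FACTOR = 0.0039215697906911373
assert abs(NET_SCALE_FACTOR - 1.0 / 255.0) < 1e-8
# y = NET_SCALE_FACTOR * (x - offset), with offset = 0  =>  y = x / 255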
I also implemented the custom parser plugin, where I added debug output to inspect the model result after inference:
#include <cstring>
#include <iostream>
#include <vector>

#include "nvdsinfer_custom_impl.h"

static bool NvDsInferParseCustomBlazePose(std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
                                          NvDsInferNetworkInfo const& networkInfo,
                                          NvDsInferParseDetectionParams const& detectionParams,
                                          std::vector<NvDsInferInstanceMaskInfo>& objectList) {
    // Locate the "Identity" tensor among the bound output layers.
    const NvDsInferLayerInfo* output = nullptr;
    for (const auto& layer : outputLayersInfo) {
        if (strcmp(layer.layerName, "Identity") == 0) {
            output = &layer;
            break;
        }
    }
    if (!output) {
        std::cerr << "ERROR: Could not find the 'Identity' layer in the output" << std::endl;
        return false;
    }
    if (output->dataType != FLOAT) {
        std::cerr << "ERROR: Unexpected data type. Expected float, but got: " << output->dataType << std::endl;
        return false;
    }

    const float* outputData = static_cast<const float*>(output->buffer);
    const unsigned int channelsSize = output->inferDims.numElements;  // raw float count, 195 for this model
    const unsigned int netW = networkInfo.width;
    const unsigned int netH = networkInfo.height;

    // Debug dump of the raw output tensor.
    std::cout << "Model Output Info:" << std::endl;
    std::cout << "Name: " << output->layerName << std::endl;
    std::cout << "Shape: (1, " << channelsSize << ")" << std::endl;
    std::cout << "Data Type: " << output->dataType << std::endl;
    std::cout << "Number of Keypoints: " << channelsSize << std::endl;
    std::cout << "Keypoints Data: [[";
    for (unsigned int i = 0; i < channelsSize; ++i) {
        std::cout << outputData[i] << " ";
    }
    std::cout << "]]" << std::endl;

    // decodeBlazePoseOutput() is defined elsewhere in the plugin.
    objectList = decodeBlazePoseOutput(outputData, channelsSize, netW, netH);
    return true;
}
This is the result of the inference in DeepStream:
Model Output Info:
Name: Identity
Shape: (1, 195)
Data Type: 0
Number of Keypoints: 195
Keypoints Data: [[133.892 126.812 -126.83 4.87129 6.3456 135.983 123.48 -131.546 4.3541 5.94556 137.22 122.911 -131.498 4.10678 5.7117 138.453 122.39 -131.495 4.08368 5.43323 132.481 124.268 -131.179 4.58841 6.57842 131.341 124.224 -131.197 4.48914 6.65509 130.211 124.183 -131.215 4.80924 6.74207 140.796 118.361 -125.995 3.55969 5.27982 128.281 120.765 -123.668 4.5182 6.89889 135.879 125.66 -121.86 4.50667 5.89907 132.033 126.377 -121.316 4.6821 6.48752 148.604 113.945 -86.4735 4.88566 4.97485 117.902 121.542 -94.9949 5.9602 6.21012 149.473 121.899 -40.7146 -0.246334 5.05674 120.425 138.051 -57.0485 2.22312 6.5247 139.225 126.301 -19.6482 0.12634 5.08594 132.019 134.044 -28.5732 2.74862 8.15707 136.733 128.301 -19.8782 0.153291 5.06046 135.332 134.34 -28.7915 2.339 8.04276 135.16 125.418 -22.7351 0.271172 5.1737 135.974 130.125 -28.4864 2.44404 8.14153 135.554 123.81 -20.2336 0.312834 5.08776 135.61 129.095 -27.1212 2.41778 8.12186 140.181 116.584 2.59204 8.3082 10.0026 123.215 118.374 -2.65591 8.13905 10.0542 145.723 148.643 -0.735217 2.34421 8.44022 120.983 149.365 4.88721 2.19843 8.41207 147.851 176.049 29.1621 3.11148 6.81073 119.852 177.64 37.1354 2.5789 7.27463 146.23 178.335 30.7904 1.83683 6.55026 120.77 179.511 39.4435 1.36546 6.91496 150.221 190.782 -0.928259 2.98934 5.07984 121.377 194.861 10.6385 2.48772 5.62019 131.816 118.067 0.106942 -21.7882 21.0895 133.618 27.7204 0.263672 -21.8095 20.9721 135.753 126.366 0.0233847 -21.7362 4.76915 139.26 126.275 0.167522 -21.7498 4.84723 135.756 131.48 0.0680409 -21.7051 7.30306 131.973 134.014 -0.026127 -21.6361 7.48631 ]]
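Incidentally, 195 is not the number of keypoints: for pose_landmark_full it is, to my knowledge, 39 landmarks × 5 values each (x, y, z, visibility, presence), with x/y in pixels of the 256×256 network input. A small sketch to view either dump in that layout (the layout is an assumption based on the published BlazePose output format):

import numpy as np

# Paste the full 195-value dump here; only a few values are shown for illustration.
raw = np.array([133.892, 126.812, -126.83, 4.87129, 6.3456])  # ... 195 values total

# Assumed layout: 39 landmarks x (x, y, z, visibility, presence).
landmarks = raw.reshape(-1, 5)
for i, (x, y, z, vis, pres) in enumerate(landmarks):
    print(f"{i:2d}: x={x:7.2f} y={y:7.2f} z={z:8.2f} vis={vis:6.2f} pres={pres:6.2f}")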
This DeepStream result is not correct, because it does not match the reference pipeline, which works perfectly.
This is the correct result after inference in the reference pipeline:
Model Output Info:
Name: Identity
Shape: (1, 195)
Data Type: float32
Data: [[ 1.34127411e+02 8.51423035e+01 -7.72913742e+01 6.73996162e+00
6.12476158e+00 1.35687820e+02 8.20482788e+01 -7.05673141e+01
5.90085316e+00 6.11340332e+00 1.36884155e+02 8.19154358e+01
-7.05997162e+01 5.95664215e+00 6.15648460e+00 1.38069260e+02
8.18166504e+01 -7.06117554e+01 5.91953468e+00 6.19343090e+00
1.31883911e+02 8.22762680e+01 -7.08883972e+01 5.41670895e+00
5.95233154e+00 1.30659119e+02 8.23308563e+01 -7.08818970e+01
5.39485931e+00 5.81268215e+00 1.29453003e+02 8.23827820e+01
-7.08954697e+01 5.34367371e+00 5.71174240e+00 1.39727203e+02
8.28634186e+01 -2.53535576e+01 5.41604328e+00 6.27084827e+00
1.27449120e+02 8.36991653e+01 -2.58820782e+01 4.58715439e+00
5.48289108e+00 1.36352356e+02 8.82415848e+01 -5.97117538e+01
6.54122543e+00 6.59466648e+00 1.32080963e+02 8.85622940e+01
-5.99212036e+01 6.11606789e+00 6.23729324e+00 1.49781281e+02
1.00469231e+02 -3.86938047e+00 8.03562832e+00 7.07540989e+00
1.18148499e+02 1.01457352e+02 -1.44072628e+00 6.75756645e+00
5.90803051e+00 1.48102356e+02 1.22925110e+02 -2.01236191e+01
1.22893143e+00 5.59673119e+00 1.20533363e+02 1.26047218e+02
-8.98852444e+00 2.19238091e+00 4.95798111e+00 1.40159943e+02
1.37836594e+02 -8.92863007e+01 9.60869789e-01 6.51170492e+00
1.25583588e+02 1.41596588e+02 -6.00603294e+01 1.46546888e+00
5.68430853e+00 1.39452118e+02 1.43402100e+02 -1.08348984e+02
6.98071480e-01 6.35530806e+00 1.25737984e+02 1.47090820e+02
-7.66798401e+01 1.16211796e+00 5.82995224e+00 1.38041962e+02
1.42287186e+02 -1.17590317e+02 7.52276421e-01 6.42307949e+00
1.26497627e+02 1.46387238e+02 -8.50387878e+01 1.18911743e+00
5.77919245e+00 1.37357330e+02 1.40137482e+02 -9.51008835e+01
5.64127922e-01 6.50269699e+00 1.27454880e+02 1.44153549e+02
-6.49084854e+01 9.84773159e-01 5.81506538e+00 1.43007278e+02
1.51067993e+02 -3.25088501e+00 6.99400473e+00 6.90005779e+00
1.23827789e+02 1.50483246e+02 3.33819008e+00 6.56668234e+00
6.50400543e+00 1.45736176e+02 1.87921982e+02 -2.45862961e+01
4.38423443e+00 7.95130539e+00 1.20298485e+02 1.87304306e+02
-7.27436352e+00 4.73125172e+00 7.09363842e+00 1.46965393e+02
2.22381470e+02 4.40943909e+01 4.00260973e+00 7.13036728e+00
1.17406479e+02 2.21243652e+02 5.33527069e+01 4.27931213e+00
6.89513397e+00 1.45334076e+02 2.26461731e+02 4.82308006e+01
1.76353836e+00 6.79423189e+00 1.18358162e+02 2.25502304e+02
5.68017273e+01 1.63529110e+00 6.69629335e+00 1.47987625e+02
2.36739410e+02 -3.61620903e+00 3.60444260e+00 5.72853947e+00
1.17011459e+02 2.35260330e+02 5.88278961e+00 3.67541838e+00
5.58915186e+00 1.33429810e+02 1.50955917e+02 1.10433521e-02
-2.07624626e+01 2.00330315e+01 1.34158005e+02 5.72637711e+01
1.95749372e-01 -2.07550564e+01 2.00392818e+01 1.38469467e+02
1.42656693e+02 -3.06649655e-02 -2.06797619e+01 5.75013256e+00
1.40261719e+02 1.37813446e+02 1.39857873e-01 -2.06311188e+01
5.87664318e+00 1.26335205e+02 1.46619583e+02 7.93662369e-02
-2.06242943e+01 5.33506107e+00 1.25588226e+02 1.41638489e+02
3.51254493e-02 -2.06772423e+01 5.28287077e+00]]
I got this result by taking the TensorRT engine that DeepStream generated and running the model on that same engine in a Python pipeline, without DeepStream.
So the problem isn't with TensorRT, because the same model on the same engine works perfectly without DeepStream.
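The standalone Python check was roughly the following (a minimal sketch, assuming the pycuda bindings and the TensorRT 8.x bindings API; `frame` is a decoded video frame, and `preprocess_frame` is the function shown further below):

import numpy as np
import tensorrt as trt
import pycuda.autoinit  # initializes a CUDA context
import pycuda.driver as cuda

logger = trt.Logger(trt.Logger.WARNING)

# Hypothetical path: use the engine file DeepStream generated.
with open("pose_landmark_full.onnx_b1_gpu0_fp32.engine", "rb") as f:
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Allocate host/device buffers for every binding.
host_bufs, dev_bufs = [], []
for i in range(engine.num_bindings):
    dtype = trt.nptype(engine.get_binding_dtype(i))
    h = np.zeros(trt.volume(engine.get_binding_shape(i)), dtype=dtype)
    host_bufs.append(h)
    dev_bufs.append(cuda.mem_alloc(h.nbytes))

# Binding 0 is assumed to be the 1x3x256x256 input.
host_bufs[0][:] = preprocess_frame(frame, (256, 256)).ravel()
cuda.memcpy_htod(dev_bufs[0], host_bufs[0])
context.execute_v2([int(d) for d in dev_bufs])
for i in range(engine.num_bindings):
    if not engine.binding_is_input(i):
        cuda.memcpy_dtoh(host_bufs[i], dev_bufs[i])

# The "Identity" landmark tensor (1, 195).
idx = engine.get_binding_index("Identity")
print(host_bufs[idx].reshape(1, 195))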
This is how the model is linked into the DeepStream pipeline:
GstElement *pgie = gst_element_factory_make("nvinfer", "nvinfer-blaze");
if (!pgie) {
    g_printerr("ERROR: Failed to create nvinfer\n");
    return -1;
}
g_object_set(G_OBJECT(pgie), "config-file-path", CONFIG_INFER_POSE, "qos", 0, NULL);
gst_bin_add_many(GST_BIN(pipeline), pgie, tracker, converter, osd, sink, NULL);
if (!gst_element_link_many(streammux, pgie, tracker, converter, osd, sink, NULL)) {
    g_printerr("ERROR: Pipeline elements could not be linked\n");
    return -1;
}
So I think the problem is either in the nvinfer settings or in something under the hood of DeepStream that leads to incorrect data preprocessing.
This is the preprocessing from the correct Python pipeline:
import cv2
import numpy as np

def preprocess_frame(frame, input_size):
    # Plain resize (no aspect-ratio padding), scale to [0, 1], HWC -> CHW, add batch dim.
    frame_resized = cv2.resize(frame, input_size)
    frame_normalized = frame_resized.astype(np.float32) / 255.0
    frame_transposed = np.transpose(frame_normalized, [2, 0, 1])
    return np.expand_dims(frame_transposed, axis=0)
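Note that this is not what the config above asks nvinfer to do: with maintain-aspect-ratio=1 and symmetric-padding=1, DeepStream letterboxes the frame instead of stretching it, and model-color-format=0 means it feeds RGB. A sketch of that letterbox behavior for comparison (my approximation of what nvinfer does, not DeepStream's actual code):

import cv2
import numpy as np

def preprocess_like_nvinfer(frame_bgr, net_w=256, net_h=256):
    # model-color-format=0 -> RGB input.
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    # maintain-aspect-ratio=1: scale so the frame fits inside net_w x net_h.
    h, w = rgb.shape[:2]
    scale = min(net_w / w, net_h / h)
    new_w, new_h = int(w * scale), int(h * scale)
    resized = cv2.resize(rgb, (new_w, new_h))
    # symmetric-padding=1: pad equally on both sides with black.
    canvas = np.zeros((net_h, net_w, 3), dtype=np.uint8)
    top = (net_h - new_h) // 2
    left = (net_w - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    # net-scale-factor = 1/255, no offsets -> scale to [0, 1], then CHW + batch.
    x = canvas.astype(np.float32) / 255.0
    return np.expand_dims(np.transpose(x, (2, 0, 1)), axis=0)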
Maybe there is a problem with the infer settings? Or do I need to implement custom preprocessing?
Could you help me find where the data distortion occurs: at the input of the model, or already at the output? And how can I fix it?